Define how little bits work together
Overview
This is a hands-on tutorial on how to create Dockerfile and docker-compose files that contain commands controlling how Docker instantiates Containers across several operating systems.
“Dockerizing” an application is the process of converting an application to run within a Docker Container and creating the Dockerfile for it.
Traditionally, the processes that ran on a machine were defined by a build script, or by manual typing in a Terminal, using the operating system’s CLI (Command Line Interface). Such scripts are typically stored in a Git repository hosted on a version control service such as GitHub.
(I keep examples of Dockerfiles at
https://github.com/wilsonmar/Dockerfiles)
When we use Docker, instead of a shell script, we create a Dockerfile which specifies various layers of pre-built packages in DockerHub. A Docker image is a read-only template used to create and launch a Docker container.
There is a separate Dockerfile in each folder within a Git-controlled repository stored in a GitHub or other Version Control system so that entire sets of files can be retrieved from every point in time.
Dockerfiles specify images containing app assets which are pulled into Docker instances by the Docker Engine. On a Mac, the Container Engine runs within a Docker for Mac process.
Each image is built from a static snapshot of a Container.
Container orchestration utilities such as Kubernetes or Docker Compose make requests of Docker Engine through its API to automatically create additional containers (grouped into pods in Kubernetes) as needed, based on the specification in a Helm chart or Compose file. Kubernetes can also remove pod instances when monitoring indicates that fewer are needed.
Docker images and containers are a key building block for the Service Mesh architecture which has an Envoy component in each pod to handle communication and security certificates.
TL;DR; Docker Security
This article describes how to Dockerize apps in a “hardened” way. That means:
- Include as little as possible in container images. This makes them both quicker to load and more secure. Thus, we prefer Alpine Linux rather than Ubuntu, which includes software we neither need nor want.
- Run rootless containers.
- Cryptographically sign each image created.
- Use verified images.
- Use access control on registries rather than allowing anyone to use them.
- Use Kubernetes for implementing RBAC (Role-Based Access Control)
- Run containers on isolated networks (by adding AWS Security Groups)
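As a rough illustration of the first two points, the sketch below writes a small Dockerfile that starts from Alpine and switches to an unprivileged user. The alpine:3.19 tag, the "app" user and group, and the app.sh script are all hypothetical names chosen for this example. Running the process as a non-root user is only one part of the rootless goal.

```shell
# Write a minimal, hardened Dockerfile into a scratch folder.
# Assumptions (not from the article): the alpine:3.19 tag, the "app"
# user/group names, and the app.sh script are all hypothetical.
workdir="$(mktemp -d)"

cat > "$workdir/Dockerfile" <<'EOF'
# Small base image: less to download, fewer packages to patch.
FROM alpine:3.19
# Create an unprivileged system user and group instead of using root.
RUN addgroup -S app && adduser -S -G app app
WORKDIR /home/app
COPY --chown=app:app app.sh .
# Later instructions, and the running container, use this user.
USER app
ENTRYPOINT ["./app.sh"]
EOF

echo "Wrote $workdir/Dockerfile"
```

Building it would still need an app.sh file next to the Dockerfile.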
Dockerize apps
Let’s begin with an example.
- Create or navigate to the folder that contains (or should contain) a Dockerfile.
A separate folder is needed because, by default, docker build looks for a file named exactly “Dockerfile”.
- View the Dockerfile:
cat Dockerfile
Alternately, you may prefer to open the file using a text editor or IDE.
There are only a handful of instructions (verbs) in a Dockerfile.
For example:
FROM node:0.10.44-slim
COPY . /home/demo/box/
RUN cd /home/demo/box && npm install
ENTRYPOINT ["/home/demo/box/boot.sh"]
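REMEMBER: Arguments passed on the docker run command line (and CMD values) cannot change the ENTRYPOINT value; with the exec form shown here they are appended to it as parameters. Only the --entrypoint option of docker run overrides it.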
Docker builder instructions
# (“pound sign”) begins a comment line or a parser directive.
- FROM must be the first instruction (only comments, parser directives, and ARG may precede it). It sets the base image, typically an operating system image. For options, run docker search.
- MAINTAINER Wilson Mar <wilsonmar@gmail.com> # defines the file’s author (deprecated in favor of LABEL)
- USER # sets the user name or UID used by the instructions that follow and by the running container
- ARG user1=someuser # overridden by --build-arg user1=what_user in docker build
- ARG CONT_IMG_VER
- ENV CONT_IMG_VER ${CONT_IMG_VER:-v1.0.0}
- ENV def=$abc
- ENV foo /bar
- WORKDIR ${foo} # sets working directory to /bar
- VOLUME /tmp
- COPY $foo /quux
- ADD . $foo # copies like COPY; a local tar archive given as the source is automatically unpacked
- HEALTHCHECK --interval=5m --timeout=3s CMD curl -f http://localhost/ || exit 1
- CMD ["--port 27017"] # provides defaults for an executing container
- CMD ["/usr/bin/wc","--help"] # executable and parameter
- EXPOSE 27017 # documents the port the container listens on
- RUN bash -c 'touch /app.jar' # runs a command at build time and commits the result as a new layer
- ENTRYPOINT ["top", "-b"] # sets default container commands
- ONBUILD RUN /usr/local/bin/python-build --dir /app/src
- STOPSIGNAL SIGKILL # sets the system call signal that will be sent to the container to exit.
See https://docs.docker.com/engine/reference/builder
More examples at https://docs.docker.com/engine/examples
PROTIP: A chmod or chown changes a timestamp on the file even when no permission or ownership change is made, so Docker copies the file into a new layer. For example, if each dd command that writes a 1MB file is its own RUN instruction, each one adds a 1MB layer. A chmod in a separate RUN instruction then copies the entire 1MB file again into the next layer.
PROTIP: Reduce the image size by merging RUN lines so they work on a single layer:
FROM busybox
RUN mkdir /data \
 && dd if=/dev/zero bs=1024 count=1024 of=/data/one \
 && chmod -R 0777 /data \
 && dd if=/dev/zero bs=1024 count=1024 of=/data/two \
 && chmod -R 0777 /data \
 && rm /data/one
CMD ls -alh /data
To handle UID/GID and permission issues, update image to match host uid/gid:
FROM debian:latest
ARG UID=1000
ARG GID=1000
RUN groupadd -g $GID cuser \
 && useradd -m -u $UID -g $GID -s /bin/bash cuser
USER cuser
Then:
$ docker build \
  --build-arg UID=$(id -u) \
  --build-arg GID=$(id -g) .
This Dockerfile shows use of the ENTRYPOINT to run Apache in the foreground (i.e., as PID 1):
FROM debian:stable
RUN apt-get update && apt-get install -y --force-yes apache2
EXPOSE 80 443
VOLUME ["/var/www", "/var/log/apache2", "/etc/apache2"]
ENTRYPOINT ["/usr/sbin/apache2ctl", "-D", "FOREGROUND"]
Another example extends the Jenkins image with gosu and Docker, then hands control to a custom entrypoint script:
FROM jenkins/jenkins:lts
USER root
RUN apt-get update \
 && wget -O /usr/local/bin/gosu "https://github.com/..." \
 && chmod +x /usr/local/bin/gosu \
 && curl -sSL https://get.docker.com/ | sh \
 && usermod -aG docker jenkins
COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
An example for Java on WebLogic:
FROM kmandel/java:8
VOLUME /tmp
#ADD ${project.build.final}.jar app.jar
ADD my-api.jar app.jar
RUN bash -c 'touch /app.jar'
ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar","/app.jar"]
After a Dockerfile is prepared, run one of these from a command prompt in the same folder to create the corresponding image. The second form also names and tags the image:
docker build .
docker build -t username/imagename:tagname .
Docker run
Run docker run image-name to create a container from the image and start it.
See https://github.com/sudo-bmitch/dc2018
Dockerizing programming code
One of the advantages of using Docker is that an application can be deployed on several operating systems. But different operating systems have different ways of specifying file paths such as:
APP_CONFIG=/etc/dev.config
Such files would contain API keys and flags to vary app behavior without requiring a re-deploy.
PROTIP: Apps in Docker should be written in a way that references a file external to itself to obtain configuration data such as API keys.
Contents of configuration files can be varied at run-time by a script that mounts different volumes containing the config file, or by a sed command which finds a unique pattern in the file and then modifies the data.
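Here is a minimal sketch of the sed approach. The config keys, the __API_KEY__ placeholder, and the API_KEY_VALUE variable are hypothetical names made up for this example; a real app would use its own file and pattern.

```shell
# Create a sample config file. The keys and the __API_KEY__
# placeholder are hypothetical, chosen only for illustration.
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
API_KEY=__API_KEY__
FEATURE_FLAG=off
EOF

# Replace the unique placeholder with a value supplied at run time.
# Writing to a new file and moving it avoids relying on "sed -i",
# whose syntax differs between GNU and BSD/macOS sed.
# Note: the value must not contain the "|" delimiter used here.
API_KEY_VALUE="${API_KEY_VALUE:-example-key-123}"
sed "s|__API_KEY__|${API_KEY_VALUE}|" "$config_file" > "$config_file.new"
mv "$config_file.new" "$config_file"

cat "$config_file"
```

A script like this would typically run from the container's entrypoint, before the app itself starts.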
Common Logging
Also, rather than writing event information to a custom database, “cloud native” application code prints to STDOUT/STDERR. This lets application logs share a common path, so that logs from other apps and monitoring utilities can all be co-mingled in a central logging system and analyzed together along a single timeline.
Logs can be accessed directly with the docker logs command and through Docker API calls.
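For a shell-based entrypoint or helper script, the same idea might look like the sketch below. The log_info and log_error function names and the timestamp format are illustrative choices, not something Docker requires.

```shell
# Minimal logging helpers for a containerized script.
# The function names and the timestamp format are illustrative.
log_info() {
  # Routine events go to STDOUT.
  printf '%s INFO %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*"
}

log_error() {
  # Problems go to STDERR so the two streams can be filtered apart.
  printf '%s ERROR %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >&2
}

log_info "service started"
log_error "could not reach database"
```

Because nothing is written to a file inside the container, both streams are captured by Docker and show up in docker logs.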
To simplify the dockerization process, some use the Dockerize utility that Jason Wilder wrote in Golang. It works by wrapping calls to apps using the ENTRYPOINT or CMD directives.
.dockerignore
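The .dockerignore file is like a .gitignore file, but it lists the files and folders that Docker should leave out of the build context, so that instructions such as COPY and ADD in the Dockerfile never see them.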
See https://docs.docker.com/engine/reference/builder/#/dockerignore-file
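A small sketch of what such a file might contain is below. The entries are common examples for a Node.js project, not taken from any particular repository, and the file is written to a scratch folder so nothing in your project is overwritten.

```shell
# Write a sample .dockerignore into a scratch folder.
# The entries are common examples, not from a specific project.
context_dir="$(mktemp -d)"

cat > "$context_dir/.dockerignore" <<'EOF'
# Version control metadata is not needed inside the image.
.git
# Dependencies are reinstalled during the build.
node_modules
# Keep secrets and local logs out of the build.
.env
*.log
EOF

cat "$context_dir/.dockerignore"
```

In a real project the file sits next to the Dockerfile, at the root of the folder passed to docker build.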
Mount
Mount a local path and map it to a path within the container by passing the -v option to docker run, in host-path:container-path form:
~/Source/projecta:/usr/src/app
Java
The JVM historically looked in /proc to figure out how much memory was available so it can set its heap size based on that value.
However, containers like Docker don’t provide container-specific information in /proc because it’s a privileged folder, like /sys.
And the JVM was written before the Docker switches (-m, --memory and --memory-swap) and the Kubernetes switch (--limits), which instruct the Linux kernel to kill the process (with an OOM (Out of Memory) error) if it tries to exceed the limit specified. When -m 150M is specified on the Docker command line, the Docker daemon limits the container to 150M of RAM and 150M of swap. As a result, the process can allocate up to 300M, which explains why a process that grows past 150M is not killed by the kernel.
So Christine Flood proposed a JVM command line argument which tells the JVM ergonomics to look in /sys/fs/cgroup/memory/memory.limit_in_bytes to figure out how much memory is available:
-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap
These are added as the $JAVA_OPTIONS environment variable included in a Docker command such as:
CMD java -XX:+PrintFlagsFinal -XX:+PrintGCDetails $JAVA_OPTIONS -jar java-container.jar
When running Docker:
docker run -d --name mycontainer8g -p 8080:8080 -m 800M -e JAVA_OPTIONS='-Xmx300m' rafabene/java-container:openjdk-env
docker logs mycontainer8g | grep -i MaxHeapSize
If this patch isn’t available in the OpenJDK version you are running, you can simulate it by setting -XX:MaxRAM=n explicitly.
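One way to set -XX:MaxRAM by hand is to read the limit from the cgroup file mentioned earlier and pass it along. The sketch below is an assumption-laden illustration: the container_memory_limit function is a made-up helper, and the second path is the cgroup v2 location, which is not covered in this article.

```shell
# Print the container memory limit in bytes, or nothing if no limit
# can be read. The first path is the cgroup v1 file named above; the
# second is its cgroup v2 counterpart. An optional argument overrides
# the first file, which is handy for trying the function out.
container_memory_limit() {
  for f in "${1:-/sys/fs/cgroup/memory/memory.limit_in_bytes}" \
           /sys/fs/cgroup/memory.max; do
    if [ -r "$f" ]; then
      limit="$(cat "$f")"
      # cgroup v2 reports the word "max" when no limit is set.
      [ "$limit" != "max" ] && printf '%s\n' "$limit"
      return 0
    fi
  done
}

# Build JAVA_OPTIONS only when a limit was found. With cgroup v1 and
# no limit set, the kernel reports a very large number, which this
# sketch passes through unchanged.
limit="$(container_memory_limit)"
if [ -n "$limit" ]; then
  JAVA_OPTIONS="-XX:MaxRAM=$limit"
fi
echo "JAVA_OPTIONS=${JAVA_OPTIONS:-}"
```

The resulting $JAVA_OPTIONS value can then be used by a CMD line like the one shown above.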
Java 10 has all the improvements needed to run inside a container, so no extra options are required:
docker run -it --name mycontainer -p 8080:8080 -m 600M rafabene/java-container:openjdk10
But those staying with JDK 8u131+ or JDK 9 need to specify the experimental VM options shown earlier (-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap), which allow the JVM ergonomics to read the memory values from cgroups.
Another way to solve this problem is to use the Fabric8 base image, which detects that it is running inside a restricted container and automatically adjusts the maximum heap size if you haven’t set it yourself.
http://rafabene.com/2017/07/07/java-inside-docker/
Docker Compose
Most apps are database-driven, so we introduce a separate service for a database layer with its own data volume (storage space).
The docker-compose.yml file contains instructions to stitch multiple pieces together such as database container, application container, host folder where you store your application repository, environmental aspects such as volumes, and ports.
An example docker-compose-dev.yml file defining services:
version: '2'
services:
  web:
    image: node:6.1
    volumes:
      - ./:/usr/src/app
    working_dir: /usr/src/app
    command: sh -c 'npm install; npm install -g nodemon ; nodemon -e js,jade app.js'
    ports:
      - "80:3000"
    depends_on:
      - mongo
    networks:
      - all
    environment:
      MONGODB_URI: "mongodb://mongo:27017/hackathon"
  mongo:
    image: mongo:3
    command: mongod --smallfiles
    networks:
      - all
networks:
  all:
For the version, see https://docs.docker.com/compose
Another one:
version: '3.2'
volumes:
  postgres-data:
services:
  db:
    image: postgres
    volumes:
      - postgres-data:/var/lib/postgresql/data
  app:
    build:
      context: .
      dockerfile: Dockerfile
    command: bundle exec rails s -p 3000 -b '0.0.0.0'
    volumes:
      - .:/project
    ports:
      - "3000:3000"
    depends_on:
      - db
The depends_on: entry makes Compose start the “db” service before the app service.
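Keep in mind that this list form of depends_on only controls start order; it does not wait until the database is ready to accept connections. A common workaround is a small retry loop in the app container's startup script. The wait_for function below is a hypothetical helper sketched for this article, not part of Compose.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay_seconds> <command...>
# The function name and arguments are illustrative, not part of Compose.
wait_for() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Succeed as soon as the probe command returns zero.
    "$@" >/dev/null 2>&1 && return 0
    # Pause between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example for the "app" service above: wait for PostgreSQL on host "db"
# before starting Rails. pg_isready ships with the PostgreSQL client
# tools and is assumed to be installed in the app image.
# wait_for 30 1 pg_isready -h db -p 5432 && bundle exec rails s -p 3000 -b '0.0.0.0'
```

The probe command is passed in as arguments, so the same loop works for any service that has a command-line readiness check.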
Define attributes of the Docker host in environment variables:
- DOCKER_HOST
- DOCKER_TLS_VERIFY
- DOCKER_CERT_PATH
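As a sketch, the three variables might be set like this before running docker or docker-compose. The address and the certificate folder are placeholders to replace with your own values.

```shell
# Point the docker and docker-compose clients at a remote Docker host.
# The address and certificate folder are hypothetical placeholders.
export DOCKER_HOST="tcp://192.0.2.10:2376"     # address of the Docker daemon
export DOCKER_TLS_VERIFY="1"                   # use TLS and verify the remote host
export DOCKER_CERT_PATH="$HOME/.docker/certs"  # folder holding the TLS certificates and key

# Show what the clients will pick up from the environment.
env | grep '^DOCKER_'
```

Unsetting the variables (unset DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH) points the clients back at the local Docker Engine.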
See https://docs.docker.com/v1.11/compose/compose-file/
More resources
This tutorial is based on these and other resources:
- https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/#user details ENTRYPOINT
- http://thediscoblog.com/blog/2014/05/05/dockerfiles-in-a-jiffy/
- https://github.com/prakhar1989/docker-curriculum by prakhar1989, who was propelled to #18 on GitHub due largely to this tutorial.
- https://deis.com/blog/2015/dockerfile-instructions-syntax/
- https://runnable.com/docker/java/dockerize-your-java-application
- https://www.udemy.com/zero-to-docker/learn/v4/t/lecture/7270460?start=0
- https://github.com/schoolofdevops/voting-app-worker
- https://schoolofdevops.com
- https://hub.docker.com/u/schoolofdevops/