I was working on an example of how to make Docker images smaller for some documentation I'm writing, and it turned out to be such a good one that I decided to write it up as a blog entry.
Every instruction in a Dockerfile adds a new layer to the resulting image, and each layer keeps the files its instruction created or changed, even if a later instruction deletes them. With some commands, that means a lot of extra overhead ends up in the image without providing any real value. A great example of this is when you install software or build it from scratch. Suppose you run individual yum commands, for example:

    FROM centos
    RUN yum -y update
    RUN yum -y install python
    RUN yum -y install ruby
    RUN yum -y install perl
    RUN yum -y clean all
This will result in an image that is 668 MB in size. While that may not seem like a lot, it can be a lot to stream over the wire. Instead, you can chain the commands together in a single RUN instruction, which creates only one layer and results in a smaller image:

    FROM centos
    RUN yum -y update && \
        yum -y install python && \
        yum -y install ruby && \
        yum -y install perl && \
        yum -y clean all

The resulting image here is only 296 MB. Still fairly large, but less than half the size. Docker version 1.13 also introduced the --squash switch for docker build (an experimental feature that has to be turned on in the daemon). This optimizes the built image by combining all the newly built layers into one final layer. There are GitHub projects that do this as well.
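
To see why chaining shrinks the image, here is a minimal sketch that isolates the effect. It is not part of the yum example above: the file path and the 100 MB size are arbitrary choices for illustration, and it assumes the base image ships GNU dd, as centos does.

```dockerfile
FROM centos

# Layer 1: create a 100 MB file (hypothetical path, for illustration only).
RUN dd if=/dev/zero of=/tmp/bigfile bs=1M count=100

# Layer 2: delete it. The file is gone from the final filesystem,
# but layer 1 still carries it, so the image does not get smaller.
RUN rm /tmp/bigfile

# Chained alternative: the file is created and removed inside one layer,
# so it never ends up in the image. Replace the two RUN lines above with:
# RUN dd if=/dev/zero of=/tmp/bigfile bs=1M count=100 && rm /tmp/bigfile
```

If you build both variants, docker images should show the first one roughly 100 MB larger than the second, and docker history on the first should show that size attributed to the dd layer. The same thing happens with yum: cleaning the cache only saves space if it runs in the same RUN instruction that filled it.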
Docker also recently introduced the multi-stage Dockerfile (as of version 17.05). This allows you to run a bunch of instructions in one stage and then copy a result (like a binary that was compiled) into the final image. A multi-stage Dockerfile has multiple FROM instructions, one for each stage in the image build pipeline. For example, imagine you had a Go application (hello.go) that you wanted to build into an image. Without multiple stages, you could construct the Dockerfile like this:

    FROM alpine:latest
    RUN apk update && apk add libc-dev && apk add go
    COPY hello.go /app/src/hello.go
    RUN go build -o /app/hello /app/src/hello.go
    CMD /app/hello

This, however, will create an image almost 300 MB in size, even for the simplest Go program (the alpine image is only 4.15 MB of that). Using a multi-stage Dockerfile, we could instead do the build like this:

    FROM golang:1.7.3
    WORKDIR /go/src/
    COPY hello.go .
    RUN go build -o /go/hello /go/src/hello.go

    FROM alpine:latest
    COPY --from=0 /go/hello /app/hello
    CMD ["/app/hello"]

This results in an image that is only 5.8 MB, about 50x smaller!
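
The multi-stage Dockerfile above refers to the build stage by its index (--from=0). Stages can also be given a name with AS, which tends to hold up better if stages are later added or reordered. Here is a sketch of the same build with a named stage; the name "builder" is an arbitrary choice for this example, not something the build requires.

```dockerfile
# Build stage: "builder" is an arbitrary name picked for this sketch.
FROM golang:1.7.3 AS builder
WORKDIR /go/src/
COPY hello.go .
RUN go build -o /go/hello /go/src/hello.go

# Final stage: only the compiled binary is copied in, so the Go
# toolchain from the build stage never reaches the final image.
FROM alpine:latest
COPY --from=builder /go/hello /app/hello
CMD ["/app/hello"]
```

This should produce the same image as the index-based version; only the way the first stage is referenced changes.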