Layering your Docker images with dependencies

For several years now, this site has used the Jekyll static site generator. Generating the complete site takes about 15 minutes. Some time ago, I set up Gulp to minify the HTML in order to speed up page loads for readers. This change added about 10 minutes to the build time, for a total of ~25 minutes, which is a long time... too long. I wanted to decrease the build time, so I decided to play around.

The initial state

A Docker image is used to build the site. The image is based on the JRuby image: Ruby is the basic technology stack for Jekyll, and Java is necessary for the PlantUML plugin for AsciiDoctor.

In the image, the following stacks are set up:

FROM jruby:9

RUN gem install bundler \                                       (1)
    && curl -sL https://deb.nodesource.com/setup_9.x | bash - \ (2)
    && apt-get install -y nodejs graphviz \                     (3)
    && npm install -g gulp-cli                                  (4)
1 Install Bundler
2 Add the Node.js repository to APT
3 Install Node.js, and Graphviz, which is a PlantUML dependency
4 Install the Gulp command-line interface

This image is uploaded to a private Docker registry, to be used by the site generation.

The GitLab build file installs both the required Gems and NPM packages. It looks like the following:

image: registry.gitlab.com/nfrankel/nfrankel.gitlab.io:latest

before_script:
  - bundle install                        (1)
  - npm set progress=false && npm install (2)

pages:
  stage: deploy
  script: gulp
  only:
    - master
1 Install Gems from the Gemfile
2 Install NPM packages from the package.json

The issue is that every time the site is built, the Gems and NPM packages need to be installed again. Even though the site is built at least once a week, the dependencies are pretty stable, and yet the installation cost is paid on every build.

Layering all the things!

It seems like a good idea to move the above dependencies from the build file to a Docker layer.

The first step is to move the previous image to another location, e.g. registry.gitlab.com/nfrankel/nfrankel.gitlab.io/base:latest.
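The move can be done by pulling the existing image, tagging it under the new name and pushing it again. A minimal sketch, assuming the Docker CLI is already logged in to the registry; the `retag_base_image` function and the `DOCKER` override variable are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# Hypothetical helper: DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"
REPO="registry.gitlab.com/nfrankel/nfrankel.gitlab.io"

retag_base_image() {
    # Fetch the current image, give it the new name, then upload it.
    "$DOCKER" pull "$REPO:latest" &&
    "$DOCKER" tag "$REPO:latest" "$REPO/base:latest" &&
    "$DOCKER" push "$REPO/base:latest"
}

# Usage (not executed here):
#   retag_base_image
```

The old tag stays in place, so existing builds keep working until the build file is switched over.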

The second step is to create a second Dockerfile in the project folder:

FROM registry.gitlab.com/nfrankel/nfrankel.gitlab.io/base:latest

ADD Gemfile /builds/nfrankel/nfrankel.gitlab.io/Gemfile                     (1)
ADD Gemfile.lock /builds/nfrankel/nfrankel.gitlab.io/Gemfile.lock           (1)
ADD package.json /builds/nfrankel/nfrankel.gitlab.io/package.json           (1)
ADD package-lock.json /builds/nfrankel/nfrankel.gitlab.io/package-lock.json (1)
RUN cd /builds/nfrankel/nfrankel.gitlab.io \
    && bundle install \                                                     (2)
    && npm set progress=false \
    && npm install                                                          (3)
1 Copy the necessary files from the build context into the image
2 Install Gems
3 Install NPM packages

Experienced Docker users have probably realized that the whole folder’s content is sent to the daemon when the image is built, although only the four files above are necessary.

To prevent that, the third step is to create a .dockerignore file to send only those files:

**                 (1)
!Dockerfile        (2)
!Gemfile           (3)
!Gemfile.lock      (3)
!package.json      (3)
!package-lock.json (3)
1 Ignore every file
2 Add the Dockerfile itself
3 Add required files

The dependencies' image can now be built:

docker build -t registry.gitlab.com/nfrankel/nfrankel.gitlab.io/withdeps:latest .
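Because the installed dependencies are now baked into the image, it has to be rebuilt whenever one of the four files changes. One possible way to detect that is to hash them and compare the result with the value stored at the last build. This is only a sketch, assuming GNU `sha256sum` is available; the `deps_fingerprint` function and the `.deps-fingerprint` file are hypothetical names, not part of the original setup:

```shell
#!/bin/sh
# Hypothetical helper: print one SHA-256 hash covering the four dependency files.
# Must be run from the project folder that contains them.
deps_fingerprint() {
    # Concatenate the files in a fixed order, hash the stream, keep only the hash.
    cat Gemfile Gemfile.lock package.json package-lock.json \
        | sha256sum \
        | cut -d ' ' -f 1
}

# Usage (not executed here): rebuild only when the fingerprint has changed.
#   new="$(deps_fingerprint)"
#   if [ "$new" != "$(cat .deps-fingerprint 2>/dev/null)" ]; then
#       docker build -t registry.gitlab.com/nfrankel/nfrankel.gitlab.io/withdeps:latest . \
#           && echo "$new" > .deps-fingerprint
#   fi
```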

Finally, let’s update the GitLab build file:

image: registry.gitlab.com/nfrankel/nfrankel.gitlab.io/withdeps:latest (1)

                                                                       (2)
pages:
  stage: deploy
  script: gulp
  only:
    - master
1 Reference the Docker image with the dependencies
2 The dependency installation steps are removed

Conclusion

First things first: the new build didn’t provide any noticeable speed improvement... The main reason is that most of the build time is spent minifying the HTML, not fetching the dependencies. On the bright side, the build log file is much shorter and easier to analyze.
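To check where a build actually spends its time before optimizing it, each step can be timed separately. A minimal sketch, assuming a POSIX shell; `time_step` is a hypothetical helper, not part of the original build:

```shell
#!/bin/sh
# Hypothetical helper: run a command and print how many whole seconds it took.
time_step() {
    label="$1"
    shift
    start="$(date +%s)"
    "$@"                      # run the actual step
    status=$?
    end="$(date +%s)"
    echo "$label: $((end - start))s"
    return "$status"          # propagate the step's exit code
}

# Usage (not executed here), mirroring the steps of the original build file:
#   time_step "gems" bundle install
#   time_step "npm"  npm install
#   time_step "site" gulp
```

Comparing the three figures shows which step dominates, and therefore which one is worth optimizing first.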

Of course, this process is not specific to Jekyll: it can be applied to Maven builds as well.

Nicolas Fränkel

Nicolas Fränkel is a Developer Advocate with 15+ years of experience consulting for many different customers, in a wide range of contexts (such as telecoms, banking, insurance, large retail and the public sector). He usually works on Java/Java EE and Spring technologies, with focused interests such as Rich Internet Applications, Testing, CI/CD and DevOps. He currently works for Hazelcast, doubles as a teacher in universities and higher education schools and as a trainer, and triples as a book author.
