Archive

Posts Tagged ‘docker’
  • A Dockerfile for Maven-based Github projects



    Since the Docker “revolution”, I’ve been interested in creating a Dockerfile for Spring applications. I’m far from an ardent practitioner, but the principle of creating the Dockerfile is dead simple. However, as in Java, or probably any programming language, while it’s easy to achieve something that works, it’s much harder to create something that works well.

    == Multi-stage builds

    In Docker, one of the main issues is the size of the final image. It’s not uncommon to end up with images over 1 GB even for simple Java applications. Since version 17.05 of Docker, it’s possible to have multiple build stages in a single Dockerfile, and to copy the output of a previous stage into the current one. Those are called multi-stage builds. The final image will be based on the last build stage.

    Let’s imagine the code is hosted on Github, and that it’s based on Maven. Build stages would be as follows:

    1. Clone the code from Github
    2. Copy the folder from the previous stage; build the app with Maven
    3. Copy the JAR from the previous stage; run it with java -jar

    Here is a build file to start from:

    [source]
    ----
    FROM alpine/git
    WORKDIR /app
    RUN git clone https://github.com/spring-projects/spring-petclinic.git <1>

    FROM maven:3.5-jdk-8-alpine
    WORKDIR /app
    COPY --from=0 /app/spring-petclinic /app <2>
    RUN mvn install <3>

    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=1 /app/target/spring-petclinic-1.5.1.jar /app <4>
    CMD ["java -jar spring-petclinic-1.5.1.jar"] <5>
    ----

    It maps the above build stages:

    <1> Clone the Spring PetClinic git repository from Github
    <2> Copy the project folder from the previous build stage
    <3> Build the app
    <4> Copy the JAR from the previous build stage
    <5> Run the app

    == Improving readability

    Notice that in the previous build file, build stages are referenced via their index (starting from 0), e.g. COPY --from=0. While not a real issue, it’s always better to have something semantically meaningful. Docker allows labelling stages and referencing those labels in later stages.

    [source]
    ----
    FROM alpine/git as clone <1>
    WORKDIR /app
    RUN git clone https://github.com/spring-projects/spring-petclinic.git

    FROM maven:3.5-jdk-8-alpine as build <2>
    WORKDIR /app
    COPY --from=clone /app/spring-petclinic /app <3>
    RUN mvn install

    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=build /app/target/spring-petclinic-1.5.1.jar /app
    CMD ["java -jar spring-petclinic-1.5.1.jar"]
    ----

    <1> Labels the first stage as clone
    <2> Labels the second stage as build
    <3> References the first stage using the label

    == Choosing the right image(s)

    Multi-stage builds help tremendously with image size management, but they are not the only factor. The image to start with has a big impact on the final image. Beginners generally use full-blown operating system images, such as https://hub.docker.com/_/ubuntu/[Ubuntu^], inflating the size of the final image for no good reason. Yet, there are lightweight operating systems that are very well suited to Docker images, such as https://alpinelinux.org/[Alpine Linux^].

    NOTE: It’s also a great fit for security purposes, as the attack surface is limited.

    In the above build file, images are the following:

    1. https://hub.docker.com/r/alpine/git/[alpine/git^]
    2. https://hub.docker.com/_/maven/[maven:3.5-jdk-8-alpine^]
    3. https://hub.docker.com/_/openjdk/[openjdk:8-jre-alpine]

    They all inherit transitively or directly from https://hub.docker.com/_/alpine/[alpine].

    Image sizes are as follows:

    [source]
    ----
    REPOSITORY                  TAG            IMAGE ID       CREATED       SIZE
    nfrankel/spring-petclinic   latest         293bd333d60c   10 days ago   117MB
    openjdk                     8-jre-alpine   c4f9d77cd2a1   4 weeks ago   81.4MB
    ----

    The difference in size between the JRE and the app images is around 36 MB, which is the size of the JAR itself.
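As a quick check of that arithmetic, the difference can be computed from the two SIZE values in the listing (117MB and 81.4MB). This is only a sketch; the variable names are made up for the illustration.

```shell
# Sizes in MB, copied from the image listing
app_image=117
jre_image=81.4

# awk handles the floating-point subtraction that plain sh arithmetic cannot
jar_size=$(awk -v a="$app_image" -v j="$jre_image" 'BEGIN { printf "%.1f", a - j }')
echo "The JAR accounts for about ${jar_size} MB"
```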

    == Exposing the port

    The Spring Pet Clinic is a webapp, so the image needs to expose the HTTP port the app binds to. The relevant Docker directive is EXPOSE. I chose 8080 as the port number, to match the embedded Tomcat container, but it could be anything. The last stage should be modified as such:

    [source]
    ----
    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=build /app/target/spring-petclinic-1.5.1.jar /app
    EXPOSE 8080
    CMD ["java -jar spring-petclinic-1.5.1.jar"]
    ----

    == Parameterization

    At this point, it appears the build file can be used for building any webapp with the following features:

    1. The source code is hosted on Github
    2. The build tool is Maven
    3. The resulting output is an executable JAR file

    Of course, that suits Spring Boot applications very well, but this is not a hard requirement.

    Parameters include:

    • The Github repository URL
    • The project name
    • Maven’s artifactId and version
    • The artefact name (as it might differ from the artifactId, depending on the specific Maven configuration)

    Let’s use those to design a parameterized build file. In Docker, parameters can be declared using either the ENV or the ARG instruction. ARG values are passed with the --build-arg option on the command line, while ENV values are set in the Dockerfile itself, possibly from an ARG. Differences are the following:

    [cols="3*"]
    |===

    h| Type
    | ENV
    | ARG

    h| Found in the image
    | icon:check-square-o[]
    | icon:square-o[]

    h| Default value
    | Required
    | Optional

    |===

    [source]
    ----
    FROM alpine/git as clone
    ARG url <1>
    WORKDIR /app
    RUN git clone ${url} <2>

    FROM maven:3.5-jdk-8-alpine as build
    ARG project <3>
    WORKDIR /app
    COPY --from=clone /app/${project} /app
    RUN mvn install

    FROM openjdk:8-jre-alpine
    ARG artifactid
    ARG version
    ENV artifact ${artifactid}-${version}.jar <4>
    WORKDIR /app
    COPY --from=build /app/target/${artifact} /app
    EXPOSE 8080
    CMD ["java -jar ${artifact}"] <5>
    ----

    <1> url must be passed on the command line to set which Github repo to clone
    <2> url is replaced by the passed value
    <3> Same as <1>
    <4> artifact must be an ENV, so as to be persisted in the final app image
    <5> Use the artifact value at runtime

    == Building

    The Spring Pet Clinic image can now be built using the following command-line:

    [source]
    ----
    docker build --build-arg url=https://github.com/spring-projects/spring-petclinic.git \
      --build-arg project=spring-petclinic \
      --build-arg artifactid=spring-petclinic \
      --build-arg version=1.5.1 \
      -t nfrankel/spring-petclinic - < Dockerfile
    ----

    NOTE: Since the image doesn’t depend on the filesystem, no context needs to be passed and the Dockerfile can be piped from the standard input.

    To build another app, parameters can be changed accordingly e.g.:

    [source]
    ----
    docker build --build-arg url=https://github.com/heroku/java-getting-started.git \
      --build-arg project=java-getting-started \
      --build-arg artifactid=java-getting-started \
      --build-arg version=1.0 \
      -t nfrankel/java-getting-started - < Dockerfile
    ----
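Since both commands differ only in the repository URL, the version and the image name, the argument list can be generated. The following is a minimal sketch, not part of the original build: the function name docker_build_cmd is hypothetical, and it assumes the artifactId equals the repository name, which happens to hold for both examples. It only prints the command, so it can be tried without Docker installed.

```shell
# Hypothetical helper: print the docker build command for a Maven project on Github.
# Assumption: the Maven artifactId is the same as the repository name.
docker_build_cmd() {
  url=$1       # Git repository URL, ending in .git
  version=$2   # Maven version of the artifact
  image=$3     # name used to tag the resulting image
  project=$(basename "$url" .git)   # e.g. spring-petclinic
  printf 'docker build --build-arg url=%s --build-arg project=%s --build-arg artifactid=%s --build-arg version=%s -t %s - < Dockerfile\n' \
    "$url" "$project" "$project" "$version" "$image"
}

docker_build_cmd https://github.com/heroku/java-getting-started.git 1.0 nfrankel/java-getting-started
```

If the artifactId differs from the repository name, a fourth parameter would be needed.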

    == Running

    Running an image built with the above command is quite easy:

    [source]
    ----
    docker run -ti -p8080:8080 nfrankel/spring-petclinic
    ----

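    Unfortunately, it fails with the following error message: starting container process caused "exec: \"java -jar ${artifact}\": executable file not found in $PATH". In its JSON form, CMD treats the whole string as the name of the executable and runs it without a shell, so ${artifact} is never expanded either. The trick is to use the ENTRYPOINT directive to wrap the command in a shell. The updated Dockerfile looks as follows: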

    [source]
    ----
    FROM openjdk:8-jre-alpine
    ARG artifactid
    ARG version
    ENV artifact ${artifactid}-${version}.jar <4>
    WORKDIR /app
    COPY --from=build /app/target/${artifact} /app
    EXPOSE 8080
    ENTRYPOINT ["sh", "-c"]
    CMD ["java -jar ${artifact}"] <5>
    ----

    At this point, running the container will (finally) work.
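The failure and the fix can be reproduced in a plain shell, without Docker. This is only an analogy, assuming a POSIX sh is available: quoting the whole command as a single word mimics the JSON form of CMD, while sh -c mimics the ENTRYPOINT wrapper. The jar name comes from the Pet Clinic example, and echo stands in for the real java call.

```shell
# The variable the final image would carry as ENV
artifact=spring-petclinic-1.5.1.jar
export artifact

# JSON-form analogue: the whole string is a single word, so the shell looks for
# a program literally named 'echo java -jar ${artifact}' and finds none
if 'echo java -jar ${artifact}' 2>/dev/null; then
  exec_form=found
else
  exec_form=not-found
fi

# ENTRYPOINT ["sh", "-c"] analogue: a shell splits the words and expands the variable
shell_form=$(sh -c 'echo java -jar ${artifact}')

echo "single word: $exec_form"
echo "through sh -c: $shell_form"
```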

    NOTE: The second image uses a different port, so the command should be: docker run -ti -p8080:5000 nfrankel/java-getting-started.

    == Food for thought

    External Git:: The current build clones a repository, and hence doesn’t need to send the context to Docker. An alternative would be to clone outside the build file, e.g. in a continuous integration chain, and start from the context. That could be useful to build the image during development on developers’ machines.

    Latest version or not:: For the Git image, I used the latest version, while for the JDK and JRE images, I used a specific major version. It’s important for the Java version to be fixed to a major version, not so much for Git. It depends on the nature of the image.

    Building from master:: There’s no configuration to change branch after cloning. This is wrong, as most of the time, builds are executed on a dedicated tag, e.g. v1.0.0. Obviously, there should be an additional command, as well as an additional build argument, to check out a specific tag before the build.

    Skipping tests:: It takes an awfully long time for the Pet Clinic application to build, as it has a huge test harness. Executing those tests takes time, and Maven must go through the test phase to reach the install phase. Depending on the specific continuous integration chain, tests might have been executed earlier and needn’t be executed again during the image build.

    Maven repository download:: Spring Boot apps have a lot of dependencies. It takes a long time for the image to build, as all dependencies need to be downloaded every time for every app. There are a couple of solutions to handle that. It will probably be the subject of another post.
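The points about tags and tests could be addressed along the following lines. This is a hedged sketch that only prints the commands the clone and build stages might run; the function name stage_commands and the tag parameter are hypothetical, and v1.0.0 is a placeholder tag, not necessarily one that exists in the repository. It relies on git clone --branch accepting a tag name and on Maven's skipTests property, which compiles the tests without running them.

```shell
# Hypothetical sketch: print, without executing, the commands for building a tag
stage_commands() {
  url=$1   # Git repository URL
  tag=$2   # tag to build (would be passed as an extra build argument)
  # --branch accepts a tag name as well as a branch name
  echo "git clone --branch ${tag} --depth 1 ${url}"
  # -DskipTests compiles the tests but does not run them
  echo "mvn install -DskipTests"
}

stage_commands https://github.com/spring-projects/spring-petclinic.git v1.0.0
```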

    Categories: Development Tags: docker, container, build
  • From Vaadin to Docker, a novice's journey

    I’m a huge Vaadin fan and I’ve created a Github workshop I can demo at conferences. A common issue with this kind of workshop is that attendees have to prepare their workstations in advance… and there’s always a significant part of them who show up with not everything ready. At this point, two options are available to the speaker: either wait for every attendee to finish the preparation, which is too bad for the people who took the time to do it at home, or start anyway and lose those who aren’t ready.

    Given the current buzz around Docker, I thought it could be a very good way to make the workshop preparation quicker (only one step) and hassle-free (no problem regarding the quirks of your operating system). The steps I ask attendees to complete are the following:

    1. Install Git
    2. Install Java, Maven and Tomcat
    3. Clone the git repo
    4. Build the project (to prepare the Maven repository)
    5. Deploy the built webapp
    6. Start Tomcat

    These steps should be automated directly in Docker. As I wasted much time getting this to work, here’s the tale of my journey in achieving this (be warned, it’s quite long). If you’ve got similar use-cases, I hope it will help you get things done faster.

    Starting with Docker

    The first step was to get to know the basics about Docker. Fortunately, I had the chance to attend a Docker workshop by David Gageot at Duchess Swiss. This included both Docker installation and the basics of Dockerfiles. I assume readers likewise have a basic understanding of Docker.

    For those who don’t, browsing Docker’s official documentation is a good idea.

    Building my first Dockerfile

    The Docker image can be built with the following command, run in the directory of the Dockerfile:

    $ docker build -t vaadinworkshop .
    

    The first issue one can encounter when playing with Docker for the first time is the following error message:

    Get http:///var/run/docker.sock/v1.14/containers/json: dial unix /var/run/docker.sock: no such file or directory
    

    The reason is that one didn’t export the required environment variables displayed by the boot2docker information message. If you lost the exact data, no worry, just use the shellinit boot2docker parameter:

    $ boot2docker shellinit
    
    Writing /Users/i303869/.docker/boot2docker-vm/ca.pem:
    Writing /Users/i303869/.docker/boot2docker-vm/cert.pem:
    Writing /Users/i303869/.docker/boot2docker-vm/key.pem:
        export DOCKER_HOST=tcp://192.168.59.103:2376
        export DOCKER_CERT_PATH=/Users/i303869/.docker/boot2docker-vm
    

    Copy-pasting the export lines above will solve the issue. These can also be set in one’s .bashrc script, as it seems these values seldom change.
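Instead of copy-pasting, the export lines can be evaluated directly in the current shell. The sketch below uses a stand-in function, fake_shellinit (a made-up name), that prints the two export lines shown above, so that it runs without boot2docker. It assumes the real command writes only the export lines to its standard output, with the "Writing ..." lines being informational.

```shell
# Stand-in for `boot2docker shellinit`, printing the export lines shown above.
# With boot2docker installed, the real command would be called instead.
fake_shellinit() {
  echo 'export DOCKER_HOST=tcp://192.168.59.103:2376'
  echo 'export DOCKER_CERT_PATH=/Users/i303869/.docker/boot2docker-vm'
}

# Evaluate the printed lines in the current shell instead of copy-pasting them
eval "$(fake_shellinit)"
echo "$DOCKER_HOST"
```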

    Next in line is the following error:

    Get http://192.168.59.103:2376/v1.14/containers/json: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
    

    This error message seems to be caused by a mismatch between the versions of the client and the server. It seems to come from a bug on Mac OS X when upgrading. For a long-term solution, reinstall Docker from scratch; for a quick fix, use the --tls flag with the docker command. As it is quite cumbersome to type it every time, one can alias it:

    $ alias docker="docker --tls"
    

    My last mistake when building the image came from building the Dockerfile from a non-empty directory. Docker sends every file it finds in the directory of the Dockerfile to the Docker daemon as the build context:

    $ docker --tls build -t vaadinworkshop .
    
    Sending build context to Docker daemon Too many kB
    

    Fix: do not try this at home, and start from a directory containing the Dockerfile only.
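One way to guarantee a minimal build context is to create a dedicated directory first. This is a small sketch under the assumption that a temporary directory is acceptable; the one-line Dockerfile written here is a placeholder for the real one.

```shell
# Create a dedicated build directory that holds nothing but the Dockerfile.
# The Dockerfile content is a one-line placeholder for the real file.
context_dir=$(mktemp -d)
printf 'FROM ubuntu:utopic\n' > "$context_dir/Dockerfile"

# Count the files that would be sent as build context
file_count=$(find "$context_dir" -type f | wc -l | tr -d ' ')
echo "files in build context: $file_count"

# With Docker available, the build would then be:
#   docker build -t vaadinworkshop "$context_dir"
```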

    Starting from scratch

    Dockerfiles describe images, and images are built as a layered list of instructions. Docker images are designed around single inheritance: an image has exactly one parent. An image requiring no parent starts from scratch, but Docker provides 4 official base distributions: busybox, debian, ubuntu and centos (operating systems are generally a good start).

    Whatever you want to achieve, it is necessary to choose the right parent. Given the requirements I set for myself (Java, Maven, Tomcat and Git), I tried to find the right starting image. Many Dockerfiles are already available online on the Docker hub. The browsing app is quite good, but to be really honest, the search can really be improved.

    My intention was to use the image that matched most of my requirements, then fill the gaps. I could find no image providing Git, but I thought the dgageot/maven Dockerfile would be a nice starting point. The problem is that the base image is a busybox and provides no installer out-of-the-box (apt-get, yum, whatever). For this reason, David uses a lot of curl to get Java 8 and Maven in his Dockerfiles.

    I foolishly thought I could use a different flavor of busybox that provides the opkg installer. After a while, I accumulated many problems, solving one only leading to another. In the end, I finally decided to use the OS I was most comfortable with and to install everything myself:

    FROM ubuntu:utopic
    

    Scripting Java installation

    Installing git, maven and tomcat packages is very straightforward (if you don’t forget to use the non-interactive options) with RUN and apt-get:

    RUN apt-get update && \
        apt-get install -y --force-yes git maven tomcat8
    

    Java doesn’t fall into this nice pattern, as Oracle wants you to accept the license. Nice people did however publish it to a third-party repo. Steps are the following:

    1. Add the needed package repository
    2. Configure the system to automatically accept the license
    3. Configure the system to add un-certified packages
    4. Update the list of repositories
    5. At last, install the package
    6. Also add a package for Java 8 system configuration
    RUN echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu precise main" | tee -a /etc/apt/sources.list && \
        echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections && \
        apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886

    RUN apt-get update && \
        apt-get install -y --force-yes oracle-java8-installer oracle-java8-set-default
    

    Building the sources

    Getting the workshop’s sources and building them is quite straightforward with the following instructions:

    RUN git clone  https://github.com/nfrankel/vaadin7-workshop.git
    WORKDIR /vaadin7-workshop
    RUN mvn package
    

    The drawback of this approach is that Maven will start from a fresh repository, and thus download the Internet the first time it is launched. At first, I wanted to mount a volume from the host to the container to share the ~/.m2/repository folder to avoid this, but I noticed this could only be done at runtime through the -v option as the VOLUME instruction cannot point to a host directory.

    Starting the image

    The simplest command to start the created Docker image is the following:

    $ docker run -p 8080:8080 vaadinworkshop
    

    Do not forget the port forwarding from the container to the host, 8080 being Tomcat’s default HTTP port. Also, note that it’s not necessary to run the container as a daemon (with the -d option). The added value of running in the foreground is that the standard output of the CMD (see below) is redirected to the host. When running as a daemon, checking the logs takes an extra step, such as docker logs or executing bash in the container.

    Configuring and launching Tomcat

    Tomcat can be launched when starting the container by just adding the following instruction to the Dockerfile:

    CMD ["catalina.sh", "run"]
    

    However, trying to start the container at this point will result in the following error:

    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/common/classes], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/common], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/server/classes], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/server], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/shared/classes], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.ClassLoaderFactory validateFile
    WARNING: Problem with directory [/usr/share/tomcat8/shared], exists: [false], isDirectory: [false], canRead: [false]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.Catalina initDirs
    SEVERE: Cannot find specified temporary folder at /usr/share/tomcat8/temp
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.Catalina load
    WARNING: Unable to load server configuration from [/usr/share/tomcat8/conf/server.xml]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.Catalina initDirs
    SEVERE: Cannot find specified temporary folder at /usr/share/tomcat8/temp
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.Catalina load
    WARNING: Unable to load server configuration from [/usr/share/tomcat8/conf/server.xml]
    Nov 15, 2014 9:24:18 PM org.apache.catalina.startup.Catalina start
    SEVERE: Cannot start server. Server instance is not configured.
    

    I have no idea why, but it seems Tomcat 8 on Ubuntu is not configured in any meaningful way. Everything is available but we need some symbolic links here and there as well as creating the temp directory. This translates into the following instruction in the Dockerfile:

    RUN ln -s /var/lib/tomcat8/common $CATALINA_HOME/common && \
        ln -s /var/lib/tomcat8/server $CATALINA_HOME/server && \
        ln -s /var/lib/tomcat8/shared $CATALINA_HOME/shared && \
        ln -s /etc/tomcat8 $CATALINA_HOME/conf && \
        mkdir $CATALINA_HOME/temp
    

    The final trick is to connect the exploded webapp folder created by Maven to Tomcat’s webapps folder, where it looks for deployments:

    RUN mkdir $CATALINA_HOME/webapps && \
        ln -s /vaadin7-workshop/target/workshop-7.2-1.0-SNAPSHOT/ $CATALINA_HOME/webapps/vaadinworkshop
    

    At this point, the Holy Grail is not far away: you just have to browse the URL… if only we knew what the IP was. When running on a Mac, there’s an additional VM involved beside the host and the container. To get its IP, type:

    $ boot2docker ip
    The VM's Host only interface IP address is: 192.168.59.103
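The address can also be extracted from that output to build the URL in a script. The sketch below works on the output line copied from the run above, so that it can be tried without boot2docker; the variable names are made up for the illustration, and the port and path are those of this workshop.

```shell
# Output line copied from the `boot2docker ip` run above; with boot2docker
# installed, the real output would be captured instead.
ip_output="The VM's Host only interface IP address is: 192.168.59.103"

# Keep only the trailing dotted address
vm_ip=$(printf '%s\n' "$ip_output" | sed -n 's/.*: \([0-9][0-9.]*\)$/\1/p')
workshop_url="http://${vm_ip}:8080/vaadinworkshop/"
echo "$workshop_url"
```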
    
    

    Now, browsing http://192.168.59.103:8080/vaadinworkshop/ will bring us to the familiar workshop screen:

    Developing from there

    Everything works fine but didn’t we just forget about one important thing, like how workshop attendees are supposed to work on the sources? Easy enough, just mount the volume when starting the container:

    docker run -v /Users/<login>/vaadin7-workshop:/vaadin7-workshop  -p 8080:8080 vaadinworkshop
    

    Note that the host volume must be located under /Users and, on OS X, this requires boot2docker v. 1.3+.

    Unfortunately, this turned out to be the showstopper, as mounting an empty directory from the host to the container will not make the container’s directory available from the host. On the contrary, it will empty the container’s directory given that the host’s directory doesn’t exist… It seems there’s an issue in Docker on Mac. The installation of JHipster runs into the same problem, and proposes to use the Samba Docker folder sharing project.

    I’m afraid I was too lazy to go further at this point. However, this taught me much about Docker, its usages and use-cases (as well as OS X integration limitations). For those who are interested, you’ll find the Dockerfile below. Happy Docker!

    FROM ubuntu:utopic
    MAINTAINER Nicolas Frankel <nicolas [at] frankel (dot) ch>
    
    # Config to get to install Java 8 w/o interaction
    RUN echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu precise main" | tee -a /etc/apt/sources.list && \
    echo oracle-java8-installer shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886
    RUN apt-get update && \
    apt-get install -y --force-yes git oracle-java8-installer oracle-java8-set-default maven tomcat8
    RUN git clone https://github.com/nfrankel/vaadin7-workshop.git
    WORKDIR /vaadin7-workshop
    RUN git checkout v7.2-1
    RUN mvn package
    ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
    ENV CATALINA_HOME /usr/share/tomcat8
    ENV PATH $PATH:$CATALINA_HOME/bin
    
    # Configure Tomcat 8 directories
    RUN ln -s /var/lib/tomcat8/common $CATALINA_HOME/common && \
    ln -s /var/lib/tomcat8/server $CATALINA_HOME/server && \
    ln -s /var/lib/tomcat8/shared $CATALINA_HOME/shared && \
    ln -s /etc/tomcat8 $CATALINA_HOME/conf && \
    mkdir $CATALINA_HOME/temp && \
    mkdir $CATALINA_HOME/webapps && \
    ln -s /vaadin7-workshop/target/workshop-7.2-1.0-SNAPSHOT/ $CATALINA_HOME/webapps/vaadinworkshop
    VOLUME ["/vaadin7-workshop"]
    CMD ["catalina.sh", "run"]
    
    # docker build -t vaadinworkshop .
    # docker run -v ~/vaadin7-workshop:/vaadin7-workshop -p 8080:8080 vaadinworkshop
    
    Categories: Development Tags: docker, vaadin