• Synthetic

    Plastic straw tubes

    There are a bunch of languages running on the JVM, from Java of course, to Clojure and JRuby. They all have different syntaxes, but it’s awesome that they all compile to the same bytecode. The JVM unites them all. Of course, it’s biased toward Java, but even in Java, there is some magic happening in the bytecode.

    The most well-known trick comes from the following code:

    public class Foo {
        static class Bar {
            private Bar() {}
        }
    
        public static void main(String... args) {
            new Bar();
        }
    }

    Can you guess how many constructors the Bar class has?

    Two. Yes, you read that right. For the JVM, the Bar class declares 2 constructors. Run the following code if you don’t believe it:

    Class<Foo.Bar> clazz = Foo.Bar.class;
    Constructor<?>[] constructors = clazz.getDeclaredConstructors();
    System.out.println(constructors.length);
    Arrays.stream(constructors).forEach(constructor -> {
        System.out.println("Constructor: " + constructor);
    });

    The output is the following:

    2
    Constructor: private Foo$Bar()
    Constructor: Foo$Bar(Foo$1)

    The reason is pretty well documented. The bytecode knows about access modifiers, but not about nested classes. In order for the Foo class to be able to create new Bar instances, the Java compiler generates an additional constructor with package-private visibility.

    This can be confirmed with the javap tool.

    javap -v out/production/synthetic/Foo\$Bar.class

    This outputs the following:

    [...]
    {
      Foo$Bar(Foo$1);
        descriptor: (LFoo$1;)V
        flags: ACC_SYNTHETIC
        Code:
          stack=1, locals=2, args_size=2
             0: aload_0
             1: invokespecial #1                  // Method "<init>":()V
             4: return
          LineNumberTable:
            line 2: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0       5     0  this   LFoo$Bar;
                0       5     1    x0   LFoo$1;
    }
    [...]

    Notice the ACC_SYNTHETIC flag. Going to the JVM specifications yields the following information:

    The ACC_SYNTHETIC flag indicates that this method was generated by a compiler and does not appear in source code, unless it is one of the methods named in §4.7.8.
    — The class File Format
    https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.6

    In theory, it should be possible to call this generated constructor - leaving aside the fact that providing an instance of Foo$1 is not exactly straightforward. In practice, though, the IDE doesn’t offer this second constructor in code completion. I didn’t find an explicit reference in the Java Language Specification, but synthetic classes and members cannot be accessed directly from source code; they can only be reached through reflection.
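
    Out of curiosity, the synthetic constructor can indeed be invoked through reflection. The following is a quick sketch (the class name is mine); since the generated constructor never uses its Foo$1 argument, passing null is good enough:

    import java.lang.reflect.Constructor;

    public class CallSynthetic {

        public static void main(String... args) throws Exception {
            for (Constructor<?> constructor : Foo.Bar.class.getDeclaredConstructors()) {
                if (constructor.isSynthetic()) {
                    // Bypass the package-private access check
                    constructor.setAccessible(true);
                    // The Foo$1 parameter is ignored by the constructor body, so null works
                    Object bar = constructor.newInstance(new Object[] { null });
                    System.out.println("Created " + bar);
                }
            }
        }
    }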

    At this point, one could wonder why all the fuss about the synthetic flag. It was introduced in Java to solve the problem of nested class access. But other JVM languages use it to implement their own specifications. For example, Kotlin uses synthetic methods to access the companion object:

    class Baz() {
        companion object {
            val BAZ = "baz"
        }
    }

    Executing javap on the .class file returns the following output (abridged for readability purposes):

    {
      public static final Baz$Companion Companion;
        descriptor: LBaz$Companion;
        flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL
    
      public Baz();
      [...]
    
      public static final java.lang.String access$getBAZ$cp();
        descriptor: ()Ljava/lang/String;
        flags: ACC_PUBLIC, ACC_STATIC, ACC_FINAL, ACC_SYNTHETIC
        Code:
          stack=1, locals=0, args_size=0
             0: getstatic     #22                 // Field BAZ:Ljava/lang/String;
             3: areturn
          LineNumberTable:
            line 1: 0
        RuntimeInvisibleAnnotations:
          0: #15()
    }
    [...]

    Notice the access$getBAZ$cp() static method? That’s the synthetic accessor through which the companion object reads the private BAZ field. From Java code, the property is reached through the Companion singleton:

    public class FromJava {
    
        public static void main(String... args) {
            Baz.Companion.getBAZ();
        }
    }

    Conclusion

    While knowledge of the synthetic flag is not required in the day-to-day work of a JVM developer, it can be helpful to understand some of the results returned by the reflection API.

    Categories: Java Tags: JVM, bytecode, javap, Kotlin
  • Flavors of Spring application context configuration

    Spring framework logo

    Every now and then, there’s an angry post or comment bitching about how the Spring framework is full of XML, how terrible and verbose it is, and how the author would never use it because of that. Of course, that is complete crap. First, when Spring was created, XML was pretty hot. J2EE deployment descriptors (yes, that was the name at the time) were XML-based.

    Anyway, it’s 2017 folks, and there are multiple ways to skin a cat. This article aims at listing the different ways a Spring application context can be configured so as to enlighten the aforementioned crowd - and stop the trolling around Spring and XML.

    XML

    XML was the first way to configure the Spring application context. Basically, one creates an XML file with a dedicated namespace. It’s very straightforward:

    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.springframework.org/schema/beans 
               http://www.springframework.org/schema/beans/spring-beans.xsd">
        <bean id="foo" class="ch.frankel.blog.Foo">
            <constructor-arg value="Hello world!" />
        </bean>
        <bean id="bar" class="ch.frankel.blog.Bar">
            <constructor-arg ref="foo" />
        </bean>
    </beans>
    

    The next step is to create the application context, using one of the dedicated classes:

    ApplicationContext ctx = new ClassPathXmlApplicationContext("ch/frankel/blog/context.xml");
    ApplicationContext ctx = new FileSystemXmlApplicationContext("/opt/app/context.xml");
    ApplicationContext ctx = new GenericXmlApplicationContext("classpath:ch/frankel/blog/context.xml");
    

    XML’s declarative nature enforces simplicity at the cost of extra verbosity. It’s orthogonal to the code - completely independent from it. Before JavaConfig came along, I still favored XML over self-annotated classes.

    Self-annotated classes

    As with every new feature/technology, when Java 5 introduced annotations, there was a rush to use them. In essence, a self-annotated class is auto-magically registered into the application context.

    To achieve that, Spring provides the @Component annotation. However, to improve semantics, there are also dedicated annotations to differentiate between the 3 standard layers of the layered architecture principle:

    • @Controller
    • @Service
    • @Repository

    This is also quite straightforward:

    @Component
    public class Foo {
    
        public Foo(@Value("Hello world!") String value) { }
    }
    
    @Component
    public class Bar {
    
        @Autowired
        public Bar(Foo foo) { }
    }
    

    To scan for self-annotated classes, a dedicated application context is necessary:

    ApplicationContext ctx = new AnnotationConfigApplicationContext("ch.frankel.blog");
    

    Self-annotated classes are quite easy to use, but there are some downsides:

    • A self-annotated class becomes dependent on the Spring framework. For a framework based on dependency injection, that’s quite a problem.
    • Usage of self-annotations blurs the boundary between the class and the bean. As a consequence, the class cannot be registered multiple times, under different names and scopes, into the context (see the snippet right after this list).
    • Self-annotated classes require autowiring, which has downsides on its own.
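
    To make the second point concrete, here’s a sketch of the same class registered twice, under different names, once the configuration lives outside the class - using the JavaConfig style introduced in the next section (the bean names are made up):

    @Configuration
    public class DuplicateConfiguration {

        @Bean
        public Foo englishFoo() {
            return new Foo("Hello world!");
        }

        @Bean
        public Foo frenchFoo() {
            return new Foo("Bonjour le monde!");
        }
    }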

    Java configuration

    Given the above problems with self-annotated classes, the Spring framework introduced a new way to configure the context: JavaConfig. In essence, JavaConfig configuration classes replace XML files, but with compile-time safety instead of XML-schema runtime validation. This is based on two annotations: @Configuration for classes and @Bean for methods.

    The equivalent of the above XML is the following snippet:

    @Configuration
    public class JavaConfiguration {
    
        @Bean
        public Foo foo() {
            return new Foo("Hello world!");
        }
    
        @Bean
        public Bar bar() {
            return new Bar(foo());
        }
    }
    

    JavaConfig classes can be scanned like self-annotated classes:

    ApplicationContext ctx = new AnnotationConfigApplicationContext("ch.frankel.blog");
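
    Note that, instead of scanning a package, a configuration class can also be registered explicitly - AnnotationConfigApplicationContext accepts annotated classes as constructor arguments:

    ApplicationContext ctx = new AnnotationConfigApplicationContext(JavaConfiguration.class);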
    

    JavaConfig is the way to go to configure Spring applications: it’s orthogonal to the code and brings some degree of compile-time validation.

    Groovy DSL

    Spring 4 added a way to configure the context via a Groovy Domain-Specific Language. The configuration takes place in a Groovy file, with the beans element as its root.

    beans {
        foo Foo, 'Hello world!'
        bar Bar, foo
    }
    

    There’s an associated application context creator class:

    ApplicationContext ctx = new GenericGroovyApplicationContext("ch/frankel/blog/context.groovy");
    

    I’m not a Groovy developer, so I never used that option. But if you are, it makes a lot of sense.

    Kotlin DSL

    Groovy was unceremoniously kicked out of the Pivotal portfolio some time ago. There is no correlation, but Kotlin has since found its way in. It’s no wonder that the upcoming release of Spring 5 provides a Kotlin DSL.

    package ch.frankel.blog
    
    fun beans() = beans {
        bean { Foo("Hello world!") }
        bean { Bar(ref()) }
    }
    

    Note that while bean declaration is explicit, wiring is implicit, as in JavaConfig @Bean methods with dependencies.

    Unlike the configuration flavors mentioned above, the Kotlin DSL needs an existing context to register beans in:

    import ch.frankel.blog.beans
    
    fun register(ctx: GenericApplicationContext) {
        beans().invoke(ctx)
    }
    

    I haven’t used the Kotlin DSL except to play a bit with it for a demo, so I can’t say much for sure about its pros and cons.

    Conclusion

    So far, the JavaConfig alternative is my favorite: it’s orthogonal to the code and provides some degree of compile-time validation. As a Kotlin enthusiast, I’m also quite eager to try the Kotlin DSL in large projects to experience its pros and cons first-hand.

  • On exceptions

    When Java came out some decades ago, it was pretty innovative at the time. In particular, its exception-handling mechanism was a great improvement over what C/C++ offered. For example, in order to read from a file, a lot of things can go wrong: the file can be absent, it can be read-only, etc.

    The associated Java-like pseudo-code would be akin to:

    File file = new File("/path");
    
    if (!file.exists()) {
        System.out.println("File doesn't exist");
    } else if (!file.canRead()) {
        System.out.println("File cannot be read");
    } else {
        // Finally read the file
        // Depending on the language
        // This could span several lines
    }
    

    The idea behind separate try and catch blocks was to keep business code apart from exception-handling code.

    try {
        File file = new File("/path");
        // Finally read the file
        // Depending on the language
        // This could span several lines
    } catch (FileNotFoundException e) {
        System.out.println("File doesn't exist");
    } catch (FileNotReadableException e) {
        System.out.println("File cannot be read");
    }
    

    Of course, the above code is useless standalone. It probably makes up the body of a dedicated method for reading files.

    public String readFile(String path) {
        if (!file.exists()) {
            return null;
        } else if (!file.canRead()) {
            return null;
        } else {
            // Finally read the file
            // Depending on the language
            // This could span several lines
            return content;
        }
    }
    

    One problem with the above approach is that the error branches return null. Hence:

    1. Calling code needs to check for null values every time
    2. There’s no way to know whether the file was not found, or if it was not readable.

    Using a more functional approach fixes the first issue and hence allows methods to be composed:

    public Optional<String> readFile(String path) {
        if (!file.exists()) {
            return Optional.empty();
        } else if (!file.canRead()) {
            return Optional.empty();
        } else {
            // Finally read the file
            // Depending on the language
            // This could span several lines
            return Optional.of(content);
        }
    }
    

    Sadly, it changes nothing about the second problem. Followers of a purely functional approach would probably discard the previous snippet in favor of something like that:

    public Either<String, Failure> readFile(String path) {
        if (!file.exists()) {
            return Either.right(new FileNotFoundFailure(path));
        } else if (!file.canRead()) {
            return Either.right(new FileNotReadableFailure(path));
        } else {
            // Finally read the file
            // Depending on the language
            // This could span several lines
            return Either.left(content);
        }
    }
    

    And presto, there’s a nice improvement over the previous code. It’s now more meaningful, as it tells exactly why it failed (if it does), thanks to the right part of the return value.

    Unfortunately, one issue remains, and not a small one. What about the calling code? It needs to handle the failure - or, more probably, pass it to its own calling code, and so on, and so forth, up to the topmost layer. For me, that makes it impossible to consider the exception-free functional approach an improvement.
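
    To make the propagation issue concrete, here’s a sketch of what a caller of readFile() could look like - assuming the Either type above exposes isRight(), getLeft() and getRight() accessors (the names are illustrative):

    public Either<Integer, Failure> countLines(String path) {
        Either<String, Failure> result = readFile(path);
        if (result.isRight()) {
            // Nothing meaningful can be done here: pass the failure up the call stack
            return Either.right(result.getRight());
        }
        return Either.left(result.getLeft().split("\n").length);
    }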

    For example, this is what happens in Go:

        items, err := todo.ReadItems(file)
        if err != nil {
            fmt.Println(err)
        }
    

    This is pretty fine if the code ends here. But otherwise, err has to be passed to the calling code, and all the way up, as described above. Of course, there’s the panic keyword, but it seems it’s not the preferred way to handle errors.

    The strangest part is that this is exactly what people complain about with Java’s checked exceptions: they have to be handled at the exact location where they appear, or the method signature has to be changed to propagate them.

    For this reason, I’m all in favor of unchecked exceptions. The only downside of those is that they break purely functional programming - throwing exceptions is considered a side-effect. Unless you’re working with a purely functional approach, there is no incentive to avoid unchecked exceptions.

    Moreover, languages and frameworks may provide hooks to handle exceptions at the topmost level. For example, on the JVM, they include:

    • In the JDK, Thread.setDefaultUncaughtExceptionHandler()
    • In Vaadin, VaadinSession.setErrorHandler()
    • In Spring MVC, @ExceptionHandler
    • etc.
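
    As an illustration of the first hook - plain JDK, no framework required - a default handler can be registered once at startup, and every exception that bubbles up unhandled ends up there:

    public class Main {

        public static void main(String... args) {
            Thread.setDefaultUncaughtExceptionHandler((thread, e) ->
                System.err.println("Uncaught in " + thread.getName() + ": " + e));

            // Never caught locally: bubbles up to the handler above
            throw new IllegalStateException("Something went wrong");
        }
    }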

    This way, you can let your exceptions bubble up to the place they can be handled in the way they should. Embrace (unchecked) exceptions!

    Categories: Development Tags: software design
  • You're not a compiler!

    Passed exam

    At conferences, it’s common to get gifts at booths. In general, you need to pass some sort of challenge first.

    An innocent puzzle

    Some of those challenges involve answering a code puzzle, like the following:

    What would be the output of running the following snippet:

    public class Sum {
    
        public static void main(String[] args) {
            System.out.println(0123 + 3210);
        }
    }
    1. 3293
    2. 3333
    3. The code doesn’t compile
    4. The code throws an IllegalArgumentException
    5. None of the above

    My usual reaction is either to leave the booth immediately, or if I’ve some time, to write the code down on my laptop to get the result (or search on Google). You could say I’m lazy but I prefer to think I’m an expert at optimizing how I spend my time.

    Simple but useless

    There is a single advantage to this approach: anybody can grade the answers - provided the correct results are supplied. And that’s all.

    Otherwise, this kind of puzzle brings absolutely nothing to the table. This wouldn’t be an issue if it stayed on conference booths, as a "fun" game during breaks. Unfortunately, these puzzles have become pretty ubiquitous, used to filter out candidates during recruitment or to grant certifications.

    In both cases, skills need to be assessed, and the easy solution is a multiple-choice test. Of course, if one has access to neither an IDE nor Google, then it becomes a real challenge. But in the case of the above sample, what’s being checked? The knowledge of a very specific corner case - namely, that an integer literal with a leading zero is octal, so 0123 is 83 and the sum prints 3293. This is exactly the kind of usage I would disallow on my projects for that very reason: it’s a corner case! And I want my code to be as free of corner cases as possible.

    Another sample

    This is also another kind of puzzle you can stumble upon:

    In the class javax.servlet.http.HttpServlet, what’s the correct signature of the service() method?

    1. public void service(ServletRequest, ServletResponse)
    2. public void service(HttpServletRequest, HttpServletResponse)
    3. public int service(ServletRequest, ServletResponse)
    4. public int service(HttpServletRequest, HttpServletResponse)
    5. protected void service(ServletRequest, ServletResponse)
    6. protected void service(HttpServletRequest, HttpServletResponse)
    7. protected int service(ServletRequest, ServletResponse)
    8. protected int service(HttpServletRequest, HttpServletResponse)

    This kind of question asserts nothing but the capacity to remember something that the most basic IDE - or the Javadoc - provides.

    In fact, the complete title of this post should have been "You’re not a compiler, nor a runtime, nor the documentation".

    The core issue

    There is not a single proof of correlation between correctly answering any of the above questions and real-world proficiency in the skills they are meant to assess. Is there an alternative? Yes. Don’t check shallow knowledge; check what you actually want to assess, under real-world conditions, including IDEs, documentation, Google, etc.

    • You want to check if candidates are able to read code? Make them read your own code - or similar code if your code is secret.
    • You want to check if candidates are able to write code? Make them write code e.g. fix an existing or already-fixed bug.

    The same can be done in the context of certifications.

    Of course, the downside is that it takes time. Time to prepare the material. Time to analyze what the candidate produced. Time to debrief the candidate.

    Is this feasible? Unfortunately for those who would cast it aside as a utopia, it already exists. The best example is the Java EE Enterprise Architect certification. Though the first step is a multiple-choice test, the other steps include a project-like assignment and an associated essay.

    Another example is much more personal. I work as a part-time lecturer in public higher-education institutions. In some courses, I evaluate students by giving them an assignment: develop a very small application. The assignment takes the form of specifications, as a business analyst would write them.

    Conclusion

    For the sake of our industry, let’s stop pretending puzzles are worth anything beyond having some fun with colleagues. You should be more concerned about the consequences of a bad hire than about the time spent assessing the skills of a potential hire. There are alternatives - use them. Or face the consequences.

    Categories: Miscellaneous Tags: hiring, recruiting, certification
  • Rise and fall of JVM languages

    Rising trend on a barchart

    Every now and then, there’s a post predicting the death of the Java language. The funny thing is that none of them ever commits to a date. But to be honest, they are probably all true. This is the fate of every language: to disappear into oblivion - or more precisely, to be used less and less for new projects. The question is, what will replace them?

    Last week saw another such article on InfoQ. At least this one named a possible replacement: Kotlin. It got me thinking about the state of JVM languages and their trends. Note that trends have nothing to do with the technical merits and flaws of each language.

    I started developing in Java in late 2001. At that time, Java was really cool. Every young developer wanted to work on so-called new technologies, either .Net or Java, as older developers were stuck on Cobol. I had studied C and C++ in school, and memory management in Java was so much easier. I was happy with Java… but not everyone was.

    Groovy came into existence in 2003. I don’t remember at what point I learned about it. I just ignored it: I had no need for a scripting language then. In the context of developing enterprise-grade applications with a long lifespan and a team of many developers, static typing was a huge advantage over dynamic typing. Having to create tests to check the type system was a net loss. The only time I had to create scripts was as a WebSphere administrator: the choice was between Python and TCL.

    Scala was created one year later, in 2004. I don’t remember when or how I heard about it, but it was much later. In contrast to Groovy, though, I decided to give it a try. The main reason was my long-standing interest in writing "better" code - read: more readable and maintainable. Scala being statically typed, it was more in line with what I was looking for. I followed the Coursera course Principles of Functional Programming in Scala. It had three main consequences:

    • It questioned my way to write Java code. For example, why did I automatically generate getters and setters when designing a class?
    • I decided Scala made it too easy to write code unreadable for most developers - including myself
    • I started looking for other alternative languages

    After Groovy and Scala came the second generation (third if you count Java as the first) of JVM languages: Ceylon, Kotlin and Xtend, among others.

    After a casual glance at them, I became convinced they had not much traction and were not worth investing my time in.

    Some years ago, I decided to teach myself basic Android to be able to understand the development context of mobile developers. Oh boy! After years of developing Java EE and Spring applications, that was a surprise - and not a pleasant one. It was like being sent back a decade into the past. The Android API is so low level… not to even mention testing the app locally. After a quick search around, I found that Kotlin was mentioned in a lot of places, and finally decided to give it a try. I fell in love immediately: with Kotlin, I could improve the existing crap API into something better, even elegant, thanks to extension functions. I dug deeper into the language and started using Kotlin for server-side projects as well. Then, the Spring framework announced its integration of Kotlin. And at Google I/O, Google announced its support of Kotlin on Android.

    This post is about my own personal experience and the opinion formed from it. If you prefer the comfort of reading only posts you agree with, feel free to stop reading at this point.

    Apart from my own experience, what is the current state of those languages? I ran a quick search on Google Trends.

    Overview of JVM languages on Google Trends

    There are a couple of interesting things to note:

    • Google has recognized dedicated search topics (i.e. "Programming Language") for Scala, Groovy and Kotlin, but not for Ceylon and Xtend. For Ceylon, I can only assume it’s because Ceylon is also a popular location. For Xtend, I’m afraid there are just not enough Google searches.
    • Scala is by far the most popular, followed by Groovy and Kotlin. I have no real clue about the scale.
    • The Kotlin peak in May is correlated with Google’s support announcement at Google I/O.
    • Most searches for Scala and Kotlin originate from China, Groovy is much more balanced regarding locations.
    • Scala searches strongly correlate with the term "Spark", Kotlin searches with the term "Android".

    Digging a bit further may uncover interesting facts:

    • Xtend is not dead, because it was never really alive. I have never read a post about it, nor listened to a conference talk on it.
    • In 2017, Red Hat gave Ceylon to the Eclipse Foundation, creating Eclipse Ceylon. A private actor giving away software to a foundation might be interpreted in different ways. In this case, and despite all the reassuring talk around the move, this is not a good sign for the future of Ceylon.
    • In 2015, Pivotal stopped sponsoring Groovy and it moved to the Apache Software Foundation. While I believe Groovy has a wide enough support base, and a unique niche on the JVM - scripting - it’s also not a good sign. This correlates with the commit frequency of the core Groovy committers: their number of commits has drastically decreased, to the point of stopping altogether for some.
    • Interestingly enough, both Scala and Kotlin recently invaded other spaces, transpiling to JavaScript and compiling to native.
    • In Java, JEP 286 is a proposal to enhance the language with type inference, a feature already provided by Scala and Kotlin. It’s however limited to local variables only (see the snippet after this list).
    • Interestingly enough, there are efforts to improve Scala compilation time by keeping only a subset of the language. Which raises the question: why keep Scala if you ditch its most powerful features (such as macros)?
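
    For reference, here is the kind of code JEP 286 aims to allow - the var syntax below is taken from the proposal:

    import java.util.ArrayList;

    public class Jep286 {

        public static void main(String... args) {
            // The compiler infers ArrayList<String> from the right-hand side
            var messages = new ArrayList<String>();
            messages.add("Hello world!");
            System.out.println(messages);
        }
    }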

    I’m not great at forecasting, but here are my 2 cents:

    1. Groovy has its own niche - scripting, which leaves Java, Scala and Kotlin vying for the pure application development space on the server-side JVM.
    2. Scala has also carved out its own space. Scala developers generally consider the language superior to Java (or Kotlin) and won’t migrate to another one. However, thanks to the Spring and Google announcements, Kotlin may replace Scala as the language developers go to when they are dissatisfied with Java.
    3. Kotlin has won the Android battle. Scala ignored this area in the past, and won’t invest in it in the future given Kotlin is so far ahead in the game.
    4. Kotlin’s rise on mobile was not a planned move, but rather a nice and unexpected surprise. JetBrains used it as a way forward as soon as they noticed the trend.
    5. Kotlin interoperability with Java is the killer feature that may convince managers to migrate legacy projects to or start new projects with Kotlin. Just as non-breaking backward compatibility was for Java.

    I’d be very much interested in your opinion, dear readers, even (especially?) if you disagree with the above. Just please be courteous and provide facts when you can to prove your points.

    Categories: Miscellaneous Tags: trend analysis, Kotlin, Xtend, Scala, Groovy, Ceylon
  • Automating generation of Asciidoctor output

    Asciidoctor logo

    I’ve been asked to design new courses for the next semester at my local university. Historically, I have created course slides with PowerPoint and lab instructions with Word, but there are other possible alternatives for slides and documents.

    I also recently wrote some lab documents with Asciidoctor, versioned with Git. The build process generates HTML pages, which are hosted on Github Pages. The build is triggered on every push.

    This solution has several advantages for me:

    • I’m in full control of the media. Asciidoctor files are kept on my hard drive, and any other Git repository I want, public or private, in the cloud or self-hosted.
    • The Asciidoctor format is really great for writing documentation.
    • With one source document, there can be multiple output formats (e.g. HTML, PDF, etc.).
    • Hosted on Github/GitLab Pages, HTML generation can be automated on every push to the remote.

    Starting point

    The content should be released under a Creative Commons license. That means there’s no requirement for a private repository. In addition, a public repository allows students to make pull requests. Both Github and GitLab provide free page hosting. Since developers are in general more familiar with Github than with GitLab, the former will be the platform of choice.

    Content is made from courses and labs:

    • Courses should be displayed as slides
    • Labs as standard HTML documents

    Setup

    Setup differs slightly for courses and labs.

    Courses

    To display slides, let’s use Reveal.js. Fortunately, Asciidoctor Reveal.js makes it possible to generate slides from Asciidoctor sources. With Ruby, the setup is pretty straightforward, thanks to the documentation:

    1. Install bundler:
      gem install bundler
    2. Create a Gemfile with the following content:
      source 'https://rubygems.org'
      
      gem 'asciidoctor-revealjs'
    3. Configure bundler:
      bundle config --local github.https true
      bundle --path=.bundle/gems --binstubs=.bundle/.bin
    4. Finally, generate HTML:
      bundle exec asciidoctor-revealjs source.adoc

    Actually, the generation command is slightly different in my case:

    bundle exec asciidoctor-revealjs -D output/cours\ (1)
                                     -a revealjs_history=true\ (2)
                                     -a revealjs_theme=white\ (3)
                                     -a revealjs_slideNumber=true\ (4)
                                     -a linkcss\ (5)
                                     -a customcss=../style.css\ (6)
                                     -a revealjsdir=https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.5.0\ (7)
                                     cours/*.adoc (8)
    1 Set the HTML output directory
    2 Explicitly push the URL into the browser history at each navigation step
    3 Use the white theme (by default, it’s black)
    4 Display the slide number - it’s just mandatory for reference purposes
    5 Create an external CSS file instead of embedding the styles into each document - makes sense if there is more than one document
    6 Override some styles
    7 Reference the JavaScript framework to use - could be a local location
    8 Generate all documents belonging to a folder - instead of just one specific document

    Labs

    Generating standard HTML documents from Asciidoctor is even easier:

    bundle exec asciidoctor -D output/tp tp/*.adoc

    Automated remote generation

    As software developers, it’s our sacred duty to automate as much as possible. In this context, it means generating HTML pages from Asciidoctor sources on each push to the remote repository.

    While Github doesn’t provide any build tool, it integrates nicely with Travis CI. I’ve already written about the process of publishing HTML on Github Pages. The only differences in the build file come from:

    1. The project structure
    2. The above setup

    Automated local generation

    So far, so good. Everything works fine: every push to the remote generates the site. The only drawback is that in order to preview the site, one has either to push… or to type the whole command line every time. The bash history helps somewhat, until some other command is required.

    As a user, I want to preview the HTML automatically in order to keep me from writing the same command-line over and over.

    — Me

    This one is a bit harder, because it’s not related to Asciidoctor. A full-blown solution with LiveReload and everything would probably be overkill (read: I’m too lazy). But I’ll be happy enough with a watch over the local file system, and the triggering of a command when changes are detected. As my laptop runs on OSX, here’s a solution that "works on my machine". It is based on the launchd process and the plist format.

    This format is XML-based, but "weakly-typed" and based on the order of elements. For example, key-value pairs are defined as such:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>key1</key>
      <string>value1</string>
      <key>key2</key>
      <string>value2</string>
    </dict>
    </plist>

    A couple of things I had to find out on my own - I had no prior experience with launchd:

    1. A .plist file should be named as per the key named Label.
    2. It should be located in the ~/Library/LaunchAgents/ folder.
    3. In this specific case, the most important part is to understand how to watch the filesystem. It’s achieved with the WatchPaths key associated with the array of paths to watch.
    4. The second most important part is the command to execute, ProgramArguments. The syntax is quite convoluted: every argument on the command line (as separated by spaces) should be an element in an array.
      It seems $PATH is not initialized with my own user’s environment variables, so the full path to the executable should be used.
    5. As debugging is mandatory - at least for the first few runs - feel free to use StandardErrorPath and StandardOutPath to write standard err and out to files, respectively.

    The final .plist looks something like that:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>ch.frankel.generatejaveecours.plist</string>
      <key>WorkingDirectory</key>
      <string>/Users/frankel/projects/course/javaee</string>
      <key>Program</key>
      <string>/usr/local/bin/bundle</string>
      <key>ProgramArguments</key>
      <array>
        <string>/usr/local/bin/bundle</string>
        <string>exec</string>
        <string>asciidoctor-revealjs</string>
        <string>-a</string>
        <string>revealjs_history=true</string>
        <string>-a</string>
        <string>revealjs_theme=white</string>
        <string>-a</string>
        <string>revealjs_slideNumber=true</string>
        <string>-a</string>
        <string>linkcss</string>
        <string>-a</string>
        <string>customcss=../style.css</string>
        <string>-a</string>
        <string>revealjsdir=https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.5.0</string>
        <string>cours/*.adoc</string>
      </array>
      <key>WatchPaths</key>
      <array>
        <string>/Users/frankel/projects/course/javaee/cours</string>
      </array>
      <key>StandardErrorPath</key>
      <string>/tmp/asciidocgenerate.err</string>
      <key>StandardOutPath</key>
      <string>/tmp/asciidocgenerate.out</string>
    </dict>
    </plist>

    To finally launch the daemon, use the launchctl command:

    launchctl load ~/Library/LaunchAgents/ch.frankel.generatejaveecours.plist

    Conclusion

    While automating HTML page generation on every push is quite straightforward, previewing the HTML before the push normally requires manual action. However, with a bit of research, it’s possible to watch for changes on the file system and automate generation on the laptop as well. Be proud to be lazy!

    Categories: Development Tags: build automation, Asciidoctor
  • Strategies for optimizing Maven Docker images

    Last week, I wrote about how to design a generic Docker image for Maven-based auto-executable webapps. The resulting build file has 3 different stages: checkout from Github, build with Maven, and execution with Java. The Maven build stage takes quite a long time, mostly due to:

    1. Executing tests
    2. Downloading dependencies

    Tests can be executed earlier in the build chain and then skipped for Docker, so this post will focus on speeding up the download of dependencies. Let’s check each option in turn.

    Mount a volume

    The easiest option is to mount a volume to the local $HOME/.m2 folder when running Maven. This means dependencies are downloaded on the first run, and they are available on every run afterwards.

    The biggest constraint of going this way is that mounting volumes is only possible for run, not for build. This is to keep builds reproducible. In the previous build file, the Maven build was not the final step - and it never is in classic multi-stage builds. Hence, mounting volumes is definitely a no-go with multi-stage builds.

    Parent image with dependencies

    With build immutability in mind, an option is to "inherit" from a Maven image with a repository that is already provisioned with the required dependencies.

    Naive first approach

    Dependencies are specific to an app, so they must be known early in the build. This requires reading the build descriptor, i.e. the pom.xml. Tailoring the parent image’s dependencies adds an additional build step with no real benefit: dependencies will be downloaded nonetheless, just one step earlier.

    Eagerly download the Internet

    All dependencies - from all Maven repositories used by the organization (at least repo1) - may be provisioned in the parent image. This image has to be refreshed to account for newly released versions. The obvious choice is to schedule its build at regular intervals. Unfortunately, this puts an unreasonable load on the remote repositories. For this reason, downloading the entire repo1 is frowned upon by the Maven community.

    DO NOT wget THE ENTIRE REPOSITORY!

    Please take only the jars you need. We understand this may entail more work, but grabbing more than 1,7 TiB of binaries really kills our servers.

    Intermediate proxy

    In order to improve upon the previous solution, it’s possible to add an intermediate enterprise proxy repository, e.g. Artifactory or Nexus. This way, the flow looks like the following:

    Components diagram
    Whether the enterprise repo is Dockerized or not is irrelevant and plays no role in the whole process.
    Create the parent image

    At regular intervals, the image is created not from remote repositories but from the proxy. At first, there will be no dependencies in the proxy. But after the initial build, the image will contain the required dependencies.

    Initial build

    The enterprise repository is empty. Dependencies will be downloaded from remote repositories, through the enterprise proxy, feeding it in the process.

    Next builds

    The enterprise repository is filled. Dependencies are already present in the image: no download is required. Note that if in the meantime a new dependency (or a new version thereof) is required, it will be downloaded during the app build - but only that one, which keeps the download time short. It also gets provisioned into the enterprise repo, available for the next app build, and finally finds its way into the image.

    The only "trick" is for the App image to use the latest Maven Dependencies image as its parent, i.e. FROM dependencies:latest, so as to benefit from dependencies as they are added to the image.

    Custom solution

    Depending on the app stack, there might be dedicated solutions. For example, for the Spring Boot framework, there’s the Thin Launcher. It allows downloading dependencies at first run, instead of at build time. As an added benefit, it keeps images very small, as dependencies are not packaged in each one.

    Conclusion

    Barring an existing hack for a specific stack, putting an enterprise repository in front of the remote ones allows for the fastest downloads. However, scheduling the creation of a Docker image at regular intervals makes it possible to skip downloads entirely.

    Categories: Development Tags: container, build, optimization
  • A Dockerfile for Maven-based Github projects

    Docker logo

    Since the Docker "revolution", I’ve been interested in creating a Dockerfile for Spring applications. I’m far from an ardent practitioner, but the principle of creating a Dockerfile is dead simple. As in Java - or probably any programming language - while it’s easy to achieve something that works, it’s much harder to create something that works well.

    Multi-stage builds

    In Docker, one of the main issues is the size of the final image. It’s not uncommon to end up with images over 1 GB, even for simple Java applications. Since version 17.05 of Docker, it’s possible to have multiple builds in a single Dockerfile, and to access the output of a previous build in the current one. Those are called multi-stage builds. The final image will be based on the last build stage.

    Let’s imagine the code is hosted on Github, and that it’s based on Maven. The build stages would be as follows:

    1. Clone the code from Github
    2. Copy the folder from the previous stage; build the app with Maven
    3. Copy the JAR from the previous stage; run it with java -jar

    Here is a build file to start from:

    FROM alpine/git
    WORKDIR /app
    RUN git clone https://github.com/spring-projects/spring-petclinic.git (1)
    
    FROM maven:3.5-jdk-8-alpine
    WORKDIR /app
    COPY --from=0 /app/spring-petclinic /app (2)
    RUN mvn install (3)
    
    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=1 /app/target/spring-petclinic-1.5.1.jar /app (4)
    CMD ["java -jar spring-petclinic-1.5.1.jar"] (5)

    It maps the above build stages:

    1 Clone the Spring PetClinic git repository from Github
    2 Copy the project folder from the previous build stage
    3 Build the app
    4 Copy the JAR from the previous build stage
    5 Run the app

    Improving readability

    Notice that in the previous build file, build stages are referenced by their index (starting from 0), e.g. COPY --from=0. While not a real issue, it’s always better to have something semantically meaningful. Docker allows labeling stages and referencing those labels in later stages.

    FROM alpine/git as clone (1)
    WORKDIR /app
    RUN git clone https://github.com/spring-projects/spring-petclinic.git
    
    FROM maven:3.5-jdk-8-alpine as build (2)
    WORKDIR /app
    COPY --from=clone /app/spring-petclinic /app (3)
    RUN mvn install
    
    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=build /app/target/spring-petclinic-1.5.1.jar /app
    CMD ["java -jar spring-petclinic-1.5.1.jar"]
    1 Labels the first stage as clone
    2 Labels the second stage as build
    3 References the first stage using the label

    Choosing the right image(s)

    Multi-stage builds help tremendously with image size management, but they’re not the only criterion. The image to start from has a big impact on the final image. Beginners generally use full-blown operating system images, such as Ubuntu, inflating the size of the final image for no good reason. Yet there are lightweight OSes that are very well suited to Docker images, such as Alpine Linux.

    It’s also a great fit for security purposes, as the attack surface is limited.

    In the above build file, images are the following:

    1. alpine/git
    2. maven:3.5-jdk-8-alpine
    3. openjdk:8-jre-alpine

    They all inherit transitively or directly from alpine.

    Image sizes are as follows:

    REPOSITORY                      TAG                 IMAGE ID            CREATED             SIZE
    nfrankel/spring-petclinic       latest              293bd333d60c        10 days ago         117MB
    openjdk                         8-jre-alpine        c4f9d77cd2a1        4 weeks ago         81.4MB

    The difference in size between the JRE image and the app image is around 36 MB, which is the size of the JAR itself.

    Exposing the port

    The Spring Pet Clinic is a webapp, so it requires exposing the HTTP port it binds to. The relevant Docker directive is EXPOSE. I chose 8080 as the port number to match the embedded Tomcat container, but it could be anything. The last stage should be modified as such:

    FROM openjdk:8-jre-alpine
    WORKDIR /app
    COPY --from=build /app/target/spring-petclinic-1.5.1.jar /app
    EXPOSE 8080
    CMD ["java -jar spring-petclinic-1.5.1.jar"]

    Parameterization

    At this point, it appears the build file can be used for building any webapp with the following features:

    1. The source code is hosted on Github
    2. The build tool is Maven
    3. The resulting output is an executable JAR file

    Of course, that suits Spring Boot applications very well, but this is not a hard requirement.

    Parameters include:

    • The Github repository URL
    • The project name
    • Maven’s artifactId and version
    • The artifact name (as it might differ from the artifactId, depending on the specific Maven configuration)

    Let’s use those to design a parameterized build file. In Docker, parameters can be passed using either ENV or ARG options; ARG values are set using the --build-arg option on the command line, and can in turn feed ENV variables. The differences are the following:

    Type    Found in the image    Default value
    ENV     Yes                   Required
    ARG     No                    Optional

    FROM alpine/git as clone
    ARG url (1)
    WORKDIR /app
    RUN git clone ${url} (2)
    
    FROM maven:3.5-jdk-8-alpine as build
    ARG project (3)
    WORKDIR /app
    COPY --from=clone /app/${project} /app
    RUN mvn install
    
    FROM openjdk:8-jre-alpine
    ARG artifactid
    ARG version
    ENV artifact ${artifactid}-${version}.jar (4)
    WORKDIR /app
    COPY --from=build /app/target/${artifact} /app
    EXPOSE 8080
    CMD ["java -jar ${artifact}"] (5)
    1 url must be passed on the command line to set which Github repo to clone
    2 url is replaced by the passed value
    3 Same as <1>
    4 artifact must be an ENV, so as to be persisted in the final app image
    5 Use the artifact value at runtime

    Building

    The Spring Pet Clinic image can now be built using the following command-line:

    docker build --build-arg url=https://github.com/spring-projects/spring-petclinic.git\
      --build-arg project=spring-petclinic\
      --build-arg artifactid=spring-petclinic\
      --build-arg version=1.5.1\
      -t nfrankel/spring-petclinic - < Dockerfile
    Since the image doesn’t depend on the filesystem, no context needs to be passed and the Dockerfile can be piped from the standard input.

    To build another app, parameters can be changed accordingly e.g.:

    docker build --build-arg url=https://github.com/heroku/java-getting-started.git\
      --build-arg project=java-getting-started\
      --build-arg artifactid=java-getting-started\
      --build-arg version=1.0\
      -t nfrankel/java-getting-started - < Dockerfile

    Running

    Running an image built with the above command is quite easy:

    docker run -ti -p8080:8080 nfrankel/spring-petclinic

    Unfortunately, it fails with the following error message: starting container process caused "exec: \"java -jar ${artifact}\": executable file not found in $PATH. The trick is to use the ENTRYPOINT directive. The updated final stage looks like the following:

    FROM openjdk:8-jre-alpine
    ARG artifactid
    ARG version
    ENV artifact ${artifactid}-${version}.jar (4)
    WORKDIR /app
    COPY --from=build /app/target/${artifact} /app
    EXPOSE 8080
    ENTRYPOINT ["sh", "-c"]
    CMD ["java -jar ${artifact}"] (5)

    At this point, running the container will (finally) work.

    The second image uses a different port, so the command should be: docker run -ti -p8080:5000 nfrankel/java-getting-started.

    Food for thought

    External Git

    The current build clones a repository, and hence doesn’t require sending any context to Docker. An alternative would be to clone outside the build file, e.g. in a continuous integration chain, and start from the context. That could be useful to build the image during development, on developers' machines.

    Latest version or not

    For the Git image, I used the latest version, while for the JDK and JRE images, I used a specific major version. It’s important for the Java version to be pinned to a major version; not so much for Git. It depends on the nature of the image.

    Building from master

    There’s no configuration to change the branch after cloning. This is wrong, as most of the time, builds are executed on a dedicated tag, e.g. v1.0.0. Obviously, there should be an additional command - as well as an additional build argument - to check out a specific tag before the build.

    Skipping tests

    It takes an awfully long time for the Pet Clinic application to build, as it has a huge test harness. Executing those tests takes time, and Maven must go through the test phase to reach the install phase. Depending on the specific continuous integration chain, tests might have been executed earlier and needn’t be executed again during the image build.

    Maven repository download

    Spring Boot apps have a lot of dependencies. It takes a long time for the image to build, as all dependencies need to be downloaded every time for every app. There are a couple of solutions to handle that; it will probably be the subject of another post.

    Categories: Development Tags: docker, container, build
  • Kotlin collections

    This post should have been the 5th in the Scala vs Kotlin series.

    Unfortunately, I must admit I have a hard time reading the documentation of Scala collections e.g.:

    trait LinearSeq [+A] extends Seq[A] with collection.LinearSeq[A] with GenericTraversableTemplate[A, LinearSeq] with LinearSeqLike[A, LinearSeq[A]]

    Hence, I will only describe collections from the Kotlin side.

    Iterator

    At the root of Kotlin’s Collection API lies the Iterator interface, similar to Java’s. But the similarity stops there. java.util.ListIterator’s features are broken down into different contracts:

    1. ListIterator to move the iterator index forward and backward
    2. MutableIterator to remove content from the iterator
    3. MutableListIterator inherits from the 2 interfaces above to mimic the entire contract of java.util.ListIterator
    Iterator API

    Collection, List and Set

    The hierarchy of collections in Kotlin is very similar to Java’s: Collection, List and Set. (I won’t detail maps, but they follow the same design.) The only - but huge - difference is that it’s divided between mutable and immutable types. Mutable types have methods to change their contents (e.g. add() and set()), while immutable types don’t.

    Of course, the hierarchy is a bit more detailed compared to Java, but that’s expected from a language that benefits from its parent’s experience.

    Collection, List and Set

    Implementations

    IMHO, the important bit about Kotlin collections is not their hierarchy - though it’s important to understand the difference between mutable and immutable.

    As Java developers know, there’s no such thing as an out-of-the-box immutable collection type in Java. When an immutable collection is required, a mutable collection must be wrapped into an unmodifiable one via a call to the relevant Collections.unmodifiableXXX() method. But the unmodifiable types are not public; they are private to Collections, and the returned types are the generic interfaces (List or Set). That means they implement all methods of the standard collections, and immutability comes from the mutation-related methods throwing exceptions: at compile time, there’s no way to differentiate between a mutable and an immutable collection.
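
    As a minimal illustration of that last point, the following Java snippet compiles without a single warning and only blows up at runtime:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class UnmodifiableDemo {

        public static void main(String... args) {
            List<String> mutable = new ArrayList<>();
            mutable.add("a");

            // Same compile-time type (List) as the original collection
            List<String> unmodifiable = Collections.unmodifiableList(mutable);

            // Compiles fine, throws UnsupportedOperationException at runtime
            unmodifiable.add("b");
        }
    }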

    By contrast, Kotlin offers a clean separation between mutable and immutable types. It also provides dedicated functions to create objects of the relevant type:

    Collections-creating functions

    As opposed to Scala, Kotlin doesn’t implement its own collection types; it reuses those from Java. That means that even when the compile-time type is immutable, the runtime type is always mutable. The downside is that it’s possible to change the collection’s elements by casting it to the correct runtime type. IMHO, this is no more severe than what standard reflection allows anyway. There are several advantages, though:

    1. Faster time-to-market
    2. Java collection types benefited from years of improvement
    3. The underlying implementation can be changed in the future with full backward compatibility
    Categories: Development Tags: API
  • A use-case for Google Fusion Tables

    I’ve been traveling a fair bit during the last few years, and I wanted to check on a map which countries I had already visited. There’s one requirement: I want the whole area of the countries I’ve visited to be highlighted. There are a couple of possible solutions:

    The Elastic stack

    Data can be stored in Elasticsearch, and there’s a world map visualization available for the Kibana dashboard. This works, but:

    1. The visualization handles geolocation, but not highlighting an entire country
    2. It requires some setup, and I’m a developer - read that I’m lazy.
    Dedicated app

    To be honest, I didn’t search but I assume "there is an app for that". But by using a dedicated app, you’re not the owner of your own data anymore and that doesn’t suit me.

    Google Maps

    Google Maps allows adding layers and dedicated data. This solution would be a great fit if it were possible to easily highlight countries. And even then, there would still be a lot of JavaScript to write.

    Fusion Tables is an experimental service offered by Google.

    Creating the table

    While it’s possible to input the data directly into Fusion Tables, it’s easier to do that in a simple Google Spreadsheet. The following is sample data:

    Date     Country  Place       Type        Event
    2017/06  Denmark  Copenhagen  Conference  JDK.io
    2017/06  Sweden   Malmö       Conference  JDK.io
    2017/05  Ukraine  Kiev        Conference  JEEConf
    2017/05  Greece   Athens      Conference  Voxxed Days

    Such a spreadsheet can easily be imported into Fusion Tables.

    1. Connect Fusion Tables on Google Drive. From the Drive’s homepage, go on My Drive ▸ More ▸ Connect More Apps. Search for "fusion". Click on Connect.
      Connect Google Fusion Table on Drive
    2. Create a new Fusion Tables document. From the Drive’s homepage, click on New ▸ More ▸ Google Fusion Tables. Then select Google Spreadsheet.
      Import Google Spreadsheet into Fusion Tables
    3. Select the desired spreadsheet and click Select.
      Import new table
    4. Name the table accordingly and click Finish. It yields something akin to the following:
      Import new table

    Out-of-the-box, there’s a Map of Country tab that displays each data line on a world map. Unfortunately, it’s a simple dot at the center of the country. It doesn’t fulfil the initial requirement of highlighting the entire country area.

    Default world map view

    Changing the Location field to "Place" instead of "Country" will place dots at the correct location instead of the country center, but still no highlighting.

    Merging multiple Fusion Tables

    Fusion Tables supports geometries that can be defined using the Keyhole Markup Language format. That could fulfil the highlighting requirement, yet it would mean defining the geometry of each visited country manually - an effort I’m not prepared to make. Fortunately, it’s possible to "join" multiple tables - it’s called merging. Merging creates a new table, with both source tables associated in it. Even better, if any of the initial tables' data changes, the change is reflected in the merged table.

    Good news: there’s an existing publicly accessible table defining all country geographies. Let’s merge the existing table with it via File ▸ Merge. In the Or paste a web address here field, paste the URL of the world countries table mentioned above. Click Next. The pop-up that opens requires defining the "join" columns of the two tables.

    Default world map view

    Click Next. In the opening pop-up, tick the checkboxes of columns that will be part of the merged table. Click Merge. Wait for the merge to happen. Click View table.

    Now, on the world map tab, changing the Location field to "geometry" yields the expected result.

    Highlighted world map view

    At this point, the requirement is fulfilled. Further refinements would be to access the data via its REST API.

    Conclusion

    Fusion Tables is a no-fluff, just-stuff cloud service that makes it easy to display data in various ways. With its ability to join other tables, it’s easy to reuse existing tabular data.

    Categories: Technical Tags: database, data visualization