• Feeding Spring Boot metrics to Elasticsearch

    Elasticsearch logo

    This week’s post aims to describe how to send JMX metrics taken from the JVM to an Elasticsearch instance.

    Business app requirements

    The business app(s) have only minor requirements.

    The easiest use-case is to start from a Spring Boot application. In order for metrics to be available, just add the Actuator dependency to it:

    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    

    Note that when inheriting from spring-boot-starter-parent, setting the version is not necessary: it’s inherited from the parent POM.

    To export the metrics to JMX, configure a dedicated @Bean in the context:

    @Bean @ExportMetricWriter
    MetricWriter metricWriter(MBeanExporter exporter) {
        return new JmxMetricWriter(exporter);
    }
    

    To-be architectural design

    There are several options to put JMX data into Elasticsearch.

    Possible options

    1. The most straightforward way is to use Logstash with the JMX plugin
    2. Alternatively, one can hack his own micro-service architecture:
      • Let the application send its metrics to JMX - there’s the Spring Boot Actuator for that, and the overhead is pretty limited
      • Expose the JMX data on an HTTP endpoint using Jolokia
      • Have a dedicated app poll the endpoint and send data to Elasticsearch

      This way, every component has its own responsibility, there’s not much performance overhead and the metric-handling part can fail while the main app is still available.

    3. An alternative would be to directly poll the JMX data from the JVM

    Unfortunate setback

    Any architect worth his salt (read lazy) should always consider the out-of-the-box option. The Logstash JMX plugin looks promising. After installing the plugin, the jmx input can be configured in the Logstash configuration file:

    input {
      jmx {
        path => "/var/logstash/jmxconf"
        polling_frequency => 5
        type => "jmx"
      }
    }
    
    output {
      stdout { codec => rubydebug }
    }
    

    The plugin is designed to read JVM connection parameters (such as host and port), as well as the metrics to handle, from JSON configuration files. In the above example, those files are watched in the /var/logstash/jmxconf folder. Moreover, they can be added, removed and updated on the fly.

    Here’s an example of such configuration file:

    {
      "host" : "localhost",
      "port" : 1616,
      "alias" : "petclinic",
      "queries" : [
      {
        "object_name" : "org.springframework.metrics:name=*,type=*,value=*",
        "object_alias" : "${type}.${name}.${value}"
      }]
    }
    

    An MBean’s ObjectName can be determined from inside JConsole:

    JConsole screenshot

    The plugin allows wildcards in the metric’s name and the usage of captured values in the alias. Also, by default, all attributes will be read (this can be restricted if necessary).

    Note: when starting the business app, it’s highly recommended to set the JMX port through the com.sun.management.jmxremote.port system property.
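
    For reference, launching the business app could look along the following lines - a sketch only: the port matches the JSON configuration above, the jar name is hypothetical, and disabling authentication/SSL is acceptable only for local experiments:

    # Standard JMX remote options; adapt them to your security requirements
    java -Dcom.sun.management.jmxremote.port=1616 \
         -Dcom.sun.management.jmxremote.authenticate=false \
         -Dcom.sun.management.jmxremote.ssl=false \
         -jar petclinic.jar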

    Unfortunately, at the time of this writing, running the above configuration fails with messages of this kind:

    [WARN][logstash.inputs.jmx] Failed retrieving metrics for attribute Value on object blah blah blah
    [WARN][logstash.inputs.jmx] undefined method `event' for #<LogStash::Inputs::Jmx:0x70836e5d>
    

    For reference purposes, the Github issue can be found here.

    The do-it yourself alternative

    Considering it’s easier to poll HTTP endpoints than JMX - and that implementations already exist - let’s go for option 2 above. Libraries will include:

    • Spring Boot for the business app
      • With the Actuator starter to provide metrics
      • Configured with the JMX exporter to send data
      • Also with the Jolokia dependency to expose JMX beans on an HTTP endpoint
    • Another Spring Boot app for the “poller”
      • Configured with a scheduled service to regularly poll the endpoint and send the data to Elasticsearch

    Draft architecture component diagram

    Additional business app requirement

    To expose the JMX data over HTTP, simply add the Jolokia dependency to the business app:

    <dependency>
      <groupId>org.jolokia</groupId>
      <artifactId>jolokia-core</artifactId>
    </dependency>
    

    From this point on, one can query for any JMX metric via the HTTP endpoint exposed by Jolokia - by default, the full URL looks like /jolokia/read/<JMX_ObjectName>.
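
    For example, with the ObjectName used later in this post - and assuming the actuator management context path is set to /manage - the full URL would be:

    http://localhost:8080/manage/jolokia/read/org.springframework.metrics:name=status,type=counter,value=beans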

    Custom-made broker

    The broker app responsibilities include:

    • reading JMX metrics from the business app through the HTTP endpoint at regular intervals
    • sending them to Elasticsearch for indexing

    My initial move was to use Spring Data, but it seems the current release is not compatible with the latest Elasticsearch version (5.x), as I got the following exception:

    java.lang.IllegalStateException:
      Received message from unsupported version: [2.0.0] minimal compatible version is: [5.0.0]
    

    Besides, Spring Data is based on entities, which implies deserializing from HTTP and serializing back again to Elasticsearch: that has a negative impact on performance for no real added value.

    The code itself is quite straightforward:

    @SpringBootApplication
    @EnableScheduling
    open class JolokiaElasticApplication {
    
      @Autowired lateinit var client: JestClient
    
      @Bean open fun template() = RestTemplate()
    
      @Scheduled(fixedRate = 5000)
      open fun transfer() {
        val result = template().getForObject(
          "http://localhost:8080/manage/jolokia/read/org.springframework.metrics:name=status,type=counter,value=beans",
          String::class.java)
        val index = Index.Builder(result).index("metrics").type("metric").id(UUID.randomUUID().toString()).build()
        client.execute(index)
      }
    }
    
    fun main(args: Array<String>) {
      SpringApplication.run(JolokiaElasticApplication::class.java, *args)
    }
    

    Of course, it’s a Spring Boot application. To poll at regular intervals, it must be annotated with @EnableScheduling and have the polling method annotated with @Scheduled, parameterized with the interval in milliseconds.

    In a Spring Boot application, calling HTTP endpoints is achieved through the RestTemplate. Once created - it’s a singleton - it can be (re)used throughout the application. The call result is deserialized into a String.

    The client to use is Jest. Jest offers a dedicated indexing API: it just requires the JSON string to be sent, as well as the index name, the type name and the id. With the Spring Boot Elastic starter on the classpath, a JestClient instance is automatically registered in the bean factory. Just autowire it in the configuration to use it.

    At this point, launching the Spring Boot application will poll the business app at regular intervals for the specified metrics and send them to Elasticsearch. It’s of course quite crude - everything is hard-coded - but it gets the job done.

    Conclusion

    Despite the failing plugin, we managed to get the JMX data from the business application to Elasticsearch by using a dedicated Spring Boot app.

  • Structuring data with Logstash

    Logstash old logo

    Given the trend around microservices, it has become mandatory to be able to follow a transaction across multiple microservices. Spring Cloud Sleuth is such a distributed tracing system, fully integrated into the Spring Boot ecosystem. By adding spring-cloud-starter-sleuth to a project’s POM, it instantly becomes Sleuth-enabled, and every standard log call automatically adds additional data, such as the spanId and traceId, to the usual output.

    2016-11-25 19:05:53.221  INFO [demo-app,b4d33156bc6a49ec,432b43172c958450,false] 25305 ---\n
    [nio-8080-exec-1] ch.frankel.blog.SleuthDemoApplication      : this is an example message
    

    (broken on 2 lines for better readability)

    Now, instead of sending the data to Zipkin, let’s say I need to store it in Elasticsearch. A product is only as good as the way it’s used, and indexing unstructured log messages is not very useful. Logstash configuration allows pre-parsing unstructured data and sending structured data instead.
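
    For reference, the corresponding output section would look along these lines - a sketch only, with a placeholder host and index name:

    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }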

    Grok

    Grokking data is the usual way to structure data with pattern matching.

    Last week, I wrote about some hints for the configuration. Unfortunately, the hard part comes in writing the matching pattern itself, and those hints don’t help. While it might be possible to write a perfect Grok pattern on the first draft, the above log is complicated enough that it’s far from a certainty, and chances are high to stumble upon such a message when starting Logstash with an unfit Grok filter:

    "tags" => [
        [0] "_grokparsefailure"
    ]
    

    However, there’s “an app for that” (sounds familiar?): an online Grok debugger. It offers three fields:

    1. The first field accepts one (or more) log line(s)
    2. The second accepts the Grok pattern
    3. The third is the result of filtering the first by the second

    The process is now to match fields one by one, from left to right. The first data field, e.g. 2016-11-25 19:05:53.221, is obviously a timestamp. Among common Grok patterns, it looks as if the TIMESTAMP_ISO8601 pattern would be the best fit.

    Enter %{TIMESTAMP_ISO8601:timestamp} into the Pattern field. The result is:

    {
      "timestamp": [
        [
          "2016-11-25 17:05:53.221"
        ]
      ]
    }
    

    The next field to handle looks like the log level. Among the patterns, there’s one LOGLEVEL. The Pattern now becomes %{TIMESTAMP_ISO8601:timestamp} *%{LOGLEVEL:level} and the result:

    {
      "timestamp": [
        [
          "2016-11-25 17:05:53.221"
        ]
      ],
      "level": [
        [
          "INFO"
        ]
      ]
    }
    

    Rinse and repeat until all fields have been structured. Given the initial log line, the final pattern should look something along those lines:

    %{TIMESTAMP_ISO8601:timestamp} *%{LOGLEVEL:level} \[%{DATA:application},%{DATA:traceId},%{DATA:spanId},%{DATA:zipkin}]\n
    %{DATA:pid} --- *\[%{DATA:thread}] %{JAVACLASS:class} *: %{GREEDYDATA:log}
    

    (broken on 2 lines for better readability)

    And the associated result:

    {
            "traceId" => "b4d33156bc6a49ec",
             "zipkin" => "false",
              "level" => "INFO",
                "log" => "this is an example message",
                "pid" => "25305",
             "thread" => "nio-8080-exec-1",
               "tags" => [],
             "spanId" => "432b43172c958450",
               "path" => "/tmp/logstash.log",
         "@timestamp" => 2016-11-26T13:41:07.599Z,
        "application" => "demo-app",
           "@version" => "1",
               "host" => "LSNM33795267A",
              "class" => "ch.frankel.blog.SleuthDemoApplication",
          "timestamp" => "2016-11-25 17:05:53.221"
    }
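
    For completeness, here’s how such a pattern would typically be wired into the pipeline - a sketch only, reusing the pattern above on a single line and assuming the default message field:

    filter {
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} *%{LOGLEVEL:level} \[%{DATA:application},%{DATA:traceId},%{DATA:spanId},%{DATA:zipkin}] %{DATA:pid} --- *\[%{DATA:thread}] %{JAVACLASS:class} *: %{GREEDYDATA:log}" }
      }
    }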
    

    Dissect

    The Grok filter gets the job done. But it seems to suffer from performance issues, especially if the pattern doesn’t match. An alternative is to use the dissect filter instead, which is based on separators.

    Unfortunately, there’s no app for that - but it’s much easier to write a separator-based filter than a regex-based one. The mapping equivalent to the above is:

    %{timestamp} %{+timestamp} %{level}[%{application},%{traceId},%{spanId},%{zipkin}]\n
    %{pid} %{}[%{thread}] %{class}:%{log}
    

    (broken on 2 lines for better readability)

    This outputs the following:

    {
            "traceId" => "b4d33156bc6a49ec",
             "zipkin" => "false",
                "log" => " this is an example message",
              "level" => "INFO ",
                "pid" => "25305",
             "thread" => "nio-8080-exec-1",
               "tags" => [],
             "spanId" => "432b43172c958450",
               "path" => "/tmp/logstash.log",
         "@timestamp" => 2016-11-26T13:36:47.165Z,
        "application" => "demo-app",
           "@version" => "1",
               "host" => "LSNM33795267A",
              "class" => "ch.frankel.blog.SleuthDemoApplication      ",
          "timestamp" => "2016-11-25 17:05:53.221"
    }
    

    Notice the slight differences: by moving from a regex-based filter to a separator-based one, some strings end up padded with spaces. There are two ways to handle that:

    • change the logging pattern in the application - which might make direct log reading harder
    • strip additional spaces with Logstash

    With the second option, the final filter configuration snippet is:

    filter {
      dissect {
        mapping => { "message" => ... }
      }
      mutate {
        strip => [ "log", "class" ]
      }
    }
    

    Conclusion

    In order to structure data, the grok filter is powerful and used by many. However, depending on the specific log format to parse, writing the filter expression might be quite a complex task. The dissect filter, based on separators, is an alternative that makes it much easier - at the price of some additional handling. It’s also an option to consider in case of performance issues.

    Categories: Development Tags: elk, logstash, elastic, parsing, data
  • Debugging hints for Logstash

    Logstash old logo

    As a Java developer, when you are first shown how to run the JVM in debug mode, attach to it and then set a breakpoint, you really feel like you’ve reached a milestone on your developer journey. Well, at least I did. Now the world is going full microservice, and knowing that trick means less and less every day.

    This week, I was playing with Logstash to see how I could send all of an application’s exceptions to an Elasticsearch instance, so I could display them on a Kibana dashboard for analytics purposes. Of course, nothing showed up in Elasticsearch at first. This post describes what helped me make it work in the end.

    The setup

    Components are the following:

    • The application. Since a lot of exceptions were necessary, I made use of the Java Bullshifier. The only adaptation was to wire in some code to log exceptions in a log file.
      public class ExceptionHandlerExecutor extends ThreadPoolExecutor {
      
          private static final Logger LOGGER = LoggerFactory.getLogger(ExceptionHandlerExecutor.class);
      
          public ExceptionHandlerExecutor(int corePoolSize, int maximumPoolSize,
                  long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue) {
              super(corePoolSize, maximumPoolSize, keepAliveTime, unit, workQueue);
          }
      
          @Override
          protected void afterExecute(Runnable r, Throwable t) {
              if (r instanceof FutureTask) {
                  FutureTask<Exception> futureTask = (FutureTask<Exception>) r;
                  if (futureTask.isDone()) {
                      try {
                          futureTask.get();
                      } catch (InterruptedException | ExecutionException e) {
                          LOGGER.error("Uncaught error", e);
                      }
                  }
              }
          }
      }
    • Logstash
    • Elasticsearch
    • Kibana

    The first bump

    Before being launched, Logstash needs to be configured - especially its input and its output.

    There’s no out-of-the-box input focused on exception stack traces. Those are multi-line messages: only lines starting with a timestamp mark the beginning of a new message. Logstash handles that with a specific multiline codec configured with a regex pattern.

    Some people, when confronted with a problem, think "I know, I'll use regular expressions."
    Now they have two problems.
    -- Jamie Zawinski

    Some are better than others at regular expressions but nobody learned it as his/her mother tongue. Hence, it’s not rare to have errors. In that case, one should use one of the many available online regex validators. They are just priceless for understanding why some pattern doesn’t match.

    The relevant Logstash configuration snippet becomes:

    input {
      file {
        path => "/tmp/logback.log"
        start_position => "beginning"
        codec => multiline {
          pattern => "^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
          negate => true
          what => "previous"
        }
      }
    }
    

    The second bump

    Now, data found its way into Elasticsearch, but messages were not in the expected format. In order to analyze where this problem came from, messages can be printed on the console instead of being indexed in Elasticsearch.

    That’s quite easy with the following snippet:

    output {
      stdout { codec => rubydebug }
    }
    

    With messages printed on the console, it’s possible to understand where the issue occurs. In that case, I was able to tweak the input configuration (and add the forgotten negate => true bit).

    Finally, I got the expected result.

    Conclusion

    With more and more tools appearing every passing day, the toolbelt of the modern developer needs to grow as well. Unfortunately, there’s no one-size-fits-all solution: in order to know a tool’s every nook and cranny, one needs to use and re-use it, be creative, and search on Google… a lot.

    Categories: Development Tags: elk, logstash, elastic, debugging
  • Another post-processor for Spring Boot

    Spring Boot logo

    Most Spring developers know about the BeanPostProcessor and the BeanFactoryPostProcessor classes. The former enables changes to new bean instances before they can be used, while the latter lets you modify bean definitions - the metadata used to create the beans. Common use-cases include:

    • Bootstrapping processing of @Configuration classes, via ConfigurationClassPostProcessor
    • Resolving ${...} placeholders, through PropertyPlaceholderConfigurer
    • Autowiring of annotated fields, setter methods and arbitrary config methods - AutowiredAnnotationBeanPostProcessor
    • And so on, and so forth…

    Out-of-the-box and custom post-processors are enough to meet most requirements regarding the Spring Framework proper.

    Then comes Spring Boot, with its convention over configuration approach. Among its key features is the ability to read configuration from different sources, such as the default application.properties, the default application.yml, another configuration file and/or System properties passed on the command line. What happens behind the scenes is that they are all merged into an Environment instance. This object can then be injected into any bean and queried for any value by passing the key.
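
    As an illustration, any bean can query the merged Environment directly. A minimal Kotlin sketch - the property key and default value are made up:

    import org.springframework.core.env.Environment
    import org.springframework.stereotype.Component

    @Component
    class GreetingService(private val environment: Environment) {

        // Returns the merged value, wherever it was defined (properties file, YAML, system property...)
        fun greetingPrefix() = environment.getProperty("greeting.prefix", "Hello")
    }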

    As a starter designer, how does one define a default configuration value? Obviously, it cannot be set via a Spring @Bean-annotated method. Let’s analyze the Spring Cloud Sleuth starter as an example:

    Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud, borrowing heavily from Dapper, Zipkin and HTrace. For most users Sleuth should be invisible, and all your interactions with external systems should be instrumented automatically. You can capture data simply in logs, or by sending it to a remote collector service.

    Regarding the configuration, the starter changes the default log format to display additional information (to be specific, span and trace IDs but that’s not relevant to the post). Let’s dig further.

    As for auto-configuration classes, the magic starts in the META-INF/spring.factories file in the Spring Cloud Sleuth starter JAR:

    # Environment Post Processor
    org.springframework.boot.env.EnvironmentPostProcessor=\
    org.springframework.cloud.sleuth.autoconfig.TraceEnvironmentPostProcessor
    

    The definition of the interface looks like the following:

    public interface EnvironmentPostProcessor {
      void postProcessEnvironment(ConfigurableEnvironment environment, SpringApplication application);
    }
    

    And the implementation like that:

    public class TraceEnvironmentPostProcessor implements EnvironmentPostProcessor {
    
      private static final String PROPERTY_SOURCE_NAME = "defaultProperties";
    
      @Override
      public void postProcessEnvironment(ConfigurableEnvironment environment, SpringApplication application) {
        Map<String, Object> map = new HashMap<String, Object>();
        map.put("logging.pattern.level",
          "%clr(%5p) %clr([${spring.application.name:},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]){yellow}");
        map.put("spring.aop.proxyTargetClass", "true");
        addOrReplace(environment.getPropertySources(), map);
      }
    
      private void addOrReplace(MutablePropertySources propertySources, Map<String, Object> map) {
        MapPropertySource target = null;
        if (propertySources.contains(PROPERTY_SOURCE_NAME)) {
          PropertySource<?> source = propertySources.get(PROPERTY_SOURCE_NAME);
          if (source instanceof MapPropertySource) {
            target = (MapPropertySource) source;
            for (String key : map.keySet()) {
              if (!target.containsProperty(key)) {
                target.getSource().put(key, map.get(key));
              }
            }
          }
        }
        if (target == null) {
          target = new MapPropertySource(PROPERTY_SOURCE_NAME, map);
        }
        if (!propertySources.contains(PROPERTY_SOURCE_NAME)) {
          propertySources.addLast(target);
        }
      }
    }
    

    As can be seen, the implementation adds both the logging.pattern.level and the spring.aop.proxyTargetClass properties (with relevant values) to the defaultProperties property source, unless they are already defined there. If that property source doesn’t exist yet, it’s created and registered with the lowest precedence.

    With @Conditional, starters can provide default beans in auto-configuration classes, while with EnvironmentPostProcessor, they can provide default property values as well. Using both in conjunction can go a long way toward offering a great convention over configuration Spring Boot experience when designing your own starter.
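
    As a closing illustration, here’s what a minimal post-processor for one’s own starter could look like in Kotlin - a sketch only, with made-up property and source names, to be registered under the org.springframework.boot.env.EnvironmentPostProcessor key in META-INF/spring.factories as shown above:

    import org.springframework.boot.SpringApplication
    import org.springframework.boot.env.EnvironmentPostProcessor
    import org.springframework.core.env.ConfigurableEnvironment
    import org.springframework.core.env.MapPropertySource

    class MyStarterEnvironmentPostProcessor : EnvironmentPostProcessor {

        override fun postProcessEnvironment(environment: ConfigurableEnvironment, application: SpringApplication) {
            // Registered last, so any user-provided value takes precedence over this default
            val defaults = mapOf<String, Any>("my.starter.enabled" to "true")
            environment.propertySources.addLast(MapPropertySource("myStarterDefaults", defaults))
        }
    }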

  • Travis CI tutorial for Java projects

    Travis CI logo

    As a consultant, I’ve done a lot of Java projects in different “enterprise” environments. In general, the Continuous Integration stack - when there’s one - consists of:

    • Github Enterprise or Atlassian Stash for source version control
    • Jenkins as the CI server, sometimes but rarely Atlassian Bamboo
    • Maven for the build tool
    • JaCoCo for code coverage
    • Artifactory as the artifacts repository - once I had Nexus

    Recently, I started to develop Kaadin, a Kotlin DSL to design Vaadin applications. I naturally hosted it on Github, and wanted the same features as for my real-life projects above. This post describes how to achieve all desired features using a whole new stack, one that might not be familiar to enterprise Java developers.

    Github was a perfect match. Then I went on to search for a Jenkins cloud provider to run my builds… to no avail. This wasn’t such a surprise, as I already searched for that last year for a course on Continuous Integration without any success. I definitely could have installed my own instance on any IaaS platform, but it always pays to be idiomatic. There are plenty of Java projects hosted on Github. They mostly use Travis CI, so I went down this road.

    Travis CI registration

    Registering is as easy as signing in using Github credentials, accepting all conditions and setting the project(s) that need to be built.

    Travis CI projects choice

    Basics

    The project configuration is read from a .travis.yml file at its root. Historically, the platform was aimed at Ruby, but nowadays different kinds of projects in different languages can be built. The most important configuration part is to define the language. Since mine is a Java project, the second most important is which JDK to use:

    language: java
    jdk: oraclejdk8
    

    From that point on, every push on the Github repository will trigger the build - including branches.

    Compilation

    If a POM exists at the root of the project, it’s automatically detected and Maven (or the Maven wrapper) will be used as the build tool in that case. It’s a good idea to use the wrapper, for it pins the Maven version; barring that, there’s no hint of the version used.

    As written above, Travis CI was originally designed for Ruby projects. In Ruby, dependencies are installed system-wide, not per project as for Maven. Hence, the lifecycle is composed of:

    1. an install dependencies phase, translated by default into ./mvnw install -DskipTests=true -Dmaven.javadoc.skip=true -B -V
    2. a build phase, explicitly defined by each project

    This two-phase mapping is not relevant for Maven projects as Maven uses a lazy approach: if dependencies are not in the local repository, they will be downloaded when required. For a more idiomatic management, the first phase can be bypassed, while the second one just needs to be set:

    install: true
    script: ./mvnw clean install
    

    Improve build speed

    In order to speed up future builds, it’s a good idea to keep the Maven local repository between different runs, as it would be the case on Jenkins or a local machine. The following configuration achieves just that:

    cache:
      directories:
      - $HOME/.m2
    

    Tests and failing the build

    As for standard Maven builds, phases are run sequentially, so calling install will first run compile and then test. A single failing test, and the build will fail. A report is sent by email when it happens.

    A build failed in Travis CI

    Most Open Source projects display their build status badge on their homepage to build trust with their users. This badge is provided by Travis CI; it just needs to be hot-linked, as seen below:

    Kaadin build status
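
    The badge image itself lives at a URL of the following form - the user and repository segments obviously depend on the project:

    https://travis-ci.org/<user>/<repository>.svg?branch=master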

    Code coverage

    Travis CI doesn’t use the JaCoCo reports generated during the Maven build. One has to use another tool. There are several available: I chose https://codecov.io/, for no other reason than that Mockito also uses it. The drill is the same: register using Github, accept the conditions and it’s a go.

    Codecov happily uses JaCoCo coverage reports, but chooses to display only line coverage - the least meaningful metric IMHO. Yet, it’s widespread enough, so let’s display it to users via a nice badge that can be hot-linked:

    Codecov

    The build configuration file needs to be updated:

    script:
      - ./mvnw clean install
      - bash <(curl -s https://codecov.io/bash)
    

    On every build, Travis CI calls the online Codecov shell script, which will somehow update the code coverage value based on the JaCoCo report generated during the previous build command.

    Deployment to Bintray

    Companies generally deploy built artifacts on an internal repository. Open Source projects should be deployed on public repositories for users to download - this means Bintray and JCenter. Fortunately, Travis CI provides a lot of different remotes to deploy to, including Bintray.

    Usually, only a dedicated branch, e.g. release, should deploy to the remote. That parameter is available in the configuration file:

    deploy:
      -
        on:
          branch: release
        provider: bintray
        skip_cleanup: true
        file: target/bin/bintray.json
        user: nfrankel
        key: $BINTRAY_API_KEY
    

    Note the above $BINTRAY_API_KEY variable. Travis CI offers environment variables to allow for some flexibility. Each of them can be defined so as not to be displayed in the logs. Beware that in that case, it’s treated as a secret and cannot be displayed again in the user interface.

    Travis CI environment variables

    For Bintray, it means getting hold of the API key on Bintray, creating a variable with a relevant name and setting its value. Of course, other variables can be created, as many as required.

    Most of the deployment configuration is delegated to a dedicated Bintray-specific JSON descriptor. Among other information, it contains both the artifactId and version values from the Maven POM. Maven filtering is configured to automatically read those values from the POM and source them into the descriptor actually used: notice the path referenced above is under the generated target folder.
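
    For illustration, the descriptor might look along these lines before filtering. The package/version/files structure follows the Travis CI Bintray provider documentation, while the repository name and file patterns are assumptions:

    {
      "package": {
        "name": "${project.artifactId}",
        "repo": "maven",
        "subject": "nfrankel"
      },
      "version": {
        "name": "${project.version}"
      },
      "files": [
        { "includePattern": "target/(.*\\.jar)", "uploadPattern": "$1" }
      ],
      "publish": true
    }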

    Conclusion

    Open Source development on Github requires a different approach than for in-house enterprise development. In particular, the standard build pipeline is based on a completely different stack. Using this stack is quite time-consuming because of different defaults and all the documentation that needs to be read, but it’s far from impossible.

  • Easy DSL design with Kotlin

    Kotlin logo

    In Android, every tutorial teaching you the basics describes how to design screens through XML files. It’s also possible to achieve the same result with Java (or any JVM-based language). Android screen design is not the only domain where XML and Java are valid options. For example, Spring configuration and Vaadin screen design allow both. In all those domains, there’s however a trade-off involved: on one hand, XML has a quite rigid structure enforced by an XML schema, while Java gives the power to do pretty much anything, at the cost of readability. “With great power comes great responsibility”. In the latter case, it’s up to the individual developer to exercise his/her judgement in order to keep the code as readable as possible.

    Domain-Specific Languages sit between those two ends of the spectrum, as they offer a structured syntax close to the problem at hand but with the full support of the underlying language constructs when necessary. As an example, the AssertJ library provides a DSL for assertions, using a fluent API. The following snippet is taken from its documentation:

    // entry point for all assertThat methods and utility methods (e.g. entry)
    import static org.assertj.core.api.Assertions.*;
    
    // collection specific assertions (there are plenty more)
    // in the examples below fellowshipOfTheRing is a List<TolkienCharacter>
    assertThat(fellowshipOfTheRing).hasSize(9)
                                   .contains(frodo, sam)
                                   .doesNotContain(sauron);
    

    DSLs can be provided in any language, even if some feel more natural than others. Scala naturally comes to mind, but Kotlin is also quite a great match. Getting back to Android, Jetbrains provides the excellent Anko library to design Android screens using a Kotlin-based DSL. The following snippet highlights Anko (taken from the doc):

    verticalLayout {
        val name = editText()
        button("Say Hello") {
            onClick { toast("Hello, ${name.text}!") }
        }
    }
    

    There are two Kotlin language constructs required for that.

    Extension functions

    I already wrote about extension functions, so I’ll be pretty quick about it. In essence, extension functions are a way to add behavior to an existing type.

    For example, String has methods to convert to lower/upper case but nothing to capitalize. With extension functions, it’s quite easy to fill the gap:

    fun String.toCapitalCase() = when {
        length < 2 -> toUpperCase()
        else -> this[0].toUpperCase() + substring(1).toLowerCase()
    }
    

    At this point, usage is straightforward: "foo".toCapitalCase().

    The important point is the usage of this in the above snippet: it refers to the actual string instance.

    Function types

    In Kotlin, functions are types. Among other consequences, this means functions can be passed as function parameters.

    fun doSomethingWithInt(value: Int, f: (Int) -> Unit) {
        value.apply(f);
    }
    

    The above function can now be passed any function that takes an Int as a parameter and returns Unit, e.g. { print(it) }. It’s called like that:

    doSomethingWithInt(5, { print(it) })
    

    Now, Kotlin offers syntactic sugar when calling functions: if the lambda is the last parameter in a function call, it can be moved outside the parentheses like this:

    doSomethingWithInt(5) { print(it) }
    

    Putting it all together

    Getting back to the Anko snippet above, let’s check how the verticalLayout { ... } method is defined:

    inline fun ViewManager.verticalLayout(theme: Int = 0, init: _LinearLayout.() -> Unit): LinearLayout {
        return ankoView(`$$Anko$Factories$CustomViews`.VERTICAL_LAYOUT_FACTORY, theme, init)
    }

    As seen in the first section, the init parameter is of an extension function type defined on _LinearLayout: this used inside it refers to an instance of that type.

    The second section explained what the content of the braces represents: the init parameter function, passed as a trailing lambda.
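
    To make this concrete, here’s a self-contained (and purely hypothetical) mini-DSL built with nothing more than these two constructs:

    class Html {
        private val children = mutableListOf<String>()
        fun p(text: String) { children += "<p>$text</p>" }
        override fun toString() = children.joinToString("", "<html>", "</html>")
    }

    // The lambda passed in braces is typed as an extension function on Html
    fun html(init: Html.() -> Unit) = Html().apply(init)

    fun main() {
        val page = html {
            p("Hello")
            p("World")
        }
        println(page) // <html><p>Hello</p><p>World</p></html>
    }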

    Conclusion

    Hopefully, this post shows how easy it is to create DSLs with Kotlin, thanks to its syntax.

    I developed one such DSL to create GUIs for the Vaadin framework: the name is Kaadin and the result is available online.

    Categories: Kotlin Tags: dsl
  • HTTP headers forwarding in microservices

    Spring Cloud logo

    Microservices are not a trend anymore. Like it or not, they are here to stay. Yet, there’s a huge gap between embracing the microservice architecture and implementing it right. As a reminder, one might first want to check the many fallacies of distributed computing. Among all requirements necessary to overcome them is the ability to follow one HTTP request along the microservices involved in a specific business scenario - for monitoring and debugging purposes.

    One possible implementation of it is a dedicated HTTP header with an immutable value passed along every microservice involved in the call chain. This week, I developed this kind of monitoring on a Spring microservice and would like to share how to achieve it.

    Headers forwarding out-of-the-box

    In the Spring ecosystem, Spring Cloud Sleuth is the library dedicated to that:

    Spring Cloud Sleuth implements a distributed tracing solution for Spring Cloud, borrowing heavily from Dapper, Zipkin and HTrace. For most users Sleuth should be invisible, and all your interactions with external systems should be instrumented automatically. You can capture data simply in logs, or by sending it to a remote collector service.

    Within Spring Boot projects, adding the Spring Cloud Sleuth library to the classpath will automatically add 2 HTTP headers to all calls:

    X-B3-Traceid
    Shared by all HTTP calls of a single transaction i.e. the wished-for transaction identifier
    X-B3-Spanid
    Identifies the work of a single microservice during a transaction

    Spring Cloud Sleuth offers some customisation capabilities, such as alternative header names, at the cost of some extra code.

    Diverging from out-of-the-box features

    Those features are quite handy when starting from a clean-slate architecture. Unfortunately, the project I’m working on has a different context:

    • The transaction ID is not created by the first microservice in the call chain - a mandatory façade proxy does
    • The transaction ID is not numeric - and Sleuth handles only numeric values
    • Another header is required. Its objective is to group all requests related to one business scenario across different call chains
    • A third header is necessary. It’s to be incremented by each new microservice in the call chain

    A solution architect’s first move would be to check among API management products, such as Apigee (recently bought by Google) and search which one offers the feature matching those requirements. Unfortunately, the current context doesn’t allow for that.

    Coding the requirements

    In the end, I coded the following using the Spring framework:

    1. Read and store headers from the initial request
    2. Write them in new microservice requests
    3. Read and store headers from the microservice response
    4. Write them in the final response to the initiator, not forgetting to increment the call counter

    UML modeling of the header flow

    Headers holder

    The first step is to create the entity responsible for holding all necessary headers. It’s unimaginatively called HeadersHolder. Blame me all you want, I couldn’t find a more descriptive name.

    private const val HOP_KEY = "hop"
    private const val REQUEST_ID_KEY = "request-id"
    private const val SESSION_ID_KEY = "session-id"
    
    data class HeadersHolder (var hop: Int?,
                              var requestId: String?,
                              var sessionId: String?)
    

    The interesting part is to decide which scope is the most relevant to put instances of this class in. Obviously, there must be several instances, so the singleton scope is not suitable. Also, since data must be stored across several requests, prototype won’t do either. In the end, the only possible way to manage the instances is through a ThreadLocal.

    Though it’s possible to manage the ThreadLocal directly, let’s leverage Spring’s features, since it allows new scopes to be added easily. There’s already an out-of-the-box thread-backed scope implementation; one just needs to register it in the context. This directly translates into the following code:

    internal const val THREAD_SCOPE = "thread"
    
    @Scope(THREAD_SCOPE)
    annotation class ThreadScope
    
    @Configuration
    open class WebConfigurer {
    
        @Bean @ThreadScope
        open fun headersHolder() = HeadersHolder()
    
        @Bean open fun customScopeConfigurer() = CustomScopeConfigurer().apply {
            addScope(THREAD_SCOPE, SimpleThreadScope())
        }
    }
    

    Headers in the server part

    Let’s implement requirements 1 & 4 above: read headers from the request and write them to the response. Also, headers need to be reset after the request-response cycle to prepare for the next.

    This also mandates for the holder class to be updated to be more OOP-friendly:

    data class HeadersHolder private constructor (private var hop: Int?,
                                                  private var requestId: String?,
                                                  private var sessionId: String?) {
        constructor() : this(null, null, null)
    
        fun readFrom(request: HttpServletRequest) {
            this.hop = request.getIntHeader(HOP_KEY)
            this.requestId = request.getHeader(REQUEST_ID_KEY)
            this.sessionId = request.getHeader(SESSION_ID_KEY)
        }
    
        fun writeTo(response: HttpServletResponse) {
            hop?.let { response.addIntHeader(HOP_KEY, hop as Int) }
            response.addHeader(REQUEST_ID_KEY, requestId)
            response.addHeader(SESSION_ID_KEY, sessionId)
        }
    
        fun clear() {
            hop = null
            requestId = null
            sessionId = null
        }
    }
    

    To keep controllers free from any header management concern, related code should be located in a filter or a similar component. In the Spring MVC ecosystem, this translates into an interceptor.

    abstract class HeadersServerInterceptor : HandlerInterceptorAdapter() {
    
        abstract val headersHolder: HeadersHolder
    
        override fun preHandle(request: HttpServletRequest,
                               response: HttpServletResponse, handler: Any): Boolean {
            headersHolder.readFrom(request)
            return true
        }
    
        override fun afterCompletion(request: HttpServletRequest, response: HttpServletResponse,
                                     handler: Any, ex: Exception?) {
            with (headersHolder) {
                writeTo(response)
                clear()
            }
        }
    }
    
    @Configuration open class WebConfigurer : WebMvcConfigurerAdapter() {
    
        override fun addInterceptors(registry: InterceptorRegistry) {
            registry.addInterceptor(object : HeadersServerInterceptor() {
                override val headersHolder: HeadersHolder
                    get() = headersHolder()
            })
        }
    }
    

    Note the invocation of the clear() method to reset the headers holder for the next request.

    The most important bit is the abstract headersHolder property. As its scope - thread - is narrower than the adapter’s, it cannot be injected directly, lest it be injected only once, during Spring’s context startup. Hence, Spring provides lookup method injection. The above code is its direct translation in Kotlin.

    Headers in client calls

    The previous code assumes the current microservice is at the end of the caller chain: it reads request headers and writes them back in the response (not forgetting to increment the ‘hop’ counter). However, monitoring is relevant only for a caller chain having more than one single link. How is it possible to pass headers to the next microservice (and get them back) - requirements 2 & 3 above?

    Spring provides a handy abstraction to handle that client part - ClientHttpRequestInterceptor, which can be registered on a REST template. Regarding the scope mismatch, the same lookup injection trick as for the server-side interceptor above is used.

    abstract class HeadersClientInterceptor : ClientHttpRequestInterceptor {
    
        abstract val headersHolder: HeadersHolder
    
        override fun intercept(request: HttpRequest, 
                               body: ByteArray, execution: ClientHttpRequestExecution): ClientHttpResponse {
            with(headersHolder) {
                writeTo(request.headers)
                return execution.execute(request, body).apply {
                    readFrom(this.headers)
                }
            }
        }
    }
    
    @Configuration
    open class WebConfigurer : WebMvcConfigurerAdapter() {
    
        @Bean open fun headersClientInterceptor() = object : HeadersClientInterceptor() {
            override val headersHolder: HeadersHolder
                get() = headersHolder()
        }
    
        @Bean open fun oAuth2RestTemplate() = OAuth2RestTemplate(clientCredentialsResourceDetails()).apply {
            interceptors = listOf(headersClientInterceptor())
        }
    }
    

    With this code, every REST call using the oAuth2RestTemplate() will have headers managed automatically by the interceptor.

    The HeadersHolder just needs a quick update:

    data class HeadersHolder private constructor (private var hop: Int?,
                                                  private var requestId: String?,
                                                  private var sessionId: String?) {
    
        fun readFrom(headers: org.springframework.http.HttpHeaders) {
            headers[HOP_KEY]?.let {
                it.getOrNull(0)?.let { this.hop = it.toInt() }
            }
            headers[REQUEST_ID_KEY]?.let { this.requestId = it.getOrNull(0) }
            headers[SESSION_ID_KEY]?.let { this.sessionId = it.getOrNull(0) }
        }
    
        fun writeTo(headers: org.springframework.http.HttpHeaders) {
            hop?.let { headers.add(HOP_KEY, hop.toString()) }
            headers.add(REQUEST_ID_KEY, requestId)
            headers.add(SESSION_ID_KEY, sessionId)
        }
    }
    

    Conclusion

    Spring Cloud offers many components that can be used out-of-the-box when developing microservices. When requirements start to diverge from what it provides, the flexibility of the underlying Spring Framework can be leveraged to code those requirements.

  • Extension functions for more consistent APIs

    Hair extensions

    Kotlin’s extension functions are a great way to add behavior to a type sitting outside one’s control - the JDK or a third-party library.

    For example, the JDK’s String class offers the toLowerCase() and toUpperCase() methods but nothing to capitalize the string. In Kotlin, this can be helped by adding the desired behavior to the String class through an extension function:

    fun String.capitalize() = when {
        length < 2 -> toUpperCase()
        else -> Character.toUpperCase(toCharArray()[0]) + substring(1).toLowerCase()
    }
    
    println("hello".capitalize())
    

    Extension function usage is not limited to external types, though. It can also improve one’s own codebase, for example to handle null values more elegantly.

    This is a way one would define a class and a function in Kotlin:

    class Foo {
        fun foo() = println("foo")
    }
    

    Then, it can be used on non-nullable and nullable types respectively, like that:

    val foo1 = Foo()
    val foo2: Foo? = Foo()
    foo1.foo()
    foo2?.foo()
    

    Notice the compiler enforces the usage of the null-safe ?. operator on the nullable type instance to prevent NullPointerException.

    Instead of defining the foo() method directly on the type, let’s define it as an extension function, but with a twist. Let’s make the ?. operator part of the function definition:

    class Foo
    
    fun Foo?.safeFoo() = println("null-safe foo")
    

    Usage is now slightly modified:

    val foo1 = Foo()
    val foo2: Foo? = Foo()
    val foo3: Foo? = null
    foo1.safeFoo()
    foo2.safeFoo()
    foo3.safeFoo()
    

    Whether the type is non-nullable or not, the calling syntax is consistent. Interestingly enough, the output is the following:

    null-safe foo
    null-safe foo
    null-safe foo
    

    Yes, it’s possible to call a method on a null instance! Now, let’s update the Foo class slightly:

    class Foo {
        val foo = 1
    }
    

    What if the safeFoo() extension function should print the foo property instead of a constant string?

    fun Foo?.safeFoo() = println(foo)
    

    The compiler complains:

    Only safe (?.) or non-null asserted (!!.) calls are allowed on a nullable receiver of type Foo?
    

    The extension function needs to be modified according to the compiler’s error:

    fun Foo?.safeFoo() = println(this?.foo)
    

    The output becomes:

    1
    1
    null
    

    Extension functions are a great way to make APIs more consistent and to handle null elegantly, instead of dropping the burden on the caller code.

    Categories: Kotlin Tags: design, API, extension function
  • Making sure inter-teams communication doesn't work

    The 3 monkeys

    This post explains the reasons behind this tweet, sent on the spur of the moment.

    Let’s face it, most developers - myself included - prefer to deal with code rather than with people. However, when communicating with other developers, we share a common context and understanding, which makes a lot of explanations unnecessary. It’s the same for all specialized jobs, such as lawyers or civil engineers.

    Now, if I have to communicate with a developer (or other technical people) of another team, I’d approach him directly and have a chat. Sometimes, managers need to be involved for they have to be in the know and/or take decisions. In that case, I’d organize a meeting with this dev and both our managers, we would chat, come to a common understanding and then explain the meat of the stuff to managers and let them decide (and perhaps influence them a little…).

    It would be crazy to spend a lot of time trying to explain the core matter to my non-technical manager, let him deal with the other manager without my presence, and finally let this other manager handle the issue with his team. Any child who has played Chinese whispers knows that multiple layers of intermediaries in passing a message result in a garbled mess. If a simple message that every layer can understand gets distorted, what would be the result when it becomes complex? And when not every player can understand it perfectly?

    I’ve dealt with this kind of “process” early in my career - a lot. I assume it was for two reasons:

    Silo'ed structures
    Because companies are generally organized in columns (Chief Financial Officer, Chief Marketing Officer, Chief Technology Officer, Chief Human Resource Officer, etc.), communication is mainly vertical - from top to bottom (hardly ever the reverse). As objectives are completely different, trusting another team is quite hard. So, this is a way to keep things under control... Or at least to pretend to do so.
    Middle manager
    Cohorts of middle managers were required in factories to handle assembly-line workers, and most traditional companies mirror that organization perfectly. In tech companies, the workforce is composed of educated developers: the need for middle managers is nowhere near as great. Hence, they must prove their "value" to the organization or be exposed as useless. Setting themselves up as necessary intermediaries has become a survival strategy.

    I was naively hoping that with all the Agile projects around, this kind of behavior would have disappeared. So imagine my surprise when it happened to me this week. After having explained to my manager the above point - about communication efficiency - I was gently (but firmly) told that it was not in his power to change that. I pushed back again, pointing out that if nobody does anything, things will never improve, but without results.

    If you find yourself in this situation and have no way of improving the state of things, I’d suggest you vote with your feet. In my case, I’m near the end of my collaboration with this specific team, so my problem will resolve itself soon enough. I’m afraid the “process” is too deeply ingrained to change in the near future - if ever…

    Categories: Development Tags: communication, good practice
  • Type-safe annotations

    Fishing net

    I don’t like dynamically-typed languages - scripting languages, in other terms. Of course, I have to live with Javascript because it’s so ubiquitous on the web, but given the chance, I’d switch to TypeScript in an instant. I was amazed by a demo of the Griffon GUI framework, and have read some pretty good stuff about the Spock testing framework, but I barely looked at them because they are based on Groovy, a scripting language (even if it offers optional enforcement of types via annotations). This is not some idiosyncrasy: static typing helps the compiler find errors. For me, the cost of creating and maintaining a safety net just to ensure type correctness, instead of letting the compiler handle it, outweighs most possible benefits.

    (I know this is a strong claim, and you’re welcome to challenge it if you’ve valid and factual arguments, but this is not the subject of this post.)

    Even within statically-typed languages, not everything is unicorns and rainbows. In Java, one such gray area is the String class. Before Java 5 and its enum feature, string constants were used as enumeration values. It was a weak design point, as two different constants could hold the same value.

    // Pretty unsafe
    public static final String CHOICE1 = "choice";
    public static final String CHOICE2 = "choice";

    String string = ...;
    if (string.equals(CHOICE1)) {
        // do something
    } else if (string.equals(CHOICE2)) {
        // <- this will never get executed
    }
    

    enum finally set things right.

    // Safe
    public enum Choice {
        CHOICE1, CHOICE2
    }
    
    Choice choice = ...;
    switch (choice) {
        case CHOICE1: // do something
            break;
        case CHOICE2: // do something else
            break;
    }
    

    Java 5 finally brought safety. Yet, even in Java 8, some design flaws persist. One of them is found in annotations. Nothing world-threatening, but enough to make me curse every time I have to write the following:

    @SuppressWarnings("deprecation")
    

    That’s the way one has to write it: @SuppressWarnings accepts plain strings, not enum values! The type-safe way would have been the following:

    public enum SuppressWarningsType {
        Deprecation // And others
    }

    @SuppressWarnings(SuppressWarningsType.Deprecation)
    

    The downside of this approach is that only the defined enumeration values are accepted, and they are set in stone. Using strings, additional values can be added at any time.

    All is not lost, though. Even with the current state of things, one can benefit from type-safe annotations by just writing the following once per project (or in a company’s shared library):

    @SuppressWarnings("deprecation")
    public @interface SuppressWarningsDeprecation {}
    

    And then simply use it like that:

    @SuppressWarningsDeprecation
    

    Note that only two warning names are mandatory: unchecked and deprecation are required by the Java Language Specification. Additional options are available, depending on the compiler vendor. For Oracle’s javac, type javac -X and look for the -Xlint: options to list them.

    For the laziest among us - including myself - I already created the library. The source code is hosted on Github and the binary on Bintray. It’s available under the friendly Apache License 2.0. Just add the following dependency:

    <dependency>
      <groupId>ch.frankel</groupId>
      <artifactId>safe-annotations</artifactId>
      <version>1.0.0</version>
    </dependency>
    

    Whether you decide to use it or bake your own, you should strive for type-safety whenever possible - and that includes annotations.

    Categories: Java Tags: typesafe