Devoxx 2012 – Day 2

November 13th, 2012

– after a few hours of sleep and a fresh mind to boot, I’m ready to start the day with some Scala. Inside, I’m wondering how long I’ll manage to keep up with the talk –

Scala advanced concepts by Dick Wall and Bill Venners

Implicits are the first subject tackled. First, know that implicits are one of the reasons the Scala compiler has such a hard time doing its job. To illustrate them, a Swing anonymous-class code snippet is shown:

val button = new JButton

button.addActionListener(new ActionListener {
	def actionPerformed(event: ActionEvent) {
		println("pressed")
	}
})

// The shorter function-literal version doesn't compile (yet):
button.addActionListener((_: ActionEvent) => println("pressed")) // Compile-time error: type mismatch

The implicit keyword is a way for the Scala compiler to try conversions from one type to another until the code compiles. Only the type signature matters, not the name. Since implicit conversions are so powerful (and thus dangerous), Scala 2.10 requires you to explicitly import language.implicitConversions. Moreover, other such features (e.g. macros) have to be imported in order to be used. If they are not, the compiler issues warnings, but future versions will treat them as errors.
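To make the rejected function literal above compile, a conversion can be brought into scope. A sketch along the lines of the classic example (the method name is mine, only the type signature matters):

```scala
import java.awt.event.{ActionEvent, ActionListener}
import scala.language.implicitConversions

// Converts a plain function into an ActionListener, so the shorter
// function-literal syntax becomes acceptable to the compiler.
implicit def function2ActionListener(f: ActionEvent => Unit): ActionListener =
  new ActionListener {
    def actionPerformed(event: ActionEvent): Unit = f(event)
  }
```

With this in scope, `button.addActionListener((_: ActionEvent) => println("pressed"))` compiles: the compiler silently wraps the function literal in `function2ActionListener`.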

Rules for using implicit are the following:

  • Marking rule: only defs marked as implicit are available
  • Scope rule: an implicit conversion must be in scope as a single identifier (or be associated with either the source or target type of the conversion). This is something akin to inviting the implicit conversion in. If the implicit is in the companion object, it will be used automatically. Another, more deliberate, alternative is to put it in a trait, which may be in scope or not.
  • One-at-a-time rule: only a single implicit conversion is ever inserted; the compiler never chains several conversions to satisfy itself.
  • Explicit-first rule: if the code compiles without implicit, implicits won’t be used.
  • Finally, if Scala has more than one implicit conversion available, the compiler will choose the more specific one if it can. If it cannot, it will choose none and report an error.

Even though the compiler doesn’t care about an implicit def’s name, people reading your code do, so the name should be chosen with both explicit usage of the conversion and explicit importing into scope in mind. Note that some implicits are already available through the Predef object.

In Scala 2.10, implicit can be set on classes, making implicits even safer. As a corollary, you don’t have to import the feature. Such classes must take exactly one constructor parameter and must be defined where a method could be defined (so they cannot be top-level). Note that the generated conversion method has the same name as the class.
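A sketch of such an implicit class (the enrichment and its names are hypothetical):

```scala
object Enrichments {
  // Adds a `times` method to Int. The compiler generates an implicit
  // conversion method named `Repeat`, like the class.
  implicit class Repeat(val n: Int) extends AnyVal {
    def times(f: Int => Unit): Unit = (0 until n).foreach(f)
  }
}
```

After `import Enrichments._`, writing `3.times(i => println(i))` compiles, with no conversion import required.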

Implicit parameters can be used in curried defs, on the last parameter list.

def curried(a: Int)(implicit b: Int) = a + b

implicit val evilOne: Int = 1

curried(1)(2) // Returns 3 as expected

curried(1) // Returns 2 because evilOne is implicit and used in the def

Implicit parameters are best used with a type of your own that you control: don’t use Int or String as implicit parameter types. This also lets you give the types meaningful names (such as Ordering in the Scala API).

By the way, in order to ease the use of your API, you can add the @implicitNotFound annotation to provide a specific compiler error message. Also, when an implicit isn’t applied, the compiler doesn’t offer much debugging information; in that case, a good trick is to pass the parameter explicitly and see what happens. In conclusion, use implicits with care: ask yourself whether there’s another way (such as inheritance or a trait).

Scala 2.10 also offers a way into duck typing through the Dynamic trait and its applyDynamic(name: String) method (also a feature that has to be imported before use). It looks much like reflection, but it lets compiler checks pass through and fails at runtime instead of at compile time (if the desired method is not present).
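A minimal sketch of the idea (the class and behavior are mine, for illustration):

```scala
import scala.language.dynamics

// Any method name compiles on a Recorder; calls are routed to
// applyDynamic at runtime instead of being checked at compile time.
class Recorder extends Dynamic {
  def applyDynamic(name: String)(args: Any*): String =
    s"$name(${args.mkString(", ")})"
}
```

`new Recorder().greet("world")` compiles even though no `greet` method exists: the compiler rewrites it to `applyDynamic("greet")("world")`.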

Now is time for some good Scala coding practices.

  • First, favor immutability: in particular, if you absolutely need a var, give it the smallest possible scope. Mutable collections are particularly evil: if you must use one, it’s most important not to expose it outside your method (i.e. in the signature).
  • For collections, prefer map and flatMap over foreach: this tends to make you think in functional-programming terms.
  • Don’t chain more than a few map/filter/… calls, in order to keep your code readable. Instead, declare intermediate variables with meaningful names.
  • Prefer partial functions to bust tuples apart: { case (w, i) => w.toList(i) } reads better than { t => t._1.toList(t._2) }.
  • Consider Either as an alternative to exceptions, since exceptions disturb the flow of your code.
  • The more if and for you can eliminate from your code, the fewer cases you have to take into account. Thus, favor tail recursion and immutables over loops and mutables.
  • Have a look at Vector over List: the former offers effectively constant-time indexed access.
  • Know your collections!
  • When writing public methods, always provide a return type. Not only does it protect you from exposing an unintended inferred type, it also nicely documents your code and makes it more readable.
  • Think hard about case classes: you get many methods (toString(), apply(), unapply(), etc.) for free.
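To illustrate the Either point above, a small sketch (the function and its rules are mine, not the speakers’):

```scala
// Either instead of exceptions for an operation that can fail:
// the error case is an ordinary value flowing through the code.
def parsePort(s: String): Either[String, Int] =
  try {
    val p = s.trim.toInt
    if (p >= 0 && p <= 65535) Right(p)
    else Left(s"port out of range: $p")
  } catch {
    case _: NumberFormatException => Left(s"not a number: $s")
  }
```

Callers pattern-match on `Left`/`Right` instead of wrapping the call in try/catch.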

– here my battery goes down–

Faster Websites: Crash Course on Frontend Performance by Ilya Grigorik

– Even though backend performance matters, I do tend to think, like the speaker, that in most “traditional” webapps, most of the time is spent on the frontend. I hope to increase my palette of tools available in case it happens –

The speaker is part of the Web Fast team inside of Google. The agenda is separated into three main parts: the problem, browser architecture and best practices, with context.

The problems

What’s the point of making fast websites? Google and Bing tried to measure the impact of delays by injecting artificial ones at various points during the response. All metrics, including Revenue per User, go down; the longer the delay, the sharper the decrease, in a roughly linear fashion. As a side note, once the delay is removed, metrics take time to climb back to their original values. Another company correlated page load speed with bounce rate. The mobile web is putting even more pressure on site load times…

If you want to succeed with web-performance, don’t view it as a technical metric. Instead, measure and correlate its impact with business metrics

Delay        User reaction
0-100 ms     Instant
100-300 ms   Feels sluggish
300-1000 ms  Machine is working
+1 s         Mental context switch
+10 s        I’ll come back later

Google Analytics data shows that on the desktop web, the median page load time is roughly 3 seconds, while on the mobile web it’s about 5 seconds (only taking into account those newer Google phones). The problem is that an average page is made of 84 requests and weighs about 1MB!

Let’s have a look at the life of an HTTP request: DNS lookup, socket connect, the request itself and finally content download. Each phase of the lifecycle can be optimized.

  • For example, most default DNS servers are pretty bad. That’s why Google provides free public DNS servers ( and A project named namebench can help you find the best DNS servers for you, based partly on your browser history.
  • Socket connection time can be reduced by moving the server closer to the user. Alternatively, we could use a CDN.

In the case of the site analyzed during the talk, there are only 67 requests, but they amount to a little less than 4MB. Those requests are not only HTML, but also CSS, JavaScript and images. Requesting all these resources can be represented graphically as a waterfall. There are two ways to optimize the waterfall: making it shorter and making it thinner. Do not underestimate the time spent on the frontend: on average, it accounts for 86% of the total.

One could argue that improvements in network speed will save us. Most “advanced” countries have an average 5Mb/s connection, and it keeps increasing. Unfortunately, bandwidth doesn’t matter that much; latency is much more important! Above 5Mb/s, increasing bandwidth won’t help Page Load Time (PLT) much, even though it will help for downloading movies :-) The problem is that increasing bandwidth is easy (just lay more cables), while improving latency is expensive (read: impossible), since it’s bound by the speed of light: it would require laying shorter cables.

HTTP 1.1 is an improvement over HTTP 1.0 in that connections are kept alive: in HTTP 1.0, each request required its own connection, which is no longer the case in HTTP 1.1. Now, this means browsers open up to 6 TCP connections per host to request resources. This number is effectively a limit because of TCP slow start: a feature of TCP meant to probe network performance without overloading the network. In real life, most HTTP traffic is composed of small, bursty TCP flows, because there are many requests, each of limited size (think JS files). An improvement would be to increase the TCP initial congestion window, but that requires tweaking low-level system settings.

Ongoing work is targeting HTTP 2.0! The good thing is that SPDY is already there, and its v2 is intended as a base for HTTP 2.0. HTTP 2.0’s goals are to make things better (of course), build on HTTP 1.1 and be extensible. Current workarounds to improve PLT include concatenating files (CSS, JS), image spriting and domain sharding (misleading the browser into making more than 6 connections thanks to “fake” domains). SPDY is intended to remove the need for these workarounds, so that developers won’t have to sully their hands with dirty hacks anymore. You don’t need to modify your site to work with SPDY, but you can optimize for it (stop spriting and sharding).

In essence, with SPDY, you open a single TCP connection and inside it send multiple “streams”, which are multiplexed. SPDY also offers Server Push, so that the server can push resources to the client… as long as the client doesn’t cancel the stream. Note that inlining resources inside the main HTML page can be seen as a form of Server Push. Today, SPDY is supported by a number of browsers, including Chrome (of course), Firefox 13+ and Opera 12.10+. Additionally, a number of servers are compatible (Apache through mod_spdy, nginx, Jetty, …).

The mobile web is exploding: mobile Internet users will reach the number of desktop Internet users within 2 years. In countries like India, it’s already the case! A good share of Internet users access it strictly from their mobile; the exact figure depends on the country, of course (25% in the US, 79% in Egypt). The heart of the matter is that though we don’t want to differentiate between desktop and mobile access, make no mistake: the physical layer is completely different. For mobile, latency is much worse, because exiting the radio’s idle state is time-consuming, both in 3G and 4G (though the numbers are lower in the latter case).

There are some solutions to address those problems:

  • Think SPDY, again
  • Re-use connections, pipeline
  • Download resources in bulk to avoid waking up the radio
  • Compress resources
  • Cache once you’ve got them!

Browser architecture

The good news is that the browser is trying to help you. Chrome, for example, has the following strategies: DNS prefetch (for links on the same page), TCP preconnect (likewise), pooling and re-using TCP connections, and finally caching (for resources already downloaded). More precisely, hovering the mouse over a link will likely fire DNS prefetch and TCP preconnect. Many optimization parameters are available in Chrome; go play with them (as long as you understand them).

Google Analytics provides you with the means to get the above metrics when your site uses the associated script. An important feature is segmenting, to analyze the differences between users. As an example, the audience could be grouped into Japan, Singapore and US users: this showed that a peak in PLT was caused by the temporary unavailability of a social widget, but only for the Singapore group.

Now, how does the browser render the bytes that were provided by the server? The HTML parser does a number of different things: it creates the DOM tree, as well as the CSS tree, to produce the Render tree. By the way, HTML5 specifies how to parse HTML, which was not the case with HTML 4: bytes are tokenized, and tokens are built into the tree. However, JavaScript can change the DOM through document.write(), so the browser must run scripts before going any further. As a consequence, synchronous scripts block the browser; the corollary is to write async scripts, like so:

(function() {
	// inject the script tag dynamically so it doesn't block the parser
	// (the file name is just an example)
	var s = document.createElement('script');
	s.src = 'example.js';
	document.head.appendChild(s);
})();
In HTML5, there are simpler ways to make scripts async, with script-tag attributes:

  • regular, which is just standard script
  • defer tells the browser to download the script in the background and execute it, in order, once parsing is done
  • async tells it to also download in the background, but to execute the script as soon as it’s ready
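For illustration, the three flavors side by side (file names are placeholders):

```html
<script src="a.js"></script>         <!-- regular: blocks parsing until executed -->
<script src="b.js" defer></script>   <!-- background download, in-order execution after parsing -->
<script src="c.js" async></script>   <!-- background download, executes as soon as it's ready -->
```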

– 3 hours talk are really too much, better take a pause and come back aware than sleep my way through –

In order to optimize a page, you can install PageSpeed extension in Chrome, which will basically audit the page (– note it’s also available as a Firebug plugin –). Checks include:

  • Image compression optimization: even better, the optimized content is computed by PageSpeed so you could take it directly and replace your own.
  • Images resized in the browser, even to 0x0 pixels (yes, it seems to happen, and often enough that there’s a rule to check for it)
  • Looking for uncompressed resources: web servers can easily be configured to use gzip compression without updating the site itself
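As an illustration of that last point, a minimal snippet (assuming nginx is the web server in front) that compresses text assets on the fly, without touching the site’s code:

```nginx
gzip              on;
gzip_types        text/css application/javascript application/json image/svg+xml;
gzip_min_length   1024;   # skip tiny responses, where gzip overhead isn't worth it
```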

Alternatively, you can use the online PageSpeed Insights

Performance best practices

  • Reduce DNS lookups
  • Avoid redirects
  • Make fewer HTTP requests
  • Flush the document early so as to help the document parser discover external resources early
  • Use a CDN to decrease latency: content should be placed as close as possible to your users
  • Reduce the size of your pages
    • GZIP your text assets
    • Optimize images, pick optimal format
    • Add an Expires header
    • Add ETags
  • Optimize for fast first render, do not block the parser
    • Place stylesheets at the top
    • Load scripts asynchronously
    • Place scripts at the bottom
    • Minify and concatenate
  • Eliminate memory leaks
    • Build buttery smooth pages (scroll included)
    • Leverage hardware acceleration where possible
    • Remove JS and DOM memory leaks
    • Test on mobile devices
  • Use the right tool for the job (free!)
    • Learn about Developer Tools
    • PageSpeed Insights
    • Test on mobile devices (again)

Reference: the entire slides are available there (be warned, there are more than 3)

Maven dependency puzzlers by Geoffrey de Smet

– seems like a nice change of pace, recreational and all… –

The first slide matches my experience: if you deploy a bad POM to a Maven repository, your users will feel the pain (– see the latest version of Log4j for what not to do –)

– he, you can win beers! –

What happens when you add an artifact with different groupId?
Maven doesn’t know it’s the same artifact, so it adds both. As a user, this means you have to ban outdated artifacts and exclude transitive dependencies. As a provider, use relocation. In no case should you redistribute another project’s artifacts under your own groupId!
What happens when you define a dependency’s version different from the version in the parent’s dependency management?
Maven overrides it with the project’s version. In order to use the parent’s version, don’t specify any version in the project.
What happens when you define a version and you get a transitive dependency with a different version?
Maven correctly detects a conflict and uses the version from the POM with the fewest relations (the shortest path) from the project. To fail fast instead, use the maven-enforcer-plugin and its enforce goal.
What happens when the number of relations is the same?
Maven chooses the version of the dependency declared first in the POM! In essence, this means dependency declarations are ordered; the version itself plays no role, whereas a good strategy would be to retain the highest version.
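To fail fast on such conflicts, the maven-enforcer-plugin mentioned above can be configured with its dependencyConvergence rule; a pom.xml sketch:

```xml
<!-- fail the build as soon as two versions of the same dependency conflict -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-convergence</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <dependencyConvergence/>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```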

– hopes were met, really fun! –

Junit rules by Jens Schauder

– though I’m more of a TestNG fanboy, it seems the concept of rules may tip the balance in favor of JUnit. Worth a look at least –

Rules fix a lot of problems you encounter when testing:

  • Failing tests due to missing tear down
  • Inheriting from multiple base test classes
  • Copy-paste code in tests
  • Running with different runners

When JUnit runs a test class, it finds all methods annotated with @Test and wraps each of them in a “statement”. A rule is something that takes a statement as a parameter and returns a statement. Creating a rule is as simple as writing a class that implements the TestRule interface and exposing an instance of it in the test class as a field annotated with @Rule. Thus, bundling setup and teardown code is as simple as creating a rule that does something before and after evaluating the wrapped statement. Rules can also be used to enforce a timeout, to run the test inside the Event Dispatch Thread, or even to change the environment. Additionally, rules can create tests that only run under specific conditions, for example to test Oracle-specific code in one environment and H2-specific code in another. Finally, rules may be used to provide services to test classes.
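The statement-wrapping mechanism can be sketched in a few lines of plain Scala (hypothetical names, no actual JUnit types):

```scala
// A statement is just "something that runs the test body".
trait Statement { def evaluate(): Unit }

// A rule turns a statement into another, decorated statement.
trait Rule { def apply(base: Statement): Statement }

// Bundles setup and teardown around whatever it wraps.
class SetupTeardown(log: StringBuilder) extends Rule {
  def apply(base: Statement): Statement = new Statement {
    def evaluate(): Unit = {
      log.append("setup;")
      try base.evaluate()
      finally log.append("teardown;")   // runs even if the test fails
    }
  }
}
```

Because each rule returns a statement, rules compose: wrapping the result of one rule with another is exactly what chaining does.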

Once you start using rules, you’ll probably end up having more than one rule in a test execution. As a rule of thumb, rules shouldn’t depend on each other, so ordering shouldn’t be a problem. However, real-life cases sometimes require chaining rules: JUnit provides a RuleChain for this.

Finally, if you need a rule to execute once per class (something akin to @BeforeClass), you can use @ClassRule.

– IMHO, it seems rules are like TestNG listeners, which are available since ages. At least, I tried –

Categories: Event Tags:

Devoxx 2012 – Day 1

November 13th, 2012

The 2012 Devoxx Conference in Antwerp has begun. Here are some sum-ups of the sessions I attended. I’ll try to write one daily (unless I’m too lazy or tired; Devoxx has been known to have such consequences on geeks…).

– I followed no conference in the morning since I presented a hands-on lab on Vaadin 7 –

Modular JavaEE application in the cloud by B. Ertman and P. Bakker

Two current trends rule modern-day applications: applications tend to grow bigger, and thus more complex, while agile development and refactoring have become more common. The challenges arising from these factors are manifold, and include:

  • Dependency management
  • Versioning
  • Long-term maintenance
  • Deployment

When building applications for the cloud, there are a couple of non-functional requirements, such as zero-downtime and modular deployments. Once you get more customers, chances are you’ll end up with specific extensions for some of them, aggravating the previous challenges even more.

In university, we learned to promote high cohesion and low coupling in object-oriented systems. How can we lessen coupling in a system? The first step is to code against interfaces. Next, we need a mechanism to hide implementations, so as to prevent anyone from accidentally using our implementation classes. A module is a JAR-based component that provides such a feature (note that this feature is not available in current JVMs). Given a module, how can we instantiate a hidden class? One possible answer is dependency injection; another is a service registry.

Modularity is an architecture principle. You have to design with modularity in mind: it doesn’t come with a framework. You have to enforce separation of concerns in the code. At runtime, the module is the unit of modularity.

What we need is:

  • An architectural focus on modularity
  • High-level enterprise API
  • A runtime dynamic module framework: right now, OSGi is the de facto standard since Jigsaw is not available at the time of this presentation.

–Now begins the demo–

In order to develop applications, you’ll probably have to use the BndTools plugin under Eclipse. A good OSGi practice is to separate a stable API (composed of interfaces) in a module and implementations (of concrete classes) in other ones.

OSGi metadata is included in the JAR’s META-INF/MANIFEST.MF file. In order to make the API module usable by other modules, we need to export the desired package. As for the implementation class, there’s no specific code, although we need a couple of steps:

  • Configure the modules available to the module implementation (just like traditional classpath management)
  • Create a binding between the interface and its implementation through an Activator. It can be done through Java code, through Java annotations or with XML.
  • Configure implementation and activator classes to be private
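For instance, the API bundle’s manifest might contain (package names hypothetical, the directives are standard OSGi ones):

```
Bundle-ManifestVersion: 2
Bundle-SymbolicName: com.example.api
Bundle-Version: 1.0.0
Export-Package: com.example.api;version="1.0.0"
```

Anything not listed in Export-Package, such as the implementation packages, stays private to the bundle.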

Finally, we just have to develop client code for the previous modules, and not forget to tell which OSGi container will be used at runtime.

In case of exceptions, we need a way to interact with the container. In order to achieve this, a console would be in order: the gogo console can be hot-deployed to the container since it’s an OSGi module by itself. Once done, we can start and stop specific modules.
–Back to the slides–
Are JavaEE and OSGi fated to be enemies? While it may have looked that way before, nowadays they’re not mutually exclusive. In essence, we have 3 options:

  1. Use the OSGi core architecture and consume JavaEE APIs as published services. It’s the approach taken by application servers such as GlassFish.
    • In this case, web applications become Web Application Bundles (the extension plays no part). Such modules are standard WAR files (JSF, JAX-RS, etc.) except for the Manifest. Extra entries include the Web-ContextPath directive as well as the traditional OSGi Import-Package directive.
    • The goal of EJB bundles is to use EJB and JPA code as today; unfortunately, it’s not a standard yet. Manifest directives include Export-EJB.
    • For CDI, the target is to use CDI within bundles. It’s currently prototyped in Weld (CDI RI).

    All application servers use OSGi bundles under the cover, with support for user OSGi bundles as an objective in the near future. At this time, GlassFish and WebSphere have the highest level of support.

  2. When confronted with an existing JavaEE codebase where modularity is needed only in a localized part of the application, it’s best to use a hybrid approach. In this case, separate your application into parts: for example, one ‘administrative’ part based on JavaEE and another, ‘dynamic’ part using OSGi, glued together by an injection bridge. In the JavaEE part, the BundleContext is injected through a standard @Resource annotation. – Note that your code is now coupled to the OSGi API, which defeats the low-coupling approach professed by OSGi –.
  3. Finally, you could need modularity early in the development cycle and use OSGi ‘à la carte‘. This is a viable option if you don’t use many JavaEE APIs, for example only JAX-RS.

Note that for debugging purposes, it’s a good idea to deploy the Felix Dependency Manager module.

What happens when you have more than one implementation module for the same API module? There’s no deterministic answer. However, there are ways to handle it:

  • The first way is to use arbitrary properties on the API module, and only require such properties when querying the OSGi container for modules
  • Otherwise, you can rank your services so as to get the highest rank available. It’s very interesting to have fallback services in case the highest ranking is not available.
  • Finally, you can let your client code handle it by listening to service lifecycle events for available module implementations

A good OSGi practice is to keep modules pretty small, in most cases probably only a few classes. Thus, in a typical deployment, it’s not uncommon for the number of bundles to run into the hundreds.

Now comes the Cloud part (from the title): in order to deploy, we’ll use Apache ACE, a provisioning server. It defines features (modules), distributions (collections of modules) and targets (servers that distributions are provisioned to). By the way, ACE is not only built on top of OSGi but also uses Vaadin as its GUI (sorry, too good to miss). When a server registers with ACE (and is defined as a target), the latter automatically deploys the desired distributions to it.

This method lets us auto-scale: just use a script so that a newly-provisioned server registers with the ACE server when it’s ready, and it just works!


Weld-OSGI in action by Mathieu Ancelin and Matthieu Clochard

Key annotations of CDI (Context and Dependency Injection) are @Inject to request injection, @Qualifier to specify precise injection and @Observes to listen to CDI events.

OSGi is a modular and dynamic platform for Java: it’s stable and powerful, but provides an old-style API. In OSGi, bundles (enhanced JARs) are the basis of modularity. OSGi proposes three main features:

  • bundles lifecycle management
  • dependency management
  • version management

Weld-OSGi is a CDI extension that tries to combine the best of both worlds. It’s a JBoss Weld project, developed by the SERLI company. What’s really interesting is that you don’t need to know about OSGi: it’s wrapped by CDI itself. Basically, each time a bundle is deployed in the OSGi container, a new CDI context is created. Also, CDI beans are exposed as OSGi services.

Weld-OSGi provides some annotations to achieve that, such as @OSGiService to tell CDI you need an OSGi service. Alternatively, you can require all service implementations through the @Service annotation and choose the desired one yourself. In both cases, you can refine the injected services through qualifier annotations, as in “true” CDI.

OSGi containers generate a lot of events each time a bundle changes state (e.g. when a bundle starts or stops). Those events can be observed through the traditional @Observes annotation, with the addition of some Weld-OSGi-specific annotations. A specific API is also available for events exchanged between bundles.

– demo is very impressive, the UI being changed on the fly, synced with the OSGi container –


Categories: Event Tags:

Scala cheatsheet part 1 – collections

November 11th, 2012

As a follow-up to point 4 of my previous article, here’s a first little cheatsheet on the Scala collections API. As in Java, knowing the API is a big step toward creating code that is more relevant, productive and maintainable. Collections play such an important part in Scala that knowing the collections API is a big step toward better Scala knowledge.

Type inference

In Scala, collections are typed, which means you have to be extra careful with element types. Fortunately, constructors and companion-object factories are able to infer the type by themselves (most of the time). For example:

scala>val countries = List("France", "Switzerland", "Germany", "Spain", "Italy", "Finland")
countries: List[String] = List(France, Switzerland, Germany, Spain, Italy, Finland)

Now, the countries value is of type List[String] since all elements of the collections are String.

As a corollary, if you don’t explicitly set the type and the collection is empty, you’ll get a collection typed with Nothing.

scala>val empty = List()
empty: List[Nothing] = List()

scala> 1 :: empty
res0: List[Int] = List(1)

scala> "1" :: empty
res1: List[String] = List(1)

Adding a new element to the empty list returns a new list, typed according to the added element. This is also the case when an element of another type is added to an already-typed collection.

scala> 1 :: countries
res2: List[Any] = List(1, France, Switzerland, Germany, Spain, Italy, Finland)

Default immutability

In Functional Programming, state is banished in favor of “pure” functions. Scala being both Object-Oriented and Functional in nature, it offers both mutable and immutable collections under the same names but in different packages: scala.collection.mutable and scala.collection.immutable. For example, Set and Map are found in both packages (interestingly enough, there’s a scala.collection.immutable.List but a scala.collection.mutable.MutableList). By default, the collections in scope are the immutable ones, through the scala.Predef companion object (which is imported implicitly).
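A quick illustration of the default:

```scala
// Immutable by default: operations return new collections.
val s = Set(1, 2)        // this is scala.collection.immutable.Set
val s2 = s + 3           // s itself is untouched

// Mutability must be asked for explicitly, via the package name.
import scala.collection.mutable
val m = mutable.Set(1, 2)
m += 3                   // modifies m in place
```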

The collections API

The heart of the matter lies in the APIs themselves. Beyond the expected methods also found in Java (like size() and indexOf()), Scala brings a unique functional approach to collections.

Filtering and partitioning

Scala collections can be filtered so that they return:

  • either a new collection that retains only the elements satisfying a predicate (filter())
  • or those that do not (filterNot())

Both take a function that takes an element as a parameter and returns a Boolean. The following example returns a collection that only retains countries whose names have more than 6 characters.

scala> countries.filter(_.length > 6)
res3: List[String] = List(Switzerland, Germany, Finland)

Additionally, the same function type can be used to partition the original collection into a pair of two collections, one that satisfies the predicate and one that doesn’t.

scala> countries.partition(_.length > 6)
res4: (List[String], List[String]) = (List(Switzerland, Germany, Finland),List(France, Spain, Italy))

Taking, dropping and splitting

  • Taking a collection means returning a collection that keeps only the first n elements of the original one
    scala> countries.take(2)
    res5: List[String] = List(France, Switzerland)
  • Dropping a collection consists of returning a collection that keeps all the elements of the original one but the first n.
    scala> countries.drop(2)
    res6: List[String] = List(Germany, Spain, Italy, Finland)
  • Splitting a collection consists in returning a pair of two collections, the first one being the one before the specified index, the second one after.
    scala> countries.splitAt(2)
    res7: (List[String], List[String]) = (List(France, Switzerland),List(Germany, Spain, Italy, Finland))

Scala also offers takeRight(Int) and dropRight(Int) variants that do the same but starting from the end of the collection.

Additionally, there are takeWhile(f: A => Boolean) and dropWhile(f: A => Boolean) variant methods that respectively take and drop elements from the collection sequentially (starting from the left) while the predicate is satisfied.
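For instance, with the countries list defined earlier:

```scala
scala> countries.takeWhile(_.length > 5)
res: List[String] = List(France, Switzerland, Germany)

scala> countries.dropWhile(_.length > 5)
res: List[String] = List(Spain, Italy, Finland)
```

Both stop at Spain, the first element that fails the predicate; later elements are not examined, even if they would satisfy it (Finland here).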


Grouping

Scala collection elements can be grouped into key/value pairs according to a defined key. The following example groups countries by their name’s first character.

scala> countries.groupBy(_.head)
res8: scala.collection.immutable.Map[Char,List[String]] = Map(F -> List(France, Finland), S -> List(Switzerland, Spain), G -> List(Germany), I -> List(Italy))

Set algebra

Three methods are available in the set algebra domain:

  • union (::: and union())
  • difference (diff())
  • intersection (intersect())

Those are pretty self-explanatory.
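A quick illustration (note that on Lists, union keeps duplicates, since it behaves like :::):

```scala
scala> List(1, 2, 3) union List(2, 3, 4)
res: List[Int] = List(1, 2, 3, 2, 3, 4)

scala> List(1, 2, 3) diff List(2, 3, 4)
res: List[Int] = List(1)

scala> List(1, 2, 3) intersect List(2, 3, 4)
res: List[Int] = List(2, 3)
```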


Mapping

The map(f: A => B) method returns a new collection, of the same length as the original, whose elements have had a function applied to them.

For example, the following returns a new collection where the names are reversed.

res9: List[String] = List(ecnarF, dnalreztiwS, ynamreG, niapS, ylatI, dnalniF)


Folding

Folding is the operation of, starting from an initial value, applying a function to a pair composed of an accumulator and each element in turn. Considering that, it can be used like the above map if the accumulator is a collection, like so:

scala> countries.foldLeft(List[String]())((list, x) => x.reverse :: list)
res10: List[String] = List(dnalniF, ylatI, niapS, ynamreG, dnalreztiwS, ecnarF)

Alternatively, you can provide other types of accumulator, like a string, to get different results:

scala> countries.foldLeft("")((concat, x) => concat + x.reverse)
res11: java.lang.String = ecnarFdnalreztiwSynamreGniapSylatIdnalniF


Zipping

Zipping creates a list of pairs from a list of single elements. There are two variants:

  • zipWithIndex() pairs each element with its index, like so:
    scala> countries.zipWithIndex
    res12: List[(java.lang.String, Int)] = List((France,0), (Switzerland,1), (Germany,2), (Spain,3), (Italy,4), (Finland,5))

    Note: zipping with the index is very handy when you want to use an iterator but still need a reference to the index. It saves you from declaring a variable outside the iteration and incrementing it inside.

  • Additionally, you can also zip two lists together:
    scala> countries.zip(List("Paris", "Bern", "Berlin", "Madrid", "Rome", "Helsinki"))
    res13: List[(java.lang.String, java.lang.String)] = List((France,Paris), (Switzerland,Bern), (Germany,Berlin), (Spain,Madrid), (Italy,Rome), (Finland,Helsinki))

Note that the original collections don’t need to have the same size. The returned collection’s size will be the min of the sizes of the two original collections.

The reverse operation is also available, in the form of the unzip() method, which returns two lists when provided with a list of pairs. unzip3() does the same with a list of triples.


I’ve written this article as a simple fact-oriented cheat sheet, so you can use it as such. In the coming months, I’ll try to add other such cheat sheets.


Categories: Development Tags:

My view on Coursera’s Scala courses

November 5th, 2012 2 comments

I’ve spent my last 7 weeks trying to follow Martin Odersky’s Scala courses on the Coursera platform.

In doing so, my intent was to widen my approach on Functional Programming in general, and Scala in particular. This article sums up my personal thoughts about this experience.

Time, time and time

First, the courses are quite time-consuming! The course card advises 5 to 7 hours of personal work a week, and that’s a minimum. Developers familiar with Scala will probably take less time, but others who have no prior experience with it will have to invest at least as much.

Given that I followed the course on top of my normal work, I can assure you it can be challenging. People who also followed the course confirmed this assessment.

Functional Programming

I believe the course allowed me to put the following Functional Programming principles in practice:

  • Immutable state
  • Recursion

Each assignment was checked for code-quality, specifically for mutable state. Since in Scala, mutable variables have to be defined with the var keyword, the check was easily enforced.


Algorithms

I must admit I received only the barest formal computer science education. I’ve picked up the tricks of the trade from hard-won experience, and thus have only basic algorithmics skills.

The Scala course clearly required solid skills in this area, and I’m afraid I couldn’t complete some assignments because of this shortcoming.

Area of improvement

Since I won’t code any library or framework in Scala anytime soon, I feel my next area of improvement will be focused on the whole Scala collections API.

I found I was missing a lot of knowledge of this API during my assignments, and I do think improving it will let me write better Scala applications in the future.

What’s next

At the beginning, I aimed for a 10/10 grade on all assignments, but in the end I only achieved it in about half of them. Some reasons for this have been provided above. To be frank, it bothers the student part of me… but the more mature part sees it as a way to improve. I won’t be able to get much further for now, since Devoxx takes place next week in Antwerp (Belgium). I’ll try to write about the conferences I attend, or I’ll see you there: in the latter case, don’t miss my hands-on lab on Vaadin 7!

Categories: Development Tags: ,

The problem lies not in the technology

October 14th, 2012 1 comment

Let’s face it, I’m a technologist: I will gladly debate with you for hours on the merits of Java vs Scala, SQL vs NoSQL, Google vs Apple or what have you.

But in fact, none of this matters. Statistics show that more than half of software projects fail; I’ve been doing software projects in various positions for more than 10 years, and the root cause of failure was never the technology.

In my career, I’ve identified these top 3 sources of project failure (in no particular order):

Lack of decision
As an architect, I try to provide for different solutions that more or less meet the requirements, each one with its associated cost and risks. What I expect in return is a clear decision from management in regard to those alternatives. When higher-ups eschew their decision-making responsibility, projects lack vision and are doomed to fail sooner or later, but probably sooner.
No expressed business needs
Legacy waterfall methodologies require a formalized specification document, while more modern agile methodologies insist on having the product owner available on site. In both cases, however, the goal is for IT to know what to develop. A requirement of a successful project is for the business users to be able to work on their needs: express them, define them, refine them. Sure signs of future failure are phrases like “we don’t have time to write it down” or “it’s obvious”. While the former is understandable – the business also has a business to run, so to speak – both betray a lack of commitment whose only logical conclusion is rejection by the end-users.
Unrealistic planning
Projects have to be realistic. More precisely, people should be asked how much time they expect a task to take, and the project planning has to be based on that. When it’s the other way around, you’d better prepare for failure. Meeting deadlines is not easy even when planning is realistic: sometimes, it’s as if everything on Earth conspires to slow you down. But with unrealistic deadlines from the start, there are only so many tasks that can be run in parallel; pressure will rise to attain unreachable goals, motivation will decrease, and at some point the PM will have to come up with plan B: a realistic planning.

So, instead of trying to search for better technologies, we should sometimes focus on those subjects and try to remove those obstacles from our projects, even if they don’t belong to our sphere of interest. To be frank, as an engineer, I was not prepared to tackle those challenges: they are completely unrelated to technologies but only once they are eliminated can we again focus on our core business, building better applications.

Categories: Development Tags:

PowerMock, features and use-cases

October 7th, 2012 No comments

Even if you don’t like it, your job sometimes requires you to maintain legacy applications. It happened to me (fortunately rarely), and the first thing I look for before writing so much as a single character in a legacy codebase is a unit testing harness. Most of the time, I tend to focus on the code coverage of the part of the application I need to change, and try to improve it as much as possible, even when unit tests are completely lacking.

Real trouble happens when the design isn’t state-of-the-art. Unit testing requires mocking, which only works when dependency injection (as well as initialization methods) makes classes unit-testable: this isn’t the case in some legacy applications, where there is a mess of private and static methods or initialization code in constructors, not to mention static blocks.

For example, consider the following example:

public class ExampleUtils {

    public static void doSomethingUseful() { ... }
}

public class CallingCode {

    private int i;

    public void codeThatShouldBeTestedInIsolation() {
        ExampleUtils.doSomethingUseful();
        i++;
    }
}



It’s impossible to properly unit test the codeThatShouldBeTestedInIsolation() method since it depends on an unmockable static method of another class. Of course, there are proven techniques to overcome this obstacle. One such technique is to create a “proxy” class that wraps the call to the static method and to inject this class into the calling class, like so:

public class UsefulProxy {

    public void doSomethingUsefulByProxy() {
        ExampleUtils.doSomethingUseful();
    }
}

public class CallingCodeImproved {

    private UsefulProxy proxy;

    public void codeThatShouldBeTestedInIsolation() {
        proxy.doSomethingUsefulByProxy();
    }
}



Now I can inject a mock UsefulProxy and finally test my method in isolation. There are several drawbacks to ponder, though:

  • You haven’t produced any tests yet, only a way to make tests possible: at this point, you have achieved nothing.
  • You changed code before testing it, and thus took the risk of breaking behavior! Granted, the example implies no complexity, but such is not always the case in real-life applications.
  • You made the code more testable, but at the price of an additional layer of complexity.

For all these reasons, I would recommend this approach only as a last resort. Even worse, there are designs that are completely closed to simple refactoring, such as the following example which displays a static initializer:

public class ClassWithStaticInitializer {

    static { ... }
}

As soon as the ClassWithStaticInitializer class is loaded by the class loader, the static block will be executed, for good or ill (in the light of unit testing, it probably will be the latter).
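To make the problem tangible, here’s a self-contained variant (the initialized flag is mine, added for illustration, not part of the original snippet): merely referencing the class runs the static block, before any test setup gets a chance to intervene.

```java
public class ClassWithStaticInitializer {

    static boolean initialized;

    static {
        // Imagine an expensive or environment-dependent call here,
        // e.g. opening a database connection
        initialized = true;
    }
}
```

Any test that so much as touches the class pays for the block’s execution, which is exactly why such designs resist classic mocking.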

My mock framework of choice is Mockito. Its designers made sure features such as static method mocking weren’t available, and I thank them for that: if I cannot use Mockito, it’s a design smell. Unfortunately, as we’ve seen previously, tackling legacy code may require such features. That’s where PowerMock enters the picture (and only then – using PowerMock in a standard development process is a sure sign the design is fishy).

With PowerMock, you can leave the initial code untouched and still write tests, so you can begin changing the code with confidence. Here’s the test code for the first legacy snippet, using Mockito and TestNG:

@PrepareForTest(ExampleUtils.class)
public class CallingCodeTest {

    private CallingCode callingCode;

    @BeforeMethod
    protected void setUp() {
        mockStatic(ExampleUtils.class);
        callingCode = new CallingCode();
    }

    @ObjectFactory
    public IObjectFactory getObjectFactory() {
        return new PowerMockObjectFactory();
    }

    @Test
    public void callingMethodShouldntRaiseException() {
        callingCode.codeThatShouldBeTestedInIsolation();
        assertEquals(getInternalState(callingCode, "i"), 1);
    }
}

There isn’t much to do, namely:

  • Annotate test classes (or individual test methods) with @PrepareForTest, which references classes or whole packages. This tells PowerMock to allow for byte-code manipulation of those classes, the effective instruction being done in the following step.
  • Mock the desired methods with the available palette of mockXXX() methods.
  • Provide the object factory in a method that returns IObjectFactory and annotated with @ObjectFactory.

Also note that with the help of the Whitebox class, we can access the class’s internal state (i.e. private variables). Even though this is bad practice, the alternative – taking chances with the legacy code without a test harness – is worse: remember, our goal is to lessen the chance of introducing new bugs.

The list of PowerMock features available for Mockito can be found here. Note that suppressing static blocks is not possible with TestNG right now.

You can find the sources for this article here in Maven format.

Categories: Java Tags: , , ,

Use local resources when validating XML

September 30th, 2012 1 comment

Depending on your enterprise security policy, some – if not most – of your middleware servers have no access to the Internet. It’s even worse when your development infrastructure is isolated from the Internet (as in banks or security companies). In this case, validating your XML against schemas becomes a real nightmare.

Of course, you could set the XML schema location to a location on your hard drive. But what about your co-workers then? They would have to have the schema in exactly the same filesystem hierarchy, and that wouldn’t solve your problem in the production environment…

XML catalogs

XML catalogs are the nominal solution to your quandary. In effect, you plug in a resolver that knows how to map an online location to a location on your filesystem (or, more precisely, any location to another one). This resolver is configured through the xml.catalog.files system property, which may differ for each developer and in the production environment:


Now, when validation happens, instead of looking up the online schema URL location, it looks up the local schema.
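To make this concrete, here’s what such a catalog file could look like (the online schema URL and the local path are made up for the example):

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- Map the online schema location to a local copy next to the catalog -->
    <system systemId="http://example.com/schemas/person.xsd" uri="person.xsd"/>
</catalog>
```

The xml.catalog.files system property then points to this file, e.g. -Dxml.catalog.files=/path/to/catalog.xml.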


XML catalogs can be used with JAXP, whether SAX, DOM or StAX. The following shows how to use them with SAX.

SAXParserFactory factory = SAXParserFactory.newInstance();

factory.setNamespaceAware(true);
factory.setValidating(true);

XMLReader reader = factory.newSAXParser().getXMLReader();

// Set XSD as the schema language
reader.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");

// Use XML catalogs
reader.setEntityResolver(new CatalogResolver());

InputStream stream = getClass().getClassLoader().getResourceAsStream("person.xml");

reader.parse(new InputSource(stream));

Two statements are important:

  • The setEntityResolver() call tells the reader to use XML catalogs
  • Almost as important: we absolutely have to parse with the reader, not the SAX parser, or validation will be done online!

Use cases

There are basically two use cases for XML catalogs:

  1. As seen above, the first use-case is to validate XML files against cached local schema files
  2. In addition to that, XML catalogs can also be used as an alternative to LSResourceResolver (as seen previously in XML validation with imported/included schemas)

Important note

Beware: the CatalogResolver class shipped with the JDK lives in an internal com.sun package; use the Apache XML resolver library instead. In fact, the internal class is just a repackaging of the Apache one.

If you use Maven, the dependency is the following:
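The artifact lives under the xml-resolver coordinates on Maven Central (double-check the version, which may have moved on since this was written):

```xml
<dependency>
    <groupId>xml-resolver</groupId>
    <artifactId>xml-resolver</artifactId>
    <version>1.2</version>
</dependency>
```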


Beyond catalogs

Catalogs are a standard, file-system-based way to map an online schema location to a local one.

However, this isn’t always the best strategy. For example, Spring provides its different schemas inside JARs, alongside classes, and validation is done against those schemas.

In order to do the same, instead of the CatalogResolver in the code above, set your own implementation of EntityResolver.
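As a sketch of what this could look like (the class name and the naive name-mapping strategy are mine, not from the article), here is an EntityResolver that serves schemas from the classpath:

```java
import java.io.InputStream;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Illustrative sketch: resolve any systemId to a same-named resource
// on the classpath instead of fetching it online
public class ClasspathEntityResolver implements EntityResolver {

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Keep only the trailing file name, e.g. .../schemas/person.xsd -> person.xsd
        String name = systemId.substring(systemId.lastIndexOf('/') + 1);
        InputStream stream = getClass().getClassLoader().getResourceAsStream(name);
        // Returning null lets the parser fall back to default (online) resolution
        return stream == null ? null : new InputSource(stream);
    }
}
```

A real implementation would probably use an explicit mapping rather than the file name alone, to avoid collisions between schemas.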

You can find the source for this article here (note that associated tests use a custom security policy file to prevent network access, and if you want to test directly inside your IDE, you should reuse the system properties used inside the POM).

Categories: Java Tags:

Why I enrolled in an online Scala course

September 23rd, 2012 2 comments

When I heard that the Coursera online platform offered free Scala courses, I jumped at the opportunity.

Here are some reasons why:

  • Over the years, I’ve slowly become convinced that whatever language you program in professionally, learning new languages is an asset, as it changes the way you design your code.
    For example, the excellent LambdaJ library gave me an excellent overview of how functional programming can be leveraged to ease manipulation of collections in Java.
  • Despite my pessimistic predictions toward how Scala can penetrate mainstream enterprises, I still see the language as being a strong asset in small companies with high-level developers. I do not wish to be left out of this potential market.
  • The course is offered by Martin Odersky himself, meaning I get knowledge directly from the language’s creator. I guess one cannot hope for a better teacher.
  • Being a developer means keeping in touch with the way the world is going, and the current trend is a revolution every day. Think about how nobody spoke about Big Data 3 or 4 years ago, or how, at the time, you developed your UI layer with Flex. It’s amazing how fast things are changing, and they change faster and faster. You’d better keep the rhythm…
  • The course is free!
  • There’s the course, of course, but there are also weekly assignments, each being graded. Those assignments fill the gap of most online courses, where you lose motivation with each passing day: the regular challenge ensures a longer commitment.
  • Given that some of my colleagues have also enrolled in the course, there’s some level of competition (friendly, of course). This is most elating and pushes each of us not only to find a solution, but to spend time looking for the most elegant one.
  • The final reason is that I’m a geek. I love learning new concepts and new ways to do things. In this case, the concept is functional programming and the way is Scala. Other alternatives are available: Haskell, Lisp, Erlang or F#. For me, Scala has the advantage of running on the JVM.

In conclusion, I cannot recommend enough that you do likewise: there are so many reasons to choose from!

Note: I also just stumbled upon this Git kata – an area where I also have considerable progress to make.

Categories: Development Tags: ,

Web Services: JAX-WS vs Spring

September 15th, 2012 No comments

In my endless search for the best way to develop applications, I’ve recently been interested in web services in general and contract-first in particular. Web services are coined contract-first when the WSDL is designed in the first place and classes are generated from it. They are acknowledged to be the most interoperable of web services, since the WSDL is agnostic of the underlying technology.

In the past, I’ve been using Axis2 and then CXF but now, JavaEE provides us with the power of JAX-WS (which is aimed at SOAP, JAX-RS being aimed at REST). There’s also a Spring Web Services sub-project.

My first goal was to check how easy it was to inject through Spring in both technologies but during the course of my development, I came across other comparison areas.


With Spring Web Services, publishing a new service is a 3-step process:

  1. Add the Spring MessageDispatcherServlet in the web deployment descriptor.
  2. Create the web service class and annotate it with @Endpoint. Annotate the relevant service methods with @PayloadRoot. Bind the method parameters with @RequestPayload and the return type with @ResponsePayload.
    @Endpoint
    public class FindPersonServiceEndpoint {

        private final PersonService personService;

        public FindPersonServiceEndpoint(PersonService personService) {
            this.personService = personService;
        }

        @PayloadRoot(localPart = "findPersonRequestType", namespace = "")
        @ResponsePayload
        public FindPersonResponseType findPerson(@RequestPayload FindPersonRequestType parameters) {
            return new FindPersonDelegate().findPerson(personService, parameters);
        }
    }
  3. Configure the web service class as a Spring bean in the relevant Spring beans definition file.

For JAX-WS (Metro implementation), the process is very similar:

  1. Add the WSServlet in the web deployment descriptor:
  2. Create the web service class:
    @WebService(endpointInterface = "")
    public class FindPersonSoapImpl extends SpringBeanAutowiringSupport implements FindPersonPortType {

        @Autowired
        private PersonService personService;

        public FindPersonResponseType findPerson(FindPersonRequestType parameters) {
            return new FindPersonDelegate().findPerson(personService, parameters);
        }
    }
  3. Finally, we also have to configure the service, this time in the standard Metro configuration file (sun-jaxws.xml):

Creating a web service in Spring or JAX-WS requires the same number of steps of equal complexity.

Code generation

In both cases, we need to generate Java classes from the Web Service Definition Language (WSDL) file. This is completely independent of the chosen technology.
However, whereas JAX-WS uses all generated Java classes, Spring WS only uses the ones that map to WSDL types and elements: wiring the WS call to the correct endpoint is achieved by mapping the request and response types.
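Class generation is typically wired into the Maven build. As a hedged sketch (plugin coordinates and the WSDL directory are assumptions to verify against your setup), the wsimport goal of the jaxws-maven-plugin can do it:

```xml
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>jaxws-maven-plugin</artifactId>
    <executions>
        <execution>
            <goals>
                <goal>wsimport</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <!-- Point the plugin at the hand-designed WSDL files -->
        <wsdlDirectory>src/main/resources/wsdl</wsdlDirectory>
    </configuration>
</plugin>
```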

The web service itself

Creating the web service in JAX-WS is just a matter of implementing the port type interface, which contains the service method.
In Spring WS, the service class has to be annotated with @Endpoint to be recognized as a service class.

URL configuration

In JAX-WS, the sun-jaxws.xml file syntax lets us configure very finely how each URL is mapped to a particular web service.
In Spring WS, no such configuration is available.
Since I’d rather have an overview of the different URLs, my preference goes to JAX-WS.

Spring dependency injection

Injecting a bean in a Spring WS endpoint is very easy, since the service is already a Spring bean.
On the contrary, injecting a Spring bean requires our JAX-WS implementation to inherit from SpringBeanAutowiringSupport, which prevents us from having our own class hierarchy. It also forbids explicit XML wiring.
It’s easier to get dependency injection with Spring WS (but that was expected).

Exposing the WSDL

Both JAX-WS and Spring WS are able to expose a WSDL. In order to do so, JAX-WS uses the generated classes and as such, the exposed WSDL is the same as the one we designed in the first place.
On the contrary, Spring WS provides us with 2 options:

  • either use the static WSDL, in which it only replaces the domain and port
  • or generate a dynamic WSDL from XSD files, in which case the designed one and the generated one aren’t the same

Integration testing

Spring WS has one feature that JAX-WS lacks: integration testing. A test class configured with Spring beans definition file(s) can be created to assert sent output messages against known inputs. Here’s an example of such a test, based on both Spring test and TestNG:

@ContextConfiguration(locations = { "classpath:/spring-ws-servlet.xml", "classpath:/applicationContext.xml" })
public class FindPersonServiceEndpointIT extends AbstractTestNGSpringContextTests {

    @Autowired
    private ApplicationContext applicationContext;

    private MockWebServiceClient client;

    @BeforeMethod
    protected void setUp() {
        client = MockWebServiceClient.createClient(applicationContext);
    }

    @Test
    public void findRequestPayloadShouldBeSameAsExpected() throws DatatypeConfigurationException {

        int id = 5;

        String request = "" + id + "";

        GregorianCalendar calendar = new GregorianCalendar();

        XMLGregorianCalendar xmlCalendar = DatatypeFactory.newInstance().newXMLGregorianCalendar(calendar);

        String expectedResponse = ""
            + id
            + "JohnDoe"
            + xmlCalendar.toXMLFormat()
            + "";

        Source requestPayload = new StringSource(request);
        Source expectedResponsePayload = new StringSource(expectedResponse);

        client.sendRequest(withPayload(requestPayload)).andExpect(payload(expectedResponsePayload));
    }
}


Note that I encountered problems regarding XML prefixes: not only is the namespace checked, but also the prefix name itself (which is a terrible idea).

Included and imported schema resolution

On one side, JAX-WS inherently resolves included/imported schemas without a glitch.
On the other side, with Spring WS we need to add specific beans to the Spring context in order to do the same:



JAX-WS provides an overview of all published services at the root of the JAX-WS servlet mapping:

Address Information
Service name: {}FindPersonSoapImplService
Port name: {}FindPersonSoapImplPort
Address: http://localhost:8080/wsinject/jax/findPerson
WSDL: http://localhost:8080/wsinject/jax/findPerson?wsdl
Implementation class:


Considering contract-first web services, my limited experience has driven me to choose JAX-WS over Spring WS: the testing benefits do not offset the ease of use of the standard. I must admit I was a little surprised by this result, since most Spring components are easier to use and configure than their standard counterparts, but there it is.

You can find the sources for this article here.

Note: the JAX-WS version used is 2.2, so the Maven POM is rather convoluted, as it has to override Java 6’s native JAX-WS 2.1 classes.

Categories: JavaEE Tags: , ,

Lessons learned from integrating jBPM 4 with Spring

September 9th, 2012 1 comment

When I was tasked with integrating a process engine into one of my projects, I quickly decided in favor of Activiti. Activiti is the next version of jBPM 4, is compatible with BPMN 2.0, is well documented, and has an out-of-the-box module for integrating with Spring. Unfortunately, in a cruel stroke of fate, I was overruled by my hierarchy (for some petty reason I dare not write here) and had to use jBPM. This article tries to list all the lessons I learned in this rather epic journey.

Lesson 1: jBPM documentation is not enough

Although jBPM 4 has plenty of available documentation, when you’re faced with the task of starting from scratch, it’s not enough, especially when compared to Activiti’s own documentation. For example, there’s no hint on how to bootstrap jBPM in a Spring context.

Lesson 2: it’s not because there’s no documentation that it’s not there

jBPM 4 wraps all the components needed to bootstrap jBPM through Spring, even though it stays strangely silent on how to do so. Google is no help here, since everybody seems to have inferred their own solution. For example, here’s how I found some declare the jBPM configuration bean:


At first, I was afraid to use a class from a package containing the dreaded “internal” word, but it seems to be the only way to do so…

Lesson 3: Google is not really your friend here

… Was it? In fact, I also found this configuration snippet:


And I think if you search long enough, you’ll find many other ways to achieve engine configuration. That’s the problem when documentation is lacking: we developers are smart people, so we’ll find a way to make it work, no matter what.

Lesson 4: don’t forget to configure the logging framework

This one is not jBPM-specific, but it’s important nonetheless. I lost much (really much) time because I foolishly ignored the message warning me that Log4J couldn’t find its configuration file. After having created the damned thing, I could finally read an important piece of information I received when programmatically deploying a new jPDL file:

WARNING: no objects were deployed! Check if you have configured a correct deployer in your jbpm.cfg.xml file for the type of deployment you want to do.

Lesson 5: the truth is in the code

This lesson is a direct consequence of the previous one: since I had already double-checked that my file had the jbpm.xml extension and that the jBPM deployer was correctly set in my configuration, I had to understand what the code really did. In the end, this meant getting the sources and debugging in my IDE to watch what happened under the covers. The culprit was the following line:

repositoryService.createDeployment().addResourceFromInputStream("test", new FileInputStream(file));

Since I used a stream on the file, and not the file itself, I had to supply a fictitious resource name (“test”). The latter was checked for a jpdl.xml file extension and, of course, failed miserably. This was however not readily apparent from the error message… The fix was the following:


Of course, the referenced file had the correct jbpm.xml extension and it worked like a charm.

Lesson 6: don’t reinvent the wheel

Tightly coupled with lessons 2 and 3: I found many snippets fully describing the jbpm.cfg.xml configuration file. Albeit working (well, I hope so, since I didn’t test them), they’re overkill, error-prone, and perhaps a sign of too much Google use (vs. brain use). For example, the jbpm4-spring-demo project published on Google Code provides a full-fledged configuration file. Through a lengthy trial-and-error process, I managed to achieve success with a much shorter configuration file that reuses existing configuration snippets:


Lesson 7: jBPM can access the Spring context

jBPM offers the java activity to call arbitrary Java code: it can be an EJB, but we can also wire the Spring context to jBPM so as to make the former accessible from the latter. It’s easily done by modifying the previous configuration:



Note that I haven’t found an already existing snippet for this one, feedback welcome.

Lesson 8: use the JTA transaction manager if needed

You’ll probably end up having a business database alongside your jBPM database. In most companies, the DBAs will gently ask you to put them in different schemas. In this case, don’t forget to use a JTA transaction manager along with XA datasources to get 2-phase commit. In your tests, you’d better use a single schema, where a simple transaction manager based on the data source will be enough.
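As an illustration, the two flavors could look like this in the Spring beans file (the bean ids and the dataSource reference are assumptions, not taken from the article):

```xml
<!-- Production: JTA manager spanning the business and jBPM XA datasources -->
<bean id="transactionManager"
      class="org.springframework.transaction.jta.JtaTransactionManager"/>

<!-- Tests: a single schema, so a datasource-based manager is enough -->
<bean id="transactionManager"
      class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
    <property name="dataSource" ref="dataSource"/>
</bean>
```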

For those of you who need yet another way to configure their jBPM/Spring integration, here are the sources I used, in Maven/Eclipse format. I hope my approach is more pragmatic than most. In any case, remember I’m a newbie with this product.

Categories: Java Tags: ,