Devoxx France 2013 – Day 2
Object and Functions, conflict without a cause by Martin Odersky
The aim of Scala is to merge features of Object-Programming and Functional Programming. The first popular OOP language was Simula in 67, aimed at simulations; the second one was Smalltalk for GUIs. What is the reason OOP became popular: only because of the things you could do., not because of its individual features (like encapsulation). Before OOP, the data structure was well known with an unbounded number of operations while with OOP, the number of operations is fixed but the number of implementation is unbounded. Though it is possible for procedural languages (such as C) to apply to the field of simulation & GUI, it is to cumbersome to develop with them in real-life projects.
FP has advantages over OOP but none of them is enough to led to mainstream adoption (remember it has been around for 50 years). What can spark this adoption is the complexity to develop OOP applications multicores and cloud computing ready. Requirements for these scopes include:
In each of these, mutable state is a huge liability. Shared mutable state and concurrent threads leads to non-determinism. To avoid this, just avoid mutable state or at least reduce it.
The essence of FP is to concentrate on transformation of immutable values instead of stepwise updates of a single mutable data structure.
In Scala, the
.par member turns a collection into a parallel collection. But then, you have to became FP and forego of any side-effects. With
Promise, non-blocking is also possible but is hard to write (and read!), while Scala for-expressions syntax is an improvement. It also make parallel calls very easy.
Objects are not to put put away: in fact, they are not about imperative, they are about modularization. There are no module systems (yet) that are on par with OOP. It feels like using FP & OOP is like sitting between two chairs. Bridging the gap require letting go of some luggage first.
Objects are characterized by state, identity and behavior
It would be better to focus on behavior…
HTML5 opens new capabilities that were previously the domain of native applications (local storage, etc.). However, it is not stable and mature yet: know that it will have a direct impact on development costs.
Mandatory features for offline include application cache and local storage. Application cache is a way for browsers to store files locally to use when offline. It is based on a manifest file, and has to be referenced by desired HTML pages (in the
<html> tag). A cache management API is provided to listen to cache-related events. GWT already manages resources: we only need to provide a linker class to generate the manifest file that includes wanted resources. Integration of the cache API is achieved through usual JSNI usage [the necessary code is not user-friendly... in fact, it is quite gory].
Local storage is a feature that stores user data on the client-side. Some standards are available: WebSQL, IndexedDB, LocalStorage, etc. Unfortunately, only the latter is truly cross-browser and is based on a key-value map of strings. Unlike application cache, there’s an existing out-of-the-box GWT wrapper around local storage. Objects stored being strings and running client-side, JSON is the serialization mechanism of choice. Beware that standard mandates 5MB maximum of storage (while some browsers provide more).
- Offline authentication
- A local database to be able to run offline
- JPA features for developers
- Transparent data synch when coming online again for users
In regard to offline authentication, it is not a real problem. Since local storage is not secured, we just have to store the password hash. Get a SHA-1 Java library and GWT takes care of the rest.
State sync between client and server is the final problem. It can be separated into 3 ascending complexity levels: read-only, read-add and read-add-delete-update. Now, sync has to be done manually, only the process itself is generic. In the last case, there are no rules, only different conflict resolution strategies. What is mandatory is to have causality relations data (see Lamport timestamps).
Conclusion is that developing offline applications now is a real burden, with a large mismatch between possible HTML5 capabilities and existing tools.
Comparing JVM web frameworks by Matt Raible
Starting with JVM Web frameworks history, it all began with PHP 1.0 in 1995. In the J2EE world, Struts replaced proprietary frameworks in 2001.
Are there many too many Java web frameworks? Consider Vaadin, MyFaces, Struts2, Wicket, Play!, Stripes, tapestry, RichFaces, Spring MVC, Rails, Sling, Stripes, Grails, Flex, PrimeFaces, Lift, etc.
And now, for SOFEA architecture, there are again so many frameworks on the client-side: Backbone.ja, AngularJS, HTML5, etc. But, “traditional” frameworks are still relevant because of client-side development limitations, including development speed and performance issues.
In order to make relevant decision when faced with a choice, first set your goals and then evaluate each option in regard to these goals. Pick your best option and then re-set your goals. Maximizers trie to make the best possible choice, satisficers try to find the first suitable choice. Note that the former are generally more unhappy than the latter.
Here is a proposed typology (non-exhaustive):
|Pure web||Full stack||SOFEA|
The former matrix with fine-grained criteria is fine, but you probably have to create your own, with your own weight for each criterion. There are so many ways to tweak the comparison: you can assign more fine-grained grades, compare performances, locs, etc. Most of the time, you are influenced by your peers and by people who have used such and such frameworks. Interestingly enough, performance-oriented tests show that most of the time, bottlenecks appear in the database.
- For full stack, choose by language
- For pure web, Spring MVC, Struts 2, Vaadin, Wicket, Tapestry, PrimeFaces. Then, eliminate further by books, job trends, available skills (i.e. LinkedIn), etc.
Fun facts: a great thing going for Grails and Spring MVC is backward compatibility. On the opposite side, Play! is the first framework that has community revive a legacy version.
Conclusion: web frameworks are not the productivity bottleneck (administrative tasks are as show in the JRebel productivity report), make your own opinion, be nether a maximizer (things change too quickly) nor a picker.
Puzzlers and curiosities by Guillaume Tardif & Eric Lefevre-Ardant
[Interesting presentation on self-reference, in art and code. Impossible to resume in written form! Wait for the Parleys video...]
Exotic data structures, beyond ArrayList, HashMap & HashSet by Sam Bessalah
If all you have is a hammer, everything looks like a nail
In some cases, problem can be solved in an easier way by using the right data structure instead of the one we know. Those 4 different “exotic” data structures are worth knowing:
- Skip lists are ordered data sets. The benefit of skip lists over array lists is that every operation (insertion, removal, contains and retrieval, ranges) is in o(log N). It is achieved by adding extra levels for “express” lines. Within the JVM, it is even faster with JVM region localizing feature. The type is non-locking (thread-safe) and included in Java 6 with
ConcurrentSkipListSet. The former is ideal for cache implementations.
- Tries are ordered trees. Whereas traditional trees have complexity of o(log N) where N is the tree depth, tries have constant time complexity whatever the depth. A specialized kind of trie is the Hash Array Mapped Trie (HAMT), a functional data structure for fast computations. Scala offers
CTriestructure, a concurrent trie.
- Bloom filters are probabilistic data structures, designed to return very fast whether an element belongs to a data structure. In this case, there are no false negatives, accurately returning when an element does not belong to the structure. On the contrary, false positives are possible: it may return true when it is not the case. In order to reduce the probability of false positives, one can choose an optimal hash function (cryptographic functions are best suited), in order to avoid collision between hashed values. To go further, one can add hash functions. The trade off is memory space consumption.
Because of collisions, you cannot remove elements from Bloom filters. In order to achieve them, you can enhance Bloom filters with counting, where you also store the number of elements at a specific location.
- Count Min Sketches are advanced Bloom filters. It is designed to work best when working with highly uncorrelated, unstructured data. Heavy hitters are based on Count Min Sketches.