Archive

Archive for the ‘Development’ Category

Managing web sessions

March 8th, 2010 Nicolas Frankel No comments

In the previous article, I set up a cluster of 2 Tomcat instances in order to achieve load-balancing. It also offered failover capability. However, when failover kicked in, the user session was lost when changing nodes. In this article, I will show you how this side-effect can be avoided.

Reminder: the HTTP protocol is inherently disconnected (as opposed to FTP which is connected). In HTTP, the client sends a request to a server, it gets its response and that’s the end. The server cannot natively associate a request to a previous request made by the same client. In order to use HTTP for application purposes, we needed to group such requests. Session is the label for this grouping feature.

This is done through a token, passed to the client on the first request. The token is passed with a cookie if possible, or appended to the URL if not. Interestingly enough, this is the same for PHP (token named PHPSESSID) as for JEE (token named JSESSIONID). In both cases, we use a stateless protocol and tweak it so that it appears stateful. This pseudo-statefulness makes application-level features possible, such as authentication/authorization or a shopping cart.
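
To make the token mechanism concrete, here is a minimal sketch in plain Java. It is not how a servlet container implements JSESSIONID: the class TokenSessionServer and its methods are hypothetical names, and the "request" is reduced to a token plus one item to remember. It only shows the idea of issuing a token on the first request and grouping later requests under it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Toy server (hypothetical) that groups requests by a token, in the spirit of JSESSIONID. */
final class TokenSessionServer {

    // token -> data accumulated for the client that owns the token
    private final Map<String, List<String>> sessions = new HashMap<>();

    /**
     * Handles one request. A null or unknown token is treated as a first
     * request: a new token is issued. The caller must send the returned
     * token back with its next request.
     */
    String handle(String token, String item) {
        if (token == null || !sessions.containsKey(token)) {
            token = UUID.randomUUID().toString();
            sessions.put(token, new ArrayList<String>());
        }
        sessions.get(token).add(item);
        return token;
    }

    /** Returns what has been stored under the token, or an empty list. */
    List<String> cart(String token) {
        List<String> items = sessions.get(token);
        return items == null ? Collections.<String>emptyList() : items;
    }
}
```

A client that loses its token, or a server that loses its map, starts from an empty cart again, which is exactly the failover problem discussed in this article.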

Now, let’s take a real use case. I’m browsing through an online shop. I have already put some articles in my cart. When I finally decide to proceed to payment, I find myself with an empty basket! What happened? Unbeknownst to me, the node which hosted my session crashed and I was transparently rerouted to a working node, without my session ID, and thus, without the content of my cart.

Such a thing cannot be allowed to happen in real life, since a shop like this will probably be put out of business if it happens too often. There are basically 3 strategies to adopt in order to avoid such a loss.

Sessions are evil

From time to time, I stumble upon articles blaming sessions and labeling them as evil. While this is not a universal truth, using sessions in a bad way can have negative side-effects.

The most representative bad usage of sessions is putting everything in them. I’ve seen lazy developers put collections in session in order to manage paging. You execute a statement once, put the result in the session, and manage paging on the sessionized result. Since the collection’s size is not constrained, such use does not scale well as the number of users increases. In most cases, everything goes fine in development but you’re soon overwhelmed by unusual response times or even OutOfMemoryError in production.

If you think sessions are evil, some solutions are:

  • Store data in the database: now your data has to go from the front-end to the back-end to be saved, and then back again to be used. A classic relational database may not be the solution you’re looking for. Most high-traffic, low-response-time sites take the NoSQL route, although I don’t know if they use it for session storage purposes
  • Store data on the client side through cookies: your clients need to have cookie-enabled browsers. Furthermore, data will be sent with every request/response, so don’t overuse it

The second option has the advantage of freeing you from session ID management. However, for both solutions, your application code needs to implement the storage part.

Besides, nothing stops you from using sessions and using the extension points of your application server to use cookie storage instead of the default behaviour (mostly in memory). I wouldn’t recommend that though.

Server session replication

Another solution is to embrace session – this is a JEE feature after all – but to use session replication in order to avoid session data loss. Session replication is not a JEE feature. It is a proprietary feature, offered by many (if not all) application servers that is entirely independent from your code: your code uses session, and it is magically replicated by the server across cluster nodes.

There are two constraints common to session replication amongst all servers:

  • Use the <distributable /> tag in the web.xml
  • Only put in session instances of classes that are java.lang.Serializable
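
The second constraint can be checked long before deploying on a cluster. The sketch below assumes a hypothetical session attribute class, CartItem, and a hypothetical helper, SerializationCheck. It pushes a value through standard Java serialization and back, which is roughly what a server has to do before it can ship a session to another node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical session attribute; it implements Serializable so it can be replicated. */
final class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    final int quantity;

    CartItem(String sku, int quantity) {
        this.sku = sku;
        this.quantity = quantity;
    }
}

/** Hypothetical helper that fails fast on values that cannot be serialized. */
final class SerializationCheck {

    /** Serializes then deserializes the value; throws IllegalStateException if it cannot. */
    static Object roundTrip(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("value cannot be replicated", e);
        }
    }
}
```

Calling roundTrip() in a unit test on everything you put in session is a cheap way to catch a non-serializable attribute while still in development.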

IMHO, these rules should be enforced on all web applications, whether currently deployed on a cluster or not, since they are not very restrictive. This way, deploying an application on a cluster will tend toward a no-operation.

Strategies available for session replication are application server dependent. However, they are usually based on the following implementations:

  • In memory replication: each server stores the session data of all servers. When a session is updated, the server broadcasts the modified session (or only the delta, depending on the strategy available/used) to all nodes. This implementation uses the network and the memory heavily
  • Database persistence
  • File persistence: the file system used should be available to all cluster nodes
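
The in-memory strategy can be pictured with a toy model. The following sketch is not Tomcat's DeltaManager: ReplicatingNode is a hypothetical class in which each node keeps a full copy of every session and pushes only the changed attribute (the delta) to its peers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy cluster node (hypothetical): every node holds a copy of every session. */
final class ReplicatingNode {

    // session ID -> (attribute name -> attribute value)
    private final Map<String, Map<String, String>> sessions = new HashMap<>();
    private final List<ReplicatingNode> peers = new ArrayList<>();

    /** Links two nodes so that each one replicates to the other. */
    void join(ReplicatingNode other) {
        peers.add(other);
        other.peers.add(this);
    }

    /** Applies the change locally, then broadcasts only that change to all peers. */
    void setAttribute(String sessionId, String name, String value) {
        applyDelta(sessionId, name, value);
        for (ReplicatingNode peer : peers) {
            peer.applyDelta(sessionId, name, value);
        }
    }

    private void applyDelta(String sessionId, String name, String value) {
        Map<String, String> session = sessions.get(sessionId);
        if (session == null) {
            session = new HashMap<>();
            sessions.put(sessionId, session);
        }
        session.put(name, value);
    }

    /** Reads an attribute from the local copy; null if the session or attribute is unknown. */
    String getAttribute(String sessionId, String name) {
        Map<String, String> session = sessions.get(sessionId);
        return session == null ? null : session.get(name);
    }
}
```

Even in this toy, every update costs one message per peer and every node pays the memory for all sessions, which is why this strategy is heavy on both network and memory.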

For our simple example, I will just show you how to use in-memory session replication in Tomcat. The following are the steps one should take in order to do so. Please notice that one should first undertake what is described in the Tomcat clustering article.

Note: Tomcat 5.5 has the cluster configuration commented out in server.xml. Tomcat 6 does not. Here is the default clustering configuration:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
  managerClassName="org.apache.catalina.cluster.session.DeltaManager"
  expireSessionsOnShutdown="false"
  useDirtyFlag="true"
  notifyListenersOnReplication="true">

  <Membership className="org.apache.catalina.cluster.mcast.McastService"
    mcastAddr="228.0.0.4"
    mcastPort="45564"
    mcastFrequency="500"
    mcastDropTime="3000"/>

  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
    tcpListenAddress="auto"
    tcpListenPort="4001"
    tcpSelectorTimeout="100"
    tcpThreadCount="6"/>

  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
    replicationMode="pooled"
    ackTimeout="15000"
    waitForAck="true"/>

  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
    filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

  <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
    tempDir="/tmp/war-temp/"
    deployDir="/tmp/war-deploy/"
    watchDir="/tmp/war-listen/"
    watchEnabled="false"/>

  <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
</Cluster>

First, make sure that the mcastAddr and mcastPort of the <Membership> tag are the same on all nodes. This is validated when starting a second node by the following log in the first:

INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://10.138.97.43:4001,catalina,10.138.97.43,4001, alive=0]

This ensures that all nodes of a cluster are able to communicate with each other. From this point on, provided all other configuration is left at its default, sessions are replicated in memory on all nodes of the cluster. Thus, you don’t need sticky sessions anymore. This is not enough for failover though, since removing a node from the cluster will still cause a new request to be assigned a new session ID, thus preventing access to your previous session data.

In order to also route session IDs, you need to specify two additional tags in <Cluster>:

<Valve className="org.apache.catalina.cluster.session.JvmRouteBinderValve" enabled="true" sessionIdAttribute="takeoverSessionid"/>
<ClusterListener className="org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener" />

The valve redirects requests to another node with the previous session ID. The cluster listener receives session ID cluster change events. Now, removing a cluster node is seamless (apart from some latency for the redirected requests) for clients whose session ID was routed to this node.

Last minute note: the previous setup uses the standard session manager. I was recently made aware of a third-party manager that also handles session cookies when a node fails, thus reducing the configuration hassle. This product is Memcached Session Manager and, as its name implies, it is based on Memcached. Any feedback on the use of this product is welcome.

Third party session replication

The previous solution has the disadvantage of specific server configuration. Though it does not impact development, it needs to be done for every server type in a different manner. This could be a burden if you happen to have different server types in your enterprise.

Using third party products is a remedy to this. Terracotta is such a product: moreover, by providing a fixed number of Terracotta nodes, you avoid broadcasting your session changes to all server nodes, as was the case in the previous Tomcat replication example.

In the following, the server is the Terracotta replication server and the clients are the Tomcat instances. In order to set up Terracotta, two steps are mandatory:

  • create the configuration for the server. For it to be picked up, name it tc-config.xml and put it in the bin directory. In our case, this is it:
    <tc:tc-config xmlns:tc="http://www.terracotta.org/config"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-5.xsd">
    ...
        <modules>
          <module name="tim-tomcat-5.5"/>
        </modules>
    ...
          <web-applications>
    	    <!-- The webapp(s) to watch -->
            <web-application>servlets-examples</web-application>
    
          </web-applications>
    ...
    </tc:tc-config>

    Note: the default installed module is for Tomcat 6. In case you need the Tomcat 5.5 module, you have to launch the Terracotta Integration module Management (TIM) and use it to download the correct module. For users behind an Internet proxy, this is made possible by updating tim-get.properties with the following lines:
    org.terracotta.modules.tool.proxyUrl = …
    org.terracotta.modules.tool.proxyAuth = …

  • add to all client launch scripts the following lines:
    TC_INSTALL_DIR=
    
    TC_CONFIG_PATH=:
    
    . ${TC_INSTALL_DIR}/bin/dso-env.sh -q
    
    export JAVA_OPTS="$JAVA_OPTS $TC_JAVA_OPTS"

Now we’re back to session replication and failover but the configuration is usable across different application servers.

In order to see what is stored, you can also launch the Terracotta Developer Console.

Conclusion

There are 3 basic strategies to manage session failover over cluster nodes. The first one is not to use sessions at all: it has major consequences on your development time, since your application has to handle storage directly. The second one is to look at the server documentation to see how it is done with a specific server. This ties your session management to a single product. Last but not least, you can use a third-party product. This has the advantage of moving the configuration outside the scope of your specific server, thus letting you move with less hassle from one server to the next and still enjoy the benefits of session failover. Were I a system engineer, this is the solution I would recommend since it is the most flexible.

Categories: Development

Hibernate hard facts – Part 5

March 1st, 2010 Nicolas Frankel No comments

In the fifth article of this series, I will show you how to manage logical DELETE in Hibernate.

Most of the time, requirements are not concerned about deletion management. In those cases, common sense and disk space plead for physical deletion of database records. This is done through the DELETE keyword in SQL. In turn, Hibernate uses it when calling the Session.delete() method on entities.

Sometimes, though, for audit or legal purposes, requirements enforce logical deletion. Let’s take a products catalog as an example. Products regularly go in and out of the catalog. New orders shouldn’t be placed on outdated products. Yet, you can’t physically remove product records from the database since they could have been used on previous orders.

Some strategies are available in order to implement this. Since I’m not a DBA, I know of only two. From the database side, both add a column, which represents the deletion status of the record:

  • either a boolean column which represents either active or deleted status
  • or, for more detailed information, a timestamp column which states when the record was deleted; a NULL meaning the record is not deleted and thus active

Managing logical deletion is a two-step process: you have to manage both selection, so that only active records are returned, and deletion, so that the status marker column is updated the right way.
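
Before looking at Hibernate, the two steps can be illustrated without any database. The sketch below is a hypothetical in-memory stand-in for a PRODUCT table using the timestamp variant: ProductTable and its methods are made up for the illustration, and a null deletion date means the row is active.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a table with a DELETION_DATE column. */
final class ProductTable {

    static final class Row {
        final int id;
        Instant deletionDate; // null means "not deleted", that is, active

        Row(int id) {
            this.id = id;
        }
    }

    private final List<Row> rows = new ArrayList<>();

    void insert(int id) {
        rows.add(new Row(id));
    }

    /** Deletion step: the row is marked, not physically removed. */
    void logicalDelete(int id, Instant when) {
        for (Row row : rows) {
            if (row.id == id && row.deletionDate == null) {
                row.deletionDate = when;
            }
        }
    }

    /** Selection step: only rows without a deletion date are returned. */
    List<Integer> activeIds() {
        List<Integer> ids = new ArrayList<>();
        for (Row row : rows) {
            if (row.deletionDate == null) {
                ids.add(row.id);
            }
        }
        return ids;
    }

    /** Number of rows physically present, deleted or not. */
    int physicalRowCount() {
        return rows.size();
    }
}
```

After a logical delete, the row no longer shows up in activeIds() but still counts in physicalRowCount(), so previous orders can keep referencing it.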

Selection

A naive use of Hibernate would map this column to a class attribute. Then, selecting active records would mean a WHERE clause on the column value and deleting a record would mean setting the attribute and calling the update() method. This approach has the merit of working. Yet, it fundamentally couples your code to your implementation. If you migrate your status column from boolean to timestamp, you’ll have to update your code everywhere it is used.

The first thing you have to do to mitigate the effects of such a migration is to use a filter.

Hibernate3 provides an innovative new approach to handling data with “visibility” rules. A Hibernate filter is a global, named, parameterized filter that can be enabled or disabled for a particular Hibernate session.

Such filters can then be used throughout your code. Since the filtering criterion is thus coded in a single place, updating the database schema has little impact on your code. Back to the product example, this is done like this:

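import static javax.persistence.GenerationType.AUTO;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

import org.hibernate.annotations.Filter;
import org.hibernate.annotations.FilterDef;

// The filter is declared once, then attached to the entity
@Entity
@FilterDef(name = "activeProducts")
@Filter(name = "activeProducts", condition = "DELETION_DATE IS NULL")
public class Product {

  @Id
  @Column(nullable = false)
  @GeneratedValue(strategy = AUTO)
  private Integer id;

...
}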

Note: in the attached source, I also map the DELETION_DATE on an attribute. This is not needed in most cases. In mine, however, it permits me to auto-create the schema with Hibernate.

Now, the following code will filter out logically deleted records:

session.enableFilter("activeProducts");

In order to remove the filter, either use Session.disableFilter() or use a new Session object (remember that factory.getCurrentSession() will probably use the same, so factory.openSession() is in order).

Deletion

The previous step let us factor out the “select active-only records” feature. Logically deleting a product is still coupled to your implementation. Hibernate lets us decouple further: you can overload any CRUD operation on entities! Thus, deletion can be overloaded to use an update of the right column. Just add the following snippet to your entity:

@SQLDelete(sql = "UPDATE PRODUCT SET DELETION_DATE=CURRENT_DATE WHERE ID=?")

Now, calling session.delete() on a Product entity will update the record and produce the following output in the log:

UPDATE PRODUCT SET DELETION_DATE=CURRENT_DATE WHERE ID=?

With CRUD overloading, you can even suppress the ability to select inactive records altogether. I wouldn’t recommend this approach however, since inactive records would then be out of reach even when you legitimately need them, for audit purposes for instance. IMHO, it’s better to stick to filters since they can be enabled/disabled when needed.

Conclusion

Hibernate lets you loosen the coupling between your code and the database so that you can migrate from physical deletion to logical deletion with very localized changes. In order to do this, Hibernate offers two complementary features: filters and CRUD overloading. These features should be part of any architect’s bag of tricks since they can be lifesavers, as in the cases above.

You can find the sources of this article here in Maven/Eclipse format.

Categories: Java

Clustering Tomcat

February 24th, 2010 Nicolas Frankel No comments

In this article, I will show you how to use Apache/Tomcat in order to set up a load balancer. I know this has been done a zillion times before, but I will use this setup in my next article (teaser, teaser) so at least I will have it documented somewhere.

Apache Tomcat has been the reference JSP/Servlet container since its inception. Despite a lack of full JEE support, it certainly has its appeal. The reasons behind using a full-featured commercial JEE application server are not always technical ones. With lightweight frameworks such as Spring being mainstream, it is not unusual to consider using Tomcat in a production environment. Some companies did it even before that.

When thinking production, one usually thinks reliability and scalability. Luckily, both can be attained with Apache/Tomcat through the setup of a load-balancing cluster. Reliability is addressed in that if a Tomcat instance fails, subsequent requests can be directed to a working one. Requests are dispatched to each Tomcat instance according to a predefined strategy. If need be, more Tomcat instances can be added at will in order to scale.

In the following example, I will set up the simplest clustering topology possible: an Apache front-end that balances 2 Tomcat instances on the same physical machine.

Set up Apache

The first step is to configure Apache to forward your requests to Tomcat. There are basically 2 options in order to do this (I ruled out the pre-shipped load-balancer webapp):

  • use mod_jk, the classic Apache/Tomcat module
  • use mod_proxy, another Apache module

I’m not a system engineer, so I can’t decide on facts whether to use one or the other: I will use mod_jk since I’ve already used it before.

  • Download the mod_jk that is adapted to your Apache and Tomcat versions
  • Put it in the ‘modules’ folder of your Apache installation
  • Update your httpd.conf configuration to load it with Apache
    LoadModule jk_module modules/mod_jk-1.2.28-httpd-2.2.3.so
  • Configure Apache. Put these directives in the httpd.conf:
    JkWorkersFile conf/worker.properties
    JkShmFile     logs/mod_jk.shm
    JkLogLevel    info
    JkLogFile     logs/mod_jk.log
    JkMount       /servlets-examples/* lb

This configuration example is minimal but needs some comments:

Parameter      Description
JkWorkersFile  Where to look for the module configuration file (see below)
JkShmFile      Where to put the shared memory file
JkLogLevel     Module log level (debug/error/info)
JkLogFile      Log file location. It is the default but declaring it avoids the Apache warning
JkMount        Which URL pattern will be forwarded to which worker
Since mod_jk can be used in non-clustered setups, there can be any number of JkMount directives, each forwarding to its own worker (see below). In our case, it means any request beginning with /servlets-examples/ (the trailing slash is needed) will be forwarded to the ‘lb’ worker.

Configure the workers

Workers are destination routes as viewed by Apache. They are referenced by a unique label in the httpd.conf and parameterized under the same label in the worker.properties file.
My worker.properties is the following:

worker.list=lb

worker.worker1.port=8010
worker.worker1.host=localhost
worker.worker1.type=ajp13

worker.worker2.port=8011
worker.worker2.host=localhost
worker.worker2.type=ajp13

worker.lb.type=lb
worker.lb.balance_workers=worker1,worker2

I define 3 workers in this file: lb, worker1 and worker2. The ‘lb’ worker is the load-balancing worker: it is virtual and it balances the latter two. Both are configured to point to a real Tomcat instance.

Now, with the Apache configuration in mind, we see that requests beginning with /servlets-examples/ will be managed by the load balancer worker, which will in turn forward each of them to one of the two balanced workers.

Note: one can also put a weight (the lbfactor property in mod_jk) on workers hosted by more powerful machines so that these are more heavily loaded than less powerful ones. In our case, both are hosted on the same machine so it is of no importance whatsoever.
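
To give an idea of what weighting means, here is a toy balancer in plain Java. It is only an illustration under my own assumptions, not mod_jk's actual algorithm: WeightedBalancer is a hypothetical class that sends each request to the worker with the lowest ratio of requests served to weight.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy balancer (hypothetical): picks the worker with the lowest served/weight ratio. */
final class WeightedBalancer {

    private final Map<String, Integer> weights = new LinkedHashMap<>();
    private final Map<String, Integer> served = new LinkedHashMap<>();

    void addWorker(String name, int weight) {
        if (weight <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        weights.put(name, weight);
        served.put(name, 0);
    }

    /** Returns the name of the worker chosen for the next request. */
    String next() {
        String best = null;
        double bestLoad = Double.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : weights.entrySet()) {
            double load = served.get(entry.getKey()).doubleValue() / entry.getValue();
            if (load < bestLoad) { // on a tie, the first registered worker wins
                bestLoad = load;
                best = entry.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no worker registered");
        }
        served.put(best, served.get(best) + 1);
        return best;
    }
}
```

With two workers of equal weight, requests simply alternate. With weights 2 and 1, the heavier worker ends up serving two requests out of three.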

Configure the Tomcat instances

The last step consists of configuring the Tomcat instances. In order to do so, I shamelessly copied entire Tomcat installations (I’m on Windows). While editing the server.xml of the Tomcat instances, three points are worth mentioning:

  • The Engine tag has a jvmRoute attribute. Its value should be the same as the worker’s name used in both httpd.conf and worker.properties. Otherwise, sessions will be recreated for each request
  • Look out for duplicated port numbers if all Tomcat instances are on the same machine. For example, use an incremental rule to configure every stream on a different port
  • Be sure that the tcpListenPort attribute of the Receiver is unique across all Tomcat instances

Use it!

With the previous setup, one can now start both Tomcat instances and Apache, then browse to the servlets-examples webapp, and more precisely to the Session page. Look there for Tomcat 5.5 and there for Tomcat 6. The servlets-examples page displays the associated session ID:

Session ID: 324DAD12976045D197435033A67C025D.worker2
Created: Tue Feb 23 23:15:13 CET 2010
Last accessed: Tue Feb 23 23:31:47 CET 2010

Notice that on my Tomcat instance, the worker’s name is part of the session ID.
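
Since the worker name is simply appended after a dot, extracting it is straightforward. The helper below is a hypothetical utility of mine, not part of Tomcat or mod_jk: it returns whatever follows the last dot of the session ID, or null when there is no such suffix.

```java
/** Hypothetical helper that reads the route suffix of a session ID. */
final class SessionRoute {

    /** Returns the part after the last dot, or null when the ID carries no route. */
    static String routeOf(String sessionId) {
        int dot = sessionId.lastIndexOf('.');
        if (dot < 0 || dot == sessionId.length() - 1) {
            return null;
        }
        return sessionId.substring(dot + 1);
    }
}
```

Applied to the session ID displayed above, routeOf() returns worker2, the label the load balancer needs to keep sending my requests to the same Tomcat instance.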

If everything went fine, two interesting things should take place: first, when refreshing the page, the session ID should not change because of the sticky session (enabled by default). Moreover, if I shut down the Tomcat instance associated with the worker (the second in my case) and try to refresh the page, I can still access my application, but under a new session.

Thus, I lose all the information I stored in my session! In my next article, I will study how one can try to remedy this.

Categories: JEE

Free online SVN repositories

February 22nd, 2010 Nicolas Frankel 3 comments

This week, I searched for free online SVN repositories for closed-source projects. Of course, there are plenty of sites offering services for OpenSource projects: Google Code, SourceForge, etc. It may seem amazing, but not everybody wants to expose their code to the world.

From what I’ve found, there aren’t so many free SVN repositories around. Since it’s free and provided as a courtesy (not to mention with a view toward upgrading to paid services later), there are some limits. Criteria are mainly:

  • The amount of disk space available
  • The number of projects you can store
  • The number of user accounts you can create in order to access the SVN

This is my results table and it is definitely not exhaustive:

Name           Disk space (MB)  Projects   Users      Notes
Unfuddle       200              1          2
ProjectLocker  500              Unlimited  5
Origo          ?                ?          ?          Origo is an OpenSource project in itself and provides both the product so you can host it yourself and online services. I couldn’t find any information on online limits though
XP-Dev         200              2          Unlimited  I used it once but found to my dismay they erased my account because of months of inactivity
Beanstalk      100              ?          3
From a purely numerical point of view, and paid features aside, my pick will surely be ProjectLocker. This is not to say the others are bad, but its free account also has some very interesting features:

  • Unlimited Bandwidth Transfer
  • SSL Encryption (very unusual for a free service)
  • Redundant RAID storage and Nightly Backups

I would definitely be interested in knowing which free Subversion provider you use for your closed-source projects.

Disclaimer: I haven’t had any contact with any person working for ProjectLocker. I’m not an employee of ProjectLocker. The above only reflects my unbiased opinion.

Hibernate hard facts – Part 4

February 17th, 2010 Nicolas Frankel No comments

In the fourth article of this series, I will show the subtle differences between the get() and load() methods.

Hibernate, like life, can be full of surprises. Today, I will share one with you: have you ever noticed that Hibernate provides you with 2 methods to load a persistent entity from the database tier? These two methods are get(Class, Serializable) and load(Class, Serializable) of the Session class and their respective variations.

Strangely enough, they both have the same signature. Strangely enough, both of their API descriptions start the same:

Return the persistent instance of the given entity class with the given identifier.

Most developers use them interchangeably. It is a mistake since, if the entity is not found, get() will return null whereas load() will throw a Hibernate exception. This is well described in the API:

Return the persistent instance of the given entity class with the given identifier, assuming that the instance exists. You should not use this method to determine if an instance exists (use get() instead). Use this only to retrieve an instance that you assume exists, where non-existence would be an actual error.

Truth be told, the real difference lies elsewhere: the get() method returns an instance, whereas the load() method returns a proxy. Not convinced? Try the following code snippet:

Session session = factory.getCurrentSession();

Owner owner = (Owner) session.get(Owner.class, 1);

// Test the class of the object
assertSame(owner.getClass(), Owner.class);

The test passes, asserting that the owner’s class is in fact Owner. Now, in another session, try the following:

Session session = factory.getCurrentSession();

Owner owner = (Owner) session.load(Owner.class, 1);

// Test the class of the object
assertNotSame(owner.getClass(), Owner.class);

The test will pass too, asserting that the owner’s class is not Owner. If you inspect the object in the debugger, you’ll see a Javassist-proxied instance whose fields are not initialized! Notice that in both cases, you are able to safely cast the instance to Owner. Calling getters will also return expected results.

Why call the load() method then? Because it returns a proxy, it won’t hit the DB until a getter method is called.
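
The deferred loading behind load() can be imitated with the JDK alone. The sketch below is not Hibernate's implementation (Hibernate generates a Javassist subclass, whereas a JDK dynamic proxy needs an interface): OwnerView and LazyLoading are hypothetical names, and the Supplier stands for the database hit.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

/** Hypothetical entity contract; a JDK proxy can only implement interfaces. */
interface OwnerView {
    String getName();
}

/** Hypothetical helper that builds a lazily initialized proxy. */
final class LazyLoading {

    /** Returns a proxy that calls the loader only when a method is first invoked. */
    static OwnerView lazy(final Supplier<OwnerView> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private OwnerView target; // stays null until the first method call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                if (target == null) {
                    target = loader.get(); // the simulated database hit
                }
                return method.invoke(target, args);
            }
        };
        return (OwnerView) Proxy.newProxyInstance(
                OwnerView.class.getClassLoader(),
                new Class<?>[] {OwnerView.class},
                handler);
    }
}
```

Creating the proxy costs nothing; the loader runs on the first getName() call and only once, which mirrors what happens when a getter is called on an instance returned by load().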

Moreover, these features are also available in JPA from the EntityManager, respectively with the find() and getReference() methods.

Yet, both behaviours are modified by Hibernate’s caching mechanism. Try the following code snippet:

// Loads the reference
session.load(Owner.class, 1);

Owner owner = (Owner) session.get(Owner.class, 1);

According to what was said before, owner’s real class should be the real McCoy. Dead wrong! Since Hibernate previously called load(), the get() looks in the Session cache (the 1st level one) and returns a proxy!

The behaviour is symmetrical with the following test, which will pass although it’s counter-intuitive:

// Gets the object
session.get(Owner.class, 1);

// Loads the reference, but looks for it in the cache and loads
// the real entity instead
Owner owner = (Owner) session.load(Owner.class, 1);

// Test the class of the object
assertSame(owner.getClass(), Owner.class);

Conclusion: Hibernate does a wonderful job at making ORM easier. Yet, it’s not an easy framework: be very wary of subtle differences in behaviour.

The sources for the entire hard facts series are available here in Eclipse/Maven format.

Categories: Java

Context root tweaking

February 1st, 2010 Nicolas Frankel No comments

JEE never ceases to amaze me. Even when I think I’m on top and I know all there is to know about webapps, I’m in for a surprise. Good news is, whatever you think you know about a subject, there’s still room for one more fact. Bad news is, I’m deeply disturbed by what I learned.

Fact is: a web application’s context root can contain the / character, meaning a URL such as http://localhost/multipart/context can refer to the root of the multipart/context webapp as well as to the context servlet mapping of the http://localhost/multipart webapp. When I was told that, my first reaction was disbelief. I immediately hurried to run a few tests on the JOnAS and JBoss application servers and they confirmed that it was entirely possible.

In fact, if you look at the application.xml XML Schema, you see that the context-root is of a type that extends string (with an id):

This means the context root can effectively contain anything, including slashes, backslashes and what have you.

The J2EE 1.4 specification does not enforce additional constraints (p 125):

Each web module must be given a distinct and non-overlapping name for its context root. [...] See the servlet specification for detailed requirements of context root naming.

I did not see additional requirements in the Servlet 2.4 specification for the context root.

Tomcat even takes this tweak into account when creating a Context with individual XML files. It calls it “multi-level context paths”:

In individual files (with a “.xml” extension) in the $CATALINA_HOME/conf/[enginename]/[hostname]/ directory. The name of the file (less the .xml) extension will be used as the context path. Multi-level context paths may be defined using #, e.g. foo#bar.xml for a context path of /foo/bar.

So, in order to deploy the context.war under the multipart/context root, you’ll have to name the context XML file multipart#context.xml.

The content of the file is the following:

<?xml version='1.0' encoding='utf-8'?>
<Context docBase="${catalina.home}/context.war" />
<!-- this means the context.war should be available at the root of Tomcat -->
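
The naming rule quoted from the Tomcat documentation is easy to express in code. The helper below is a hypothetical illustration, not Tomcat's own code, and it deliberately ignores any special file names Tomcat may treat differently: it strips the .xml extension and turns every # into a /.

```java
/** Hypothetical helper that mirrors the documented "#" naming rule. */
final class ContextPaths {

    /** Maps a context descriptor file name such as "foo#bar.xml" to "/foo/bar". */
    static String fromFileName(String fileName) {
        if (!fileName.endsWith(".xml")) {
            throw new IllegalArgumentException("expected a .xml file name: " + fileName);
        }
        String base = fileName.substring(0, fileName.length() - ".xml".length());
        return "/" + base.replace('#', '/');
    }
}
```

With it, multipart#context.xml maps to /multipart/context, the context root used throughout this article.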

In the tests I’ve made, if the multi-level context path shadows a classic context root with a servlet mapping, the former takes precedence. I do not advise relying on this: since the case is not covered by the specification, it could be handled differently from product to product.
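
One rule that would produce the behaviour I observed is longest-prefix matching. The sketch below is only my assumption of such a rule, written to illustrate the shadowing; ContextResolver is a hypothetical class and nothing here is taken from a server's source code.

```java
import java.util.List;

/** Hypothetical resolver: the longest context path matching the request wins. */
final class ContextResolver {

    /** Returns the longest context path that prefixes the request path, or "" if none does. */
    static String resolve(List<String> contextPaths, String requestPath) {
        String best = "";
        for (String context : contextPaths) {
            boolean matches = requestPath.equals(context)
                    || requestPath.startsWith(context + "/");
            if (matches && context.length() > best.length()) {
                best = context;
            }
        }
        return best;
    }
}
```

Under this rule, a request to /multipart/context/index.jsp goes to the /multipart/context webapp even though /multipart also matches, so the servlet mapping of the latter is never consulted.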

OK, now we’ve established these facts, what is the point in knowing this, aside from setting you as the übergeek in a JEE geek convention? Since http://localhost/multipart/context and http://localhost/multipart are clearly separated web applications, they do not even share context.

In my case, that was the solution for a simple use-case. Imagine monitoring web applications: you know the URL you have to monitor. Now a product, developed in-house for diagnostic purposes, is deployed side-by-side with each webapp in webapp form. It would be very nice if you could reach the diagnostics webapp without looking at the documentation to know its URL. Let’s say it is the business webapp’s URL appended with /diag.

So, if the main webapp’s URL is http://myserver.com/mywebapp and the person in charge of monitoring knows about it, he knows he has to access http://myserver.com/mywebapp/diag and he gets what he wants! On the development side, it means both webapps are different products, are developed by different teams and have different lifecycles.

Categories: JEE

Typology of Rich Client technologies

January 22nd, 2010 Nicolas Frankel 2 comments

Chances are, if you are a software architect and need to make applications available over the network, you are asking yourself what client technology to use. In this article, I will try to list the solutions used previously and then elaborate on the solutions available today.

History

At the beginning, applications were solely deployed on a big machine. Everything happened there. To use the software, you needed to use a poor user interface aptly named a passive terminal. It output text: if you were lucky, you had colors! It was the one tier era.

The next phase started when the cost of micro-computers went down. Every enterprise wanted to have some. The software industry evolved and moved the applications from the mainframe to the individual computers. It is a move that is still not finished (some companies still have mainframes) but the proof of it is that not many enterprises are migrating from clustered servers to mainframes. Data stayed on the server though: this was the 2-tier era.

Anyway, as software began to flourish on the user’s computer, so did the problems. At first, it was only technical problems: since hard drive space was limited, software vendors toyed with the ability to share libraries among applications. Thus started the infamous DLL hell that sometimes persists today. But the real problems appeared when the number of machines grew exponentially: the deployment of an application on 1′000 or 10′000 computers became a project in itself.

These disadvantages (and the associated growing costs) made some people think laterally. What if the software was installed on a single node and the user accessed this node through the network? This could solve both the DLL hell and the deployment cost problems. Some devious minds even had the technology to do that just at hand: HTML over HTTP. It was the 3-tier era and we still live in it. Even though you can always add more tiers, it doesn’t change the fundamental paradigm.

Limits of the HTML over HTTP model

HTML and HTTP have fundamental flaws for use as an application interface, though. HTML’s real motivation is to navigate through a text document’s maze and HTTP’s only goal is to transmit HTML over the network.

Thus, the use of these technologies in ways that were not originally intended now shows its limits:

  • browsers are not compatible. That’s sometimes the case not only between two versions of the same browser but between the same version on two different operating systems
  • the complexity and the costs are transferred from the deployment phase to the development phase. A JEE developer has to know the following stack: Java, Servlet, JSP, HTML, JavaScript and CSS, just to name a few
  • the document centric model of HTTP forces the reload of the entire page, even if just a few elements have changed. The network speed becomes the critical factor
  • users complain about the limited user interactions. Ever tried to filter a combo box in pure HTML?

With the help of Microsoft, AJAX* came to the rescue for the last two problems: AJAX lets you update part(s) of the page without a complete reload and gives you access to widgets that offer professional user interaction.

Unfortunately, AJAX aggravates the first two problems: what is the support of this new technology among the browsers? Creating the XMLHttpRequest object depends on whether the browser is Internet Explorer or not. Even worse, the technology stack now has one more component. You could argue that using an AJAX framework can ease development, but the problem stays the same: one more component in any case.

For those of you who may think of HTML 5 as the panacea, I will respectfully remind you that the HTML 4 specification is dated 1999 and that the HTML 5 specification is still in draft: 10 years in our trade is likely to kill any good idea before its birth. But that’s a post for another time; back on track, please.

Typology

That said, today’s landscape is flooded with Rich Client technologies aimed at enhancing or replacing our familiar technologies. I will present here only technologies that keep the single point of deployment paradigm (clustering excluded), meaning applications should be able to update (or be updated) quickly and painlessly.

The following picture is a shot at classifying presentation layer technologies in the Java ecosystem. Notice that some may be used in other ecosystems too since they are technology neutral from a server point of view.

Words of advice:

  • this typology tries to dispatch very different technologies into cleanly separate categories. As such, some are more artificial than others. Moreover, some techs may legitimately belong to more than one category
  • don’t hold it against me if your favorite tech isn’t shown above. I do not have complete knowledge of a fast-moving and innovative landscape

Evolutions

On the right wing, we see technologies that represent a real evolution of the client. They all run over a virtual machine installed on the client computer.

From there, the main difference is whether the application is displayed through the browser or directly. The former are still somehow tied to the browser, such as Java Applets**, Flex, which uses the Flash plugin, or even Firefox plugins. Each of them has a standalone equivalent.

The latter represent a new frame of mind where applications are clearly distinct from sites and are available separately from the tool used to browse sites. Apart from Air that is inspired by the JVM but with higher level considerations, two technologies are worth noting:

  • The JNLP file format, which is an external deployment descriptor available on the web for any Java application. Implementations include Java Web Start, Sun’s solution, and NetX, an Open Source alternative which is no longer maintained. Both have the ability to read the JNLP in order to download, deploy and run the described application. They both use the JVM
  • Eclipse Rich Client Platform, which is a basis to create applications upon. Such applications can have the ability to update themselves virtually for free: only declarative use of such a feature is needed. Eclipse RCP also uses a JVM to run

Enhancement

On the left wing, we have technologies that want to enhance the current HTTP – HTML – DOM – ECMAScript – CSS stack. This branch is dependent on the browser’s implementation of the previous technologies specifications, especially ECMAScript. The main subdivision criterion is how the files are generated:

  • the first branch manipulates DOM on the client side. This is the domain of ECMAScript with products such as jQuery, Dojo and others.
  • the second branch creates HTML and JavaScript on the server side at runtime. Those frameworks are nowadays clearly component oriented and such is the realm of JSF
  • the third branch also creates HTML and JavaScript on the server side but at compile time. This is the original approach of GWT

Choice criteria

With all these technologies available, I think now is the right time for an enterprise to consider what rich client tech to use since the HTML paradigm is at an end and there are plenty (too many?) to choose from.

Five or so years ago, when doing Java development, you automatically chose Struts. Nowadays, the question needs to be asked. But first, what criteria shall be used in order to evaluate the solution(s)? IMHO, these criteria shouldn’t all be technically oriented. The provider or the licence type are equally important because durability is at the heart of the matter: some IS are built to last with 10 years in mind! The solution chosen will become the company’s standard, so it is a matter of investment and nobody likes wasting money, shareholders even less.

*In the rest of the post, AJAX will be the catchall word for both true AJAX and AJAJ (with JSON instead of XML)
**It is my personal opinion that applets, despite technical drawbacks, were years ahead of their time and for this reason, didn’t catch on. The Apache Pivot project is a framework built upon Applet

Categories: Development

My first try at Flex

January 21st, 2010 Nicolas Frankel No comments

Flex is now approaching its 4th version. Ever since its start, it has looked promising. Until some time ago, I didn’t look much into it because it was not open enough. Two years ago, in order to use Flex applications freely, you had to limit yourself to a small subset of Flex.

This has changed. At the end of last year, though, my client gave me the opportunity to prototype a simple application and I dived into it. This development gave me an insight into BlazeDS, Spring BlazeDS and Flex itself. Of course, I realized I barely scratched the surface of it. In order to go further into the user interface itself, I created my resume in Flex.

This screen (I can’t call it an application) has the following features:

  • Resource loading: I made the screen so that the resume is available outside it. I use the URLLoader object and react according to success / failure events
  • XML parsing: the resume is available in XML format and each part of the screen uses part of it
  • HTML formatting. Since the XML Schema I use is Europass, I cannot do what I want. In order to display things the way I want them, I had to clutter the XML with HTML tags and use the HTML formatting through Flex
  • Internationalization: the feature is done, I just have to translate the resume itself in English (no small feat for me) and German (will need help!)
  • Mavenization: the entire build is done with the help of Maven and the excellent FlexMojos. The latter let me build Flex without the need of FlexBuilder
  • Components: I used modular components so that the resume displayed at the center of the screen can be shared (provided you supply the resume with valid XML). Anyone interested?

Due to my lack of knowledge, the implementation of only these took me some time. Yet, I’m very positive about the technology.

I only discovered a problem lately: not all the classes documented in the FlexDocs are available freely. Some, such as the AdvancedDataGrid, are not included in framework.swc and are only usable for a fee. Even though I’m French (and as such labeled as a communist in the US), I understand Adobe’s strategy to make some components free and some not. IBM does the same with Eclipse plugins. What I find devious is that the documentation is aggregated and does not make such a distinction.

Anyway, my resume is here.

http://livedocs.adobe.com/flex/3/langref/index.html
Categories: Development

Discover Spring authoring

December 29th, 2009 Nicolas Frankel No comments

In this article, I will describe a useful but much underused feature of Spring, the definition of custom tags in the Spring beans definition files.

Spring namespaces

I will begin with a simple example taken from Spring’s documentation. Before version 2.0, only a single XML schema was available. So, in order to make a constant available as a bean, and thus inject it in other beans, you had to define the following:

<bean id="java.sql.Connection.TRANSACTION_SERIALIZABLE"
    class="org.springframework.beans.factory.config.FieldRetrievingFactoryBean" />

Spring made it possible, but realize it’s only a trick to expose the constant as a bean. However, since Spring 2.0, the framework lets you use the util namespace so that the previous example becomes:

<util:constant static-field="java.sql.Connection.TRANSACTION_SERIALIZABLE"/>
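To make the trick more concrete, here is a minimal sketch of what both declarations boil down to: reading a public static field by its fully qualified name. This is plain reflection written for illustration, not Spring's actual code, and the StaticFieldLookup class and its resolve() method are hypothetical names.

```java
import java.lang.reflect.Field;

// Hypothetical helper, not part of Spring: resolves "package.Class.FIELD" to the value of that static field
class StaticFieldLookup {

    static Object resolve(String staticField) {
        int lastDot = staticField.lastIndexOf('.');
        if (lastDot < 0) {
            throw new IllegalArgumentException("Expected package.Class.FIELD but got: " + staticField);
        }
        // Everything before the last dot is the class name, the rest is the field name
        String className = staticField.substring(0, lastDot);
        String fieldName = staticField.substring(lastDot + 1);
        try {
            Field field = Class.forName(className).getField(fieldName);
            // A null target is allowed because the field is static
            return field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot read static field " + staticField, e);
        }
    }

    public static void main(String[] args) {
        // Prints the int value of the JDBC isolation level constant
        System.out.println(resolve("java.sql.Connection.TRANSACTION_SERIALIZABLE"));
    }
}
```

The returned object is what would then be injected into other beans in place of the constant.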

In fact, there are many namespaces now available:

Prefix | Namespace | Description
beans | http://www.springframework.org/schema/beans | Original bean schema
util | http://www.springframework.org/schema/util | Utilities: constants, property paths and collections
jee | http://www.springframework.org/schema/jee | JNDI lookup
lang | http://www.springframework.org/schema/lang | Use of other languages
tx | http://www.springframework.org/schema/tx | Transactions
aop | http://www.springframework.org/schema/aop | AOP
context | http://www.springframework.org/schema/context | ApplicationContext manipulation

Each of these is meant to reduce verbosity and increase readability, as the first example showed.

Authoring

What is still unknown to many is that this feature is extensible: the Spring API provides you with the means to write your own namespaces. In fact, many framework providers should take advantage of this and provide their own namespaces so that integrating their product with Spring becomes easier. Some already do: CXF with its many namespaces comes to mind, but there are probably others I don’t know of.

Creating your own namespace is a 4-step process: two steps for the XML validation, the other two for creating the bean itself. In order to illustrate the process, I will use a simple example: I will create a schema for EhCache, Hibernate’s default caching engine.

The underlying bean factory will be the existing EhCacheFactoryBean. As such, it won’t be as useful as a real feature but it will let us focus on the true authoring plumbing rather than EhCache implementation details.

Creating the schema

Creating the schema is about describing XML syntax and more importantly restrictions. I want my XML to look something like the following:

<ehcache:cache id="myCache" eternal="true" cacheName="foo"
maxElementsInMemory="5" maxElementsOnDisk="2" overflowToDisk="false"
diskExpiryThreadIntervalSeconds="18" diskPersistent="true" timeToIdle="25" timeToLive="50"
memoryStoreEvictionPolicy="FIFO">
  <ehcache:manager ref="someManagerRef" />
</ehcache:cache>

Since I won’t presume to teach anyone about XML, here’s the schema. Just notice the namespace declaration:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://blog.frankel.ch/spring/ehcache" xmlns="http://blog.frankel.ch/spring/ehcache"
  elementFormDefault="qualified">
  <xsd:complexType name="cacheType">
    <xsd:sequence maxOccurs="1" minOccurs="0">
      <xsd:element name="manager" type="managerType" />
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:string" />
    <xsd:attribute name="cacheName" type="xsd:string" />
    <xsd:attribute name="diskExpiryThreadIntervalSeconds" type="xsd:int" />
    <xsd:attribute name="diskPersistent" type="xsd:boolean" />
    <xsd:attribute name="eternal" type="xsd:boolean" />
    <xsd:attribute name="maxElementsInMemory" type="xsd:int" />
    <xsd:attribute name="maxElementsOnDisk" type="xsd:int" />
    <xsd:attribute name="overflowToDisk" type="xsd:boolean" />
    <xsd:attribute name="timeToLive" type="xsd:int" />
    <xsd:attribute name="timeToIdle" type="xsd:int" />
    <xsd:attribute name="memoryStoreEvictionPolicy" type="memoryStoreEvictionPolicyType" />
  </xsd:complexType>
  <xsd:simpleType name="memoryStoreEvictionPolicyType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="LRU" />
      <xsd:enumeration value="LFU" />
      <xsd:enumeration value="FIFO" />
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="managerType">
    <xsd:attribute name="ref" type="xsd:string" />
  </xsd:complexType>
  <xsd:element name="cache" type="cacheType" />
</xsd:schema>

And for those, like me, that prefer a graphic display:

Mapping the schema

The schema creation is only the first part. Now, we have to make Spring aware of it. Creating the file META-INF/spring.schemas and writing the following line in it is enough:

http\://blog.frankel.ch/spring/schema/custom.xsd=ch/frankel/blog/spring/authoring/custom.xsd

Just take care to insert the backslash, otherwise it won’t work. It maps the schema declaration in the XML to the real file that will be used from the jar!
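The backslash matters because, as far as I know, Spring reads this file with the standard java.util.Properties format, where an unescaped colon ends the key. The sketch below only uses the JDK to show the difference. SchemaMappingDemo is a hypothetical class name and nothing here is Spring code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical demo class: parses one line the way a .properties file would be parsed
class SchemaMappingDemo {

    static Properties parse(String line) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot parse properties line", e);
        }
        return props;
    }

    public static void main(String[] args) {
        String location = "http://blog.frankel.ch/spring/schema/custom.xsd";
        String resource = "ch/frankel/blog/spring/authoring/custom.xsd";

        // "\\:" in Java source stands for the two characters \: written in the file
        Properties escaped = parse("http\\://blog.frankel.ch/spring/schema/custom.xsd=" + resource);
        // The key is the full schema location, mapped to the resource inside the jar
        System.out.println(escaped.getProperty(location));

        // Without the backslash, the first colon is taken as the key/value separator
        Properties unescaped = parse(location + "=" + resource);
        System.out.println(unescaped.stringPropertyNames());
    }
}
```

With the unescaped line, the only key is http, so a lookup on the schema location finds nothing.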

Before going further, and for the more curious, just notice that in spring-beans.jar (v3.0), there’s such a file. Here is its content:

http\://www.springframework.org/schema/beans/spring-beans-2.0.xsd=org/springframework/beans/factory/xml/spring-beans-2.0.xsd
http\://www.springframework.org/schema/beans/spring-beans-2.5.xsd=org/springframework/beans/factory/xml/spring-beans-2.5.xsd
http\://www.springframework.org/schema/beans/spring-beans-3.0.xsd=org/springframework/beans/factory/xml/spring-beans-3.0.xsd
http\://www.springframework.org/schema/beans/spring-beans.xsd=org/springframework/beans/factory/xml/spring-beans-3.0.xsd
http\://www.springframework.org/schema/tool/spring-tool-2.0.xsd=org/springframework/beans/factory/xml/spring-tool-2.0.xsd
http\://www.springframework.org/schema/tool/spring-tool-2.5.xsd=org/springframework/beans/factory/xml/spring-tool-2.5.xsd
http\://www.springframework.org/schema/tool/spring-tool-3.0.xsd=org/springframework/beans/factory/xml/spring-tool-3.0.xsd
http\://www.springframework.org/schema/tool/spring-tool.xsd=org/springframework/beans/factory/xml/spring-tool-3.0.xsd
http\://www.springframework.org/schema/util/spring-util-2.0.xsd=org/springframework/beans/factory/xml/spring-util-2.0.xsd
http\://www.springframework.org/schema/util/spring-util-2.5.xsd=org/springframework/beans/factory/xml/spring-util-2.5.xsd
http\://www.springframework.org/schema/util/spring-util-3.0.xsd=org/springframework/beans/factory/xml/spring-util-3.0.xsd
http\://www.springframework.org/schema/util/spring-util.xsd=org/springframework/beans/factory/xml/spring-util-3.0.xsd

This prompts a few remarks:

  • Spring eats its own dog food (that’s nice to know)
  • I didn’t look into the code, but I think that’s why XML validation of Spring’s bean files never complains about not finding the schema over the Internet (a real pain in production environments because of firewall restrictions): the XSDs are looked up inside the jar
  • If you don’t specify the version of the Spring schema you use (2.0, 2.5, 3.0, etc.), Spring will automatically upgrade it for you with each major/minor version of the jar. If you want this behaviour, fine; if not, you’ll have to specify the version

Creating the parser

The previous steps are only meant to validate the XML so that, for example, the eternal attribute only accepts a boolean value. We still have not wired our namespace into the Spring factory. This is the goal of this step.

The first thing to do is create a class that implements org.springframework.beans.factory.xml.BeanDefinitionParser. Looking at its hierarchy, org.springframework.beans.factory.xml.AbstractSimpleBeanDefinitionParser seems a good entry point since:

  • the XML is not overly complex
  • there will be a single bean definition

Here’s the code:

import java.util.ArrayList;
import java.util.List;

import net.sf.ehcache.store.MemoryStoreEvictionPolicy;

import org.springframework.beans.factory.support.BeanDefinitionBuilder;
import org.springframework.beans.factory.xml.AbstractSimpleBeanDefinitionParser;
import org.springframework.beans.factory.xml.ParserContext;
import org.springframework.cache.ehcache.EhCacheFactoryBean;
import org.springframework.util.StringUtils;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class EhCacheBeanDefinitionParser extends AbstractSimpleBeanDefinitionParser {

  // Attributes copied as-is to the bean property of the same name
  private static final List<String> PROP_TAG_NAMES;

  static {

    PROP_TAG_NAMES = new ArrayList<String>();

    PROP_TAG_NAMES.add("eternal");
    PROP_TAG_NAMES.add("cacheName");
    PROP_TAG_NAMES.add("maxElementsInMemory");
    PROP_TAG_NAMES.add("maxElementsOnDisk");
    PROP_TAG_NAMES.add("overflowToDisk");
    PROP_TAG_NAMES.add("diskExpiryThreadIntervalSeconds");
    PROP_TAG_NAMES.add("diskPersistent");
    PROP_TAG_NAMES.add("timeToLive");
    PROP_TAG_NAMES.add("timeToIdle");
  }

  @Override
  protected Class getBeanClass(Element element) {

    return EhCacheFactoryBean.class;
  }

  @Override
  protected boolean shouldGenerateIdAsFallback() {

    return true;
  }

  @Override
  protected void doParse(Element element, ParserContext parserContext, BeanDefinitionBuilder builder) {

    // Simple attributes: only set the property when the attribute is present
    for (String name : PROP_TAG_NAMES) {

      String value = element.getAttribute(name);

      if (StringUtils.hasText(value)) {

        builder.addPropertyValue(name, value);
      }
    }

    // The nested manager element points to another bean: add it as a reference
    NodeList nodes = element.getElementsByTagNameNS("http://blog.frankel.ch/spring/ehcache", "manager");

    if (nodes.getLength() > 0) {

      builder.addPropertyReference("cacheManager",
        nodes.item(0).getAttributes().getNamedItem("ref").getNodeValue());
    }

    // The eviction policy is converted from its string form to the EhCache object
    String msep = element.getAttribute("memoryStoreEvictionPolicy");

    if (StringUtils.hasText(msep)) {

      MemoryStoreEvictionPolicy policy = MemoryStoreEvictionPolicy.fromString(msep);

      builder.addPropertyValue("memoryStoreEvictionPolicy", policy);
    }
  }
}

That deserves some explanations. The static block fills in what attributes are valid. The getBeanClass() method returns what class will be used, either directly as a bean or as a factory. The shouldGenerateIdAsFallback() method is used to tell Spring that when no id is supplied in the XML, it should generate one. That makes it possible to create pseudo-anonymous beans (no bean is really anonymous in the Spring factory).

The real magic happens in the doParse() method: it just adds every simple property it finds to the builder. There are two interesting properties though: cacheManager and memoryStoreEvictionPolicy.

The former, should it exist, is a reference to another bean. Therefore, it should be added to the builder not as a value but as a reference. Of course, the code doesn’t check whether the developer declared the cache manager as an anonymous bean inside the ehcache element, but the schema validation already took care of that.

The latter just uses the string value to get the real object behind it and adds it as a property to the builder. Likewise, since the value was enumerated in the schema, exceptions caused by bad syntax cannot happen.
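The DOM calls used in doParse() can be tried outside Spring. The following sketch, which only needs the JDK, parses a shortened version of the example XML and performs the same lookups. ManagerRefDemo and its methods are hypothetical names, and the namespace-aware parser is my assumption about what a similar test would need, not a statement about how Spring builds its documents.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class reproducing the DOM lookups of doParse() without Spring
class ManagerRefDemo {

    static final String NS = "http://blog.frankel.ch/spring/ehcache";

    // Parses the XML string and returns its root element
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Needed so that getElementsByTagNameNS() can match on the namespace
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse XML", e);
        }
    }

    // Same lookup as in doParse(): the ref of the first nested manager element, or null
    static String managerRef(Element cache) {
        NodeList nodes = cache.getElementsByTagNameNS(NS, "manager");
        if (nodes.getLength() > 0) {
            return ((Element) nodes.item(0)).getAttribute("ref");
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<ehcache:cache xmlns:ehcache=\"" + NS + "\" id=\"myCache\" eternal=\"true\">"
            + "<ehcache:manager ref=\"someManagerRef\"/>"
            + "</ehcache:cache>";
        Element cache = parseRoot(xml);
        // A declared attribute comes back with its value
        System.out.println(cache.getAttribute("eternal"));
        // A missing attribute comes back as an empty string, not null
        System.out.println(cache.getAttribute("timeToLive").isEmpty());
        System.out.println(managerRef(cache));
    }
}
```

The empty string returned for a missing attribute is the reason the parser guards each property with StringUtils.hasText().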

Registering the parser

The last step is to register the parser in Spring. First, you just have to create a class that extends org.springframework.beans.factory.xml.NamespaceHandlerSupport and register the parser under the XML tag name in its init() method:

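import org.springframework.beans.factory.xml.NamespaceHandlerSupport;

public class EhCacheNamespaceHandler extends NamespaceHandlerSupport {

  // Invoked by Spring after construction, before any custom element is parsed
  public void init() {

    registerBeanDefinitionParser("cache", new EhCacheBeanDefinitionParser());
  }
}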

Should you have more parsers, just register them in the same method under each tag name.

Second, just map the formerly created namespace to the newly created handler in a file META-INF/spring.handlers:

http\://blog.frankel.ch/spring/ehcache=ch.frankel.blog.spring.authoring.ehcache.EhCacheNamespaceHandler

Notice that you map the declared schema file to the real schema but the namespace to the handler.

Conclusion

Now, when faced with overly verbose bean configuration, you have the option of using this nifty 4-step technique to simplify it. This technique is of course more oriented toward product providers but can be used by projects, provided the time taken to author a namespace is a real gain over normal bean definitions.

You will find the sources for this article here in Maven/Eclipse format.

To go further

  • Spring authoring: version 2.0 is enough since nothing changed much (at all?) with following versions
  • Spring’s Javadoc relative to authoring

HTML UI over a REST backend

December 26th, 2009 Nicolas Frankel No comments

In this article, I will show you that forms sent by browsers to a server can only use GET and POST HTTP methods and which workaround to apply in order to still use a REST backend.

REST web-services are becoming more and more common in today’s landscape. We can only guess at the reasons, but I suspect that SOAP’s structured approach is only useful in the most complex use-cases and is overkill the rest of the time. On the contrary, REST is quite easy to set up.

In the Java world, REST has its own specification, JAX-RS with JSR 311, and is supported by all major frameworks :

In my humble opinion, it is not inconceivable that, given the adoption of REST, the next step would be the development of REST backends accessible by standard clients and browsers alike, in order to factorize code and thus decrease costs.

Traditionally, making a resource accessible on the web is done through a hyperlink. Hyperlinks use the GET method. Likewise, REST is based on using the HTTP methods where each has an associated verb:

HTTP method | Verb
POST | Create
GET | Read
PUT | Update
DELETE | Delete

Unluckily, the real world comes into play. The HTML 4 specification stands as a big obstacle in the way:

The FORM element [...]

Method: this attribute specifies which HTTP method will be used to submit the form data set. Possible (case-insensitive) values are “get” (the default) and “post”.

The implication is that HTML 4 compliant browsers will only let you read and create but not update nor delete. For example, the versions of Firefox and Internet Explorer I used, respectively 3.5.6 and 6.0 (!) use POST for POST and GET for everything else. That’s a real case against using REST. There are some workarounds, though.
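The mapping and the HTML 4 restriction can be put side by side in a few lines of Java. This is only an illustration of the table and of the quoted part of the specification. The CrudVerb enum and its method names are hypothetical.

```java
// Hypothetical enum, named for this illustration only
enum CrudVerb {

    CREATE("POST"), READ("GET"), UPDATE("PUT"), DELETE("DELETE");

    private final String httpMethod;

    CrudVerb(String httpMethod) {
        this.httpMethod = httpMethod;
    }

    String httpMethod() {
        return httpMethod;
    }

    // HTML 4 only allows "get" and "post" as values of a form's method attribute
    boolean submittableByHtml4Form() {
        return httpMethod.equals("GET") || httpMethod.equals("POST");
    }

    public static void main(String[] args) {
        // Lists each verb with its HTTP method and flags those a plain form cannot send
        for (CrudVerb verb : values()) {
            System.out.println(verb + " -> " + verb.httpMethod()
                + (verb.submittableByHtml4Form() ? "" : " (not available to an HTML 4 form)"));
        }
    }
}
```

Only CREATE and READ pass the check, which is exactly the half of CRUD a plain form can reach.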

The first one is not to use the HTTP method for the verb but to put it in the URL. Thus

  • to access a customer, one would use http://frankel.ch/get/customer/1
  • and to remove one, http://frankel.ch/delete/customer/1

This clearly violates REST’s principle of using the HTTP method for the verb but this works for simple cases.
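On the server side, this first workaround means the verb has to be read from the path instead of the HTTP method. Here is a minimal sketch of that parsing, assuming the /verb/resource/id layout of the two URLs above. VerbInUrl is a hypothetical class and a real backend would do this inside a servlet or a framework.

```java
import java.net.URI;

// Hypothetical helper: extracts verb, resource and id from URLs such as http://frankel.ch/delete/customer/1
class VerbInUrl {

    // Returns {verb, resource, id}
    static String[] parse(String url) {
        String path = URI.create(url).getPath();
        // The leading slash yields an empty first element
        String[] parts = path.split("/");
        if (parts.length != 4 || !parts[0].isEmpty()) {
            throw new IllegalArgumentException("Expected /verb/resource/id but got: " + path);
        }
        return new String[] { parts[1], parts[2], parts[3] };
    }

    public static void main(String[] args) {
        String[] parts = parse("http://frankel.ch/delete/customer/1");
        System.out.println("verb=" + parts[0] + ", resource=" + parts[1] + ", id=" + parts[2]);
    }
}
```

Since a hyperlink is always a GET, every operation, including the removal, then arrives at the server with the same HTTP method.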

The second workaround is a bit more complex but it has two big advantages: it conforms to REST principles and it is usable with most modern browsers.

The trick is to use a neat little object created by Microsoft but now implemented everywhere, the XMLHttpRequest object. Yes, this is the object at the foundation of Ajax, but it is also the solution to our problem. XMLHttpRequest’s open() method takes two mandatory parameters, the first being the HTTP method to use for accessing the URL (which is the second parameter). Now you can pass the verb you want, as in the following example.

// Ask the server to delete customer 1 with a real DELETE request
var xmlhttp = new XMLHttpRequest();
xmlhttp.open('DELETE', 'http://frankel.ch/customer/1');
xmlhttp.send(null);

Moreover, you can use not only the 4 CRUD verbs but also any verb you want.
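For comparison, a non-browser client written in Java can prepare the same DELETE with the JDK's HttpURLConnection. This is a sketch under assumptions: RestDeleteSketch and prepare() are hypothetical names, the URL is the one from the script above, and the connection is only prepared, never opened.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: prepares, without sending it, a request using the given HTTP method
class RestDeleteSketch {

    static HttpURLConnection prepare(String method, String url) {
        try {
            // openConnection() only creates the object, no network traffic happens yet
            HttpURLConnection connection = (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setRequestMethod(method);
            return connection;
        } catch (IOException e) {
            throw new IllegalStateException("Cannot prepare " + method + " on " + url, e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection connection = prepare("DELETE", "http://frankel.ch/customer/1");
        // The request would actually be sent by calling connection.getResponseCode()
        System.out.println(connection.getRequestMethod());
    }
}
```

Note that HttpURLConnection is stricter than XMLHttpRequest on this point: setRequestMethod() only accepts a fixed list of standard methods and rejects any other verb.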

This solution has two main limitations:

  • it requires modern-day browsers since the XMLHttpRequest object is not available in more ancient ones
  • it requires JavaScript, so you need to enable it on client browsers and you need to manage browser compatibilities. For example, the above script works with Firefox 3.5 but not IE 6 (and to be blunt, I’m not really bothered by it)

It seems this hack will be rendered useless by the HTML 5 specification, which declares (in its draft state) that all 4 basic CRUD verbs will have to be supported.

You can find an example for this article here. To use it, navigate to http://localhost:8080/httpmethods/ if deployed in Eclipse. It was tested on Firefox 3.5 and works flawlessly there. On the contrary, it doesn’t work in IE 8 since my script is not crafted to be cross-browser. Tomcat users have to add JSTL capabilities to Tomcat (meaning adding standard.jar and jstl.jar in the commons/lib directory).

To go further :

Categories: JEE