Archive for March, 2010

Let others work for you with Selenium

March 29th, 2010 No comments

With a provocation this big, I hope I’ve caught your attention. So, let’s draw some lines: the objective is, of course, not to let others do your work. It is to do the work once and share it between the client, the business analyst and the person who will run the acceptance tests.

From my point of view, all these people do the same work:

  • the client wants features. As such, he will test those features when the product is delivered
  • the business analyst will describe those features in detail. He has to know the intricacies of each use case
  • the person in charge of the acceptance tests will likely run them more than once, probably with each delivery

As such, it is a mistake for these scenarios to be written thrice. First, the client should describe the acceptance tests that he will run: it is on these scenarios that the product will be accepted or not. The business analyst then refines them when he describes the business. Finally, the acceptance tests are run.

Why not let one of the actors do the work? In fact, for a webapp, such a thing is possible with Selenium. Selenium is a suite of tools to automate your web acceptance tests. Let’s just say that one of them, Selenium IDE, is a neat little plugin that finds its way into Firefox (one more reason to use it) and acts as a recorder for interactions and page flow. The recording can be exported in many formats (JUnit and TestNG for Java, but other languages are supported), modified, then put in a continuous build.

Let’s imagine that one of our trio records the use-cases. He can then share them with the other two, so that all three have the same scenarios to run.

On one of my latest projects, I convinced the project manager to use Selenium IDE. He seems quite happy with it, so I hope I will be able to go further down the route I described earlier. I would be very interested in your own experience in this field.

Categories: Development

Bean validation and JSR 303

March 22nd, 2010 14 comments

In this article, I will show you how to use the new Bean Validation Framework aka JSR-303.

The legacy

Before getting to the result that is JSR 303, aka the Bean Validation framework, there were two interesting attempts at validation frameworks. Each came from one end of the layers and focused on its own scope.

Front-end validation

Struts was the framework to learn and use on the presentation layer in 2001-2002. Struts uses the MVC model and focuses on Controllers, which are represented in Struts by Action. Views are plain JSP, and Struts uses ActionForm in order to pass data from Controllers to Views and vice-versa. In short, those are POJOs that the framework uses to interact with the View.

As a presentation layer framework, Struts’ concern is validating user input. Action forms have a nifty method called validate(). The signature of this method is the following:

public ActionErrors validate(ActionMapping mapping, HttpServletRequest request)

The developer has to check whether the form is valid, then fill the ActionErrors object (basically a List) if it is not. Struts then redirects the flow to an error page (the input) if the ActionErrors object is not empty. Since manual checking is boring and error-prone, it may be a good idea to automate such validation. Even at that time, declarative validation was considered to be the thing.

This is the objective of Apache Commons Validator. Its configuration is made through XML. You specify:

  • the validators you have access to. There are some built-in but you can add your own
  • the associations between beans and validators: which beans will be validated by which rules

Though Struts tightly integrates Commons Validator, you can use the latter entirely separately. However, the last stable version (1.3.1) was released late 2006. The version currently in development is 1.4, but the Maven site hasn’t been updated since early 2008. It is a bit too neglected for my own taste, so I rule it out for my validation needs, save when I am forced to use Struts.

In that case, it is mandatory for me to use it, since the Struts plugin knows how to use both XML configuration files to also produce JavaScript client-side validation.

Back-end validation

Previously, we saw that the first validation framework came from user input. At the other end of the spectrum, inserting/updating data does not require such validation since constraints are enforced in the database. For example, trying to insert a 50-character string into a VARCHAR(20) column will fail.

However, letting the database handle validation has two main drawbacks:

  • it has a performance cost, since you need to connect to the database, send the request and handle the error
  • such an error cannot easily be mapped to a Java exception and, ideally, to the particular attribute in error

In the end, it is better to validate the domain model in the Java world, before sending data to the database. Such was the scope of Hibernate Validator. Whereas Commons Validator configuration is based on XML, Hibernate Validator is based on Java 5 annotations.

Even if Hibernate Validator was designed to validate the domain model, you could use it to validate any bean.

JSR 303 Bean Validation

Finally, JSR 303 came to fruition. Two important facts: it is end-agnostic, meaning you can use it anywhere you like (front-end, back-end, even DTO if you follow this pattern) and its reference implementation is Hibernate Validator v4.

JSR 303 features include:

  • validation on two different levels: attribute or entire bean. That was not possible with Hibernate Validator (since it was database oriented) and only possible with many limitations with Commons Validator
  • i18n-ready, with parameterized messages
  • extensible with your own validators
  • configurable with annotations or XML. In the following, only the annotation configuration will be shown

In JSR 303, validation is the result of the interaction between:

  • the annotation itself. Some come with JSR 303, but you can build your own
  • the class that will validate the annotated bean
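
To make this interaction concrete, here is a minimal sketch that uses only the JDK and none of the JSR 303 API. The @Required annotation, the Customer bean and the RequiredChecker class are hypothetical names invented for this illustration: the annotation only marks the field, and a separate class reads the mark through reflection and decides whether the value is acceptable.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker annotation: it plays the role of a constraint.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Required {
}

// Hypothetical bean carrying the annotation on one of its two fields.
class Customer {

  @Required
  String firstName;

  String nickName;
}

// Hypothetical checker: it plays the role of the validating class.
final class RequiredChecker {

  // Returns the names of all @Required fields that are null on the given bean.
  static List<String> check(Object bean) {
    List<String> failures = new ArrayList<>();
    for (Field field : bean.getClass().getDeclaredFields()) {
      if (!field.isAnnotationPresent(Required.class)) {
        continue;
      }
      field.setAccessible(true);
      try {
        if (field.get(bean) == null) {
          failures.add(field.getName());
        }
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot read field " + field.getName(), e);
      }
    }
    return failures;
  }
}
```

A real JSR 303 implementation does far more (getters, groups, messages, inheritance), but the split of responsibilities is the same: the annotation declares, the validator class checks.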

Simplest example

The simplest possible example consists of setting a not-null constraint on an attribute of a class. This is done like so:

public class Person {

  private String firstName;

  @NotNull
  public String getFirstName() {

    return firstName;
  }

  // setter
}

Note that the @NotNull annotation can be placed on the attribute or on the getter (just like in JPA). If you use Hibernate, it can also use your JSR 303 annotations in order to create/update the database schema.

Now, in order to validate an instance of this bean, all you have to do is:

Set<ConstraintViolation<Person>> violations = validator.validate(person);

If the set is empty, the validation succeeded; if not, it failed: the principle is very similar to both previous frameworks.

Interestingly enough, the specs enforce that constraints be inherited. So, if a User class inherits from Person, its firstName attribute will have a not-null constraint too.

Constraints groups

On the presentation tier, it may happen that you have to use the same form bean in two different contexts, such as create and update. In both contexts you have different constraints. For example, when creating your profile, the username is mandatory. When updating, it cannot be changed so there’s no need to validate it.

Struts (and its faithful ally Commons Validator) solve this problem by associating the validation rules not with the Java class but with the mapping, since its scope is the front-end. This is not possible when using annotations. In order to ease bean reuse, JSR 303 introduces constraint grouping. If you do not specify anything, like previously, your constraint is assigned to the default group, and, when validating, you do so in the default group.

You can also specify groups on a constraint like so:

public class Person {

  private String firstName;

  @NotNull(groups = DummyGroup.class)
  public String getFirstName() {

    return firstName;
  }

  // setter
}

So, this validation passes, since the constraint is not part of the default group:

Person person = new Person();

// Empty set
Set<ConstraintViolation<Person>> violations = validator.validate(person);

This one passes too:

Person person = new Person();

// Empty set
Set<ConstraintViolation<Person>> violations = validator.validate(person, Default.class);

And this one fails:

Person person = new Person();

// Size 1 set
Set<ConstraintViolation<Person>> violations = validator.validate(person, DummyGroup.class);
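
DummyGroup is not defined in the snippets above. In JSR 303, a group is simply an interface used as a marker, so a minimal definition could look like the following sketch (the name comes from the example; an empty body is all that is needed):

```java
// Marker interface used as a validation group.
// It declares no method: only its Class object matters, as in DummyGroup.class.
interface DummyGroup {
}
```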

Custom constraint

When done playing with the built-in constraints (and the Hibernate extensions), you will probably need to develop your own. It is very easy: constraints are annotations that are themselves annotated with @Constraint. Let’s create a constraint that checks for uncapitalized strings:

@Target( { METHOD, FIELD, ANNOTATION_TYPE })
@Retention(RUNTIME)
@Constraint(validatedBy = CapitalizedValidator.class)
public @interface Capitalized {

  String message() default "{ch.frankel.blog.validation.constraints.capitalized}";

  Class<?>[] groups() default {};

  Class<? extends Payload>[] payload() default {};
}

The 3 elements are respectively for internationalization, grouping (see above) and passing meta-data. These are all mandatory: if not defined, the framework will not work! It is also possible to add more elements, for example to parameterize the validation: the @Min and @Max constraints use this.

Notice that nothing prevents constraints from being applied to instances rather than attributes: this is defined by the @Target and is a design choice.

Next comes the validation class. It must implement ConstraintValidator<?,?>:

public class CapitalizedValidator implements ConstraintValidator<Capitalized, String> {

  public void initialize(Capitalized capitalized) {} 

  public boolean isValid(String value, ConstraintValidatorContext context) {

    return value == null || value.equals(WordUtils.capitalizeFully(value));
  }
}

That’s all! All you have to do now is annotate attributes with @Capitalized and validate instances with the framework. There’s no need to register the freshly created validator.
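
The validator above relies on WordUtils from Apache Commons Lang. If you would rather not add that dependency, the check can be approximated with the JDK alone. The following is only a sketch: the Capitalization class and its isCapitalized() method are names made up for this example, and the behaviour may differ from capitalizeFully() on edge cases such as title-case characters.

```java
final class Capitalization {

  // Returns true when every whitespace-separated word starts without a
  // lower-case letter and contains no upper-case letter afterwards.
  // null is accepted, as in the validator above.
  static boolean isCapitalized(String value) {
    if (value == null) {
      return true;
    }
    boolean startOfWord = true;
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (Character.isWhitespace(c)) {
        startOfWord = true;
        continue;
      }
      if (startOfWord && Character.isLowerCase(c)) {
        return false;
      }
      if (!startOfWord && Character.isUpperCase(c)) {
        return false;
      }
      startOfWord = false;
    }
    return true;
  }
}
```

The isValid() method could then simply return Capitalization.isCapitalized(value).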

Constraints composition

It is encouraged to create simple constraints then compose them to create more complex validation rules. In order to do that, create a new constraint and annotate it with the constraints you want to compose. Let’s create a constraint that will validate that a String is neither null nor uncapitalized:

@NotNull
@Capitalized
@Target( { METHOD, FIELD, ANNOTATION_TYPE })
@Retention(RUNTIME)
@Constraint(validatedBy = {})
public @interface CapitalizedNotNull {

  String message() default "{ch.frankel.blog.validation.constraints.capitalized}";

  Class<?>[] groups() default {};

  Class<? extends Payload>[] payload() default {};
}

Now, annotate your attributes with it and watch the magic happen!

Of course, if you want to prevent constraint composition, you’ll have to restrain the @Target values to exclude ANNOTATION_TYPE.

Conclusion

This article only brushed the surface of JSR 303. Nevertheless, I hope it was a nice introduction to its features and gave you the desire to look into it further.

You can find here the sources (and more) for this article in Eclipse/Maven format.

Categories: Java

Managing web sessions

March 8th, 2010 3 comments

In the previous article, I set up a cluster of 2 Tomcat instances in order to achieve load-balancing. It also offered failover capability. However, when using this feature, the user session was lost when changing node. In this article, I will show you how this side-effect can be avoided.

Reminder: the HTTP protocol is inherently disconnected (as opposed to FTP, which is connected). In HTTP, the client sends a request to a server, gets its response and that’s the end. The server cannot natively associate a request with a previous request made by the same client. In order to use HTTP for application purposes, we need to group such requests. Session is a label for this grouping feature.

This is done through a token, passed to the client on the first request. The token is passed with a cookie if possible, or appended to the URL if not. Interestingly enough, this is the same for PHP (token named PHPSESSID) as for JEE (token named JSESSIONID). In both cases, we use a stateless protocol and tweak it so that it appears stateful. This pseudo-statefulness makes application-level features possible, such as authentication/authorization or a shopping cart.
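
The token mechanism can be pictured with a few lines of plain Java. This is a sketch only: SessionRegistry is a hypothetical class written for this illustration, not part of any servlet container, which generates and tracks session IDs on its own and far more carefully.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry: one instance stands for one server node.
final class SessionRegistry {

  private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

  // First request: create the session data and return the token to hand to the client.
  String create() {
    String token = UUID.randomUUID().toString();
    sessions.put(token, new ConcurrentHashMap<>());
    return token;
  }

  // Later requests: return the data grouped under the token, or null if it is unknown.
  Map<String, Object> find(String token) {
    return token == null ? null : sessions.get(token);
  }
}
```

Note that a second SessionRegistry instance knows nothing about tokens created by the first one: this is exactly the situation of a client rerouted to another node.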

Now, let’s take a real use case. I’m browsing through an online shop. I have already put some articles in my cart. When I finally decide to go to the payment, I find myself with an empty basket! What happened? Unbeknownst to me, the node which hosted my session crashed and I was transparently rerouted to a working node, without my session ID, and thus, without the content of my cart.

Such a thing must not happen in real life, since such a shop would probably be put out of business if it happened too often. There are basically 3 strategies to adopt in order to avoid such a loss.

Sessions are evil

From time to time, I stumble upon articles blaming sessions and labeling them as evil. While this is not a universal truth, using sessions in a bad way can have negative side-effects.

The most representative bad usage of sessions is putting everything in them. I’ve seen lazy developers put collections in session in order to manage paging. You execute a query once, put the result in the session, and manage paging on the sessionized result. Since the collection’s size is not constrained, such use does not scale well as the number of users increases. In most cases, everything goes fine in development but you’re soon overwhelmed by unusual response times or even OutOfMemoryError in production.

If you think sessions are evil, some solutions are:

  • Store data in the database: now your data has to go from the front-end to the back-end to be saved, and then back again to be used. A classic relational database may not be the solution you’re looking for. Most high-traffic, low-response-time sites take the NoSQL route, although I don’t know if they use it for session storage purposes
  • Store data on the client side through cookies: your clients need to have a cookie-enabled browser. Furthermore, data will be sent with every request/response, so don’t overuse it

The second option has the advantage of freeing you from session ID management. However, for both solutions, your application code needs to implement the storage part.

Besides, nothing stops you from using sessions and using the extension points of your application server to use cookie storage instead of the default behaviour (mostly in memory). I wouldn’t recommend that though.

Server session replication

Another solution is to embrace session – this is a JEE feature after all – but to use session replication in order to avoid session data loss. Session replication is not a JEE feature. It is a proprietary feature, offered by many (if not all) application servers that is entirely independent from your code: your code uses session, and it is magically replicated by the server across cluster nodes.

There are two constraints common to session replication amongst all servers:

  • Use the <distributable /> tag in the web.xml
  • Only put in session instances of classes that are java.lang.Serializable

IMHO, these rules should be enforced on all web applications, whether currently deployed on a cluster or not, since they are not very restrictive. This way, deploying an application on a cluster will tend toward a no-operation.
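
The Serializable constraint is easy to check outside of any server. The sketch below, which uses only the JDK, pushes an object through Java serialization and back, which is roughly what a node has to do before sending session data elsewhere. Cart and SessionCopy are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical session attribute: it and everything it references must be Serializable.
final class Cart implements Serializable {

  private static final long serialVersionUID = 1L;

  private final List<String> items = new ArrayList<>();

  void add(String item) {
    items.add(item);
  }

  List<String> items() {
    return items;
  }
}

final class SessionCopy {

  // Serializes the value to bytes and reads it back as a distinct object.
  // Fails with an IllegalStateException if the value cannot be serialized.
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Value cannot be copied to another node", e);
    }
  }
}
```

Running such a round trip in a unit test on every class you put in session is a cheap way to catch non-serializable attributes long before the application reaches a cluster.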

Strategies available for session replication are application server dependent. However, they are usually based on the following implementations:

  • In-memory replication: each server stores all servers’ session data. When updating, it broadcasts the modified session to all nodes (or the delta, based on the strategy available/used). This implementation heavily uses the network and the memory
  • Database persistence
  • File persistence: the file system used should be available to all cluster nodes

For our simple example, I will just show you how to use in-memory session replication in Tomcat. The following are the steps one should take in order to do so. Please note that one should first undertake what is described in the Tomcat clustering article.

Note: Tomcat 5.5 has the cluster configuration commented in server.xml. Tomcat 6 does not. Here is the default clustering configuration:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
  managerClassName="org.apache.catalina.cluster.session.DeltaManager"
  expireSessionsOnShutdown="false"
  useDirtyFlag="true"
  notifyListenersOnReplication="true">

  <Membership className="org.apache.catalina.cluster.mcast.McastService"
    mcastAddr="228.0.0.4"
    mcastPort="45564"
    mcastFrequency="500"
    mcastDropTime="3000"/>

  <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
    tcpListenAddress="auto"
    tcpListenPort="4001"
    tcpSelectorTimeout="100"
    tcpThreadCount="6"/>

  <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
    replicationMode="pooled"
    ackTimeout="15000"
    waitForAck="true"/>

  <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
    filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

  <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
    tempDir="/tmp/war-temp/"
    deployDir="/tmp/war-deploy/"
    watchDir="/tmp/war-listen/"
    watchEnabled="false"/>

  <ClusterListener className="org.apache.catalina.cluster.session.ClusterSessionListener"/>
</Cluster>

First, make sure that the mcastAddr and mcastPort of the <Membership> tag are the same on all nodes. This is validated when starting a second node by the following log in the first:

INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://10.138.97.43:4001,catalina,10.138.97.43,4001, alive=0]

This ensures that all nodes of a cluster are able to communicate with each other. From this point on, considering all other configuration is left by default, sessions are replicated in memory in all nodes of the cluster. Thus, you don’t need sticky sessions anymore. This is not enough for failover though, since removing a node from the cluster will still lead to a new request being assigned a new session ID, thus preventing access to your previous session data.

In order to also route session IDs, you need to specify two additional tags in <Cluster>:

<Valve className="org.apache.catalina.cluster.session.JvmRouteBinderValve" enabled="true" sessionIdAttribute="takeoverSessionid"/>
<ClusterListener className="org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener" />

The valve redirects requests to another node with the previous session ID. The cluster listener receives session ID cluster change events. Now, removing a cluster node is seamless (apart from latency for redirected requests) for clients whose session ID was routed to this node.

Last minute note: the previous setup uses the standard session manager. I was recently made aware of a third-party manager that also handles session cookies when a node fails, thus reducing the configuration hassle. That product is Memcached Session Manager, which is based on Memcached. Any feedback on the use of this product is welcome.

Third party session replication

The previous solution has the disadvantage of specific server configuration. Though it does not impact development, it needs to be done for every server type in a different manner. This could be a burden if you happen to have different server types in your enterprise.

Using third party products is a remedy to this. Terracotta is such a product: moreover, by providing a set number of Terracotta nodes, you avoid broadcasting your session changes to all server nodes, as in the previous Tomcat replication example.

In the following, the server is the Terracotta replication server and the clients are the Tomcat instances. In order to set up Terracotta, two steps are mandatory:

  • create the configuration for the server. In order to be used, name it tc-config.xml and put it in the bin directory. In our case, this is it:
    <tc:tc-config xmlns:tc="http://www.terracotta.org/config"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-5.xsd">
    ...
        <modules>
          <module name="tim-tomcat-5.5"/>
        </modules>
    ...
          <web-applications>
    	    <!-- The webapp(s) to watch -->
            <web-application>servlets-examples</web-application>
    
          </web-applications>
    ...
    </tc:tc-config>

    Note: the default installed module is for Tomcat 6. In case you need the Tomcat 5.5 module, you have to launch the Terracotta Integration module Management (TIM) and use it to download the correct module. For users behind an Internet proxy, this is made possible by updating tim-get.properties with the following lines:
    org.terracotta.modules.tool.proxyUrl = …
    org.terracotta.modules.tool.proxyAuth = …

  • add to all client launch scripts the following lines:
    TC_INSTALL_DIR=
    
    TC_CONFIG_PATH=:
    
    . ${TC_INSTALL_DIR}/bin/dso-env.sh -q
    
    export JAVA_OPTS="$JAVA_OPTS $TC_JAVA_OPTS"

Now we’re back to session replication and failover but the configuration is usable across different application servers.

In order to see what is stored, you can also launch the Terracotta Developer Console.

Conclusion

There are 3 basic strategies to manage session failover over cluster nodes. The first one is not to use sessions at all: it has major consequences on your development time, since your application has to handle storage directly. The second one is to check the server documentation to see how it is done with a specific server. This ties your session management to a single product. Last but not least, you can use a third-party product. This has the advantage of moving the configuration outside the scope of your specific server, thus letting you move with less hassle from one server to the next and still enjoy the benefits of session failover. Were I a system engineer, this is the solution I would recommend since it is the most flexible.

Categories: Development

Spring Persistence with Hibernate

March 1st, 2010 No comments

This review is about Spring Persistence with Hibernate by Ahmad Reza Seddighi from Packt Publishing.

Facts

  1. 15 chapters, 441 pages, 38€99
  2. This book is intended for beginners but more experienced developers can learn a thing or two
  3. This book covers Hibernate and Spring in relation to persistence

Pros

  1. The scope of this book is what makes it very interesting. Many books talk about Hibernate and many talk about Spring. Yet, I do not know of many which talk about the use of both in relation to persistence. Explaining Hibernate without describing the transactional side is pointless
  2. The book is well detailed, taking you by the hand from the bottom to reach a good level of knowledge on the subject
  3. It explains plain AOP, then Spring proxies before heading to the transactional stuff

Cons

  1. The book is about Hibernate but I would have liked to see a tighter integration with JPA. It is only described as another way to configure the mappings
  2. Nowadays, I think Hibernate XML configuration is becoming obsolete. The book views XML as the main way of configuration, annotations being secondary
  3. Some subjects are not documented: for some, that’s not too important (like Hibernate custom SQL operations), for others, that’s a real loss (like the @Transactional Spring annotation)

Conclusion

Despite some minor flaws, Spring Persistence with Hibernate lets you dive head first into the very complex subject of Hibernate. I think that Hibernate has a very low entry ticket, and you can be more productive with it very quickly. On the downside, mistakes will cost you much more than with plain old JDBC. This book serves you Hibernate and Spring concepts on a platter, so you will make fewer mistakes.

Categories: Book review

Hibernate hard facts – Part 5

March 1st, 2010 6 comments

In the fifth article of this series, I will show you how to manage logical DELETE in Hibernate.

Most of the time, requirements are not concerned about deletion management. In those cases, common sense and disk space plead for physical deletion of database records. This is done through the DELETE keyword in SQL. In turn, Hibernate uses it when calling the Session.delete() method on entities.

Sometimes, though, for audit or legal purposes, requirements enforce logical deletion. Let’s take a products catalog as an example. Products regularly go in and out of the catalog. New orders shouldn’t be placed on outdated products. Yet, you can’t physically remove product records from the database since they could have been used on previous orders.

Some strategies are available in order to implement this. Since I’m not a DBA, I know of only two. From the database side, both add a column, which represents the deletion status of the record:

  • either a boolean column which represents either active or deleted status
  • or, for more detailed information, a timestamp column which states when the record was deleted; a NULL meaning the record is not deleted and thus active

Managing logical deletion is a two-step process: you have to manage both selection, so that only active records are returned, and deletion, so that the status marker column is updated the right way.
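
Before bringing Hibernate in, both steps can be pictured with a plain in-memory sketch based on the timestamp variant. ProductRecord and ProductCatalog are hypothetical classes written for this illustration only; they are not the entity used later in this article.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record with a deletion timestamp: null means the record is active.
final class ProductRecord {

  final String name;
  Instant deletionDate;

  ProductRecord(String name) {
    this.name = name;
  }

  boolean isActive() {
    return deletionDate == null;
  }
}

// Hypothetical in-memory stand-in for the PRODUCT table.
final class ProductCatalog {

  private final List<ProductRecord> records = new ArrayList<>();

  void add(ProductRecord record) {
    records.add(record);
  }

  // Deletion step: mark the record instead of physically removing it.
  void delete(ProductRecord record) {
    record.deletionDate = Instant.now();
  }

  // Selection step: return active records only.
  List<ProductRecord> findActive() {
    List<ProductRecord> active = new ArrayList<>();
    for (ProductRecord record : records) {
      if (record.isActive()) {
        active.add(record);
      }
    }
    return active;
  }

  // Total number of stored records, logically deleted ones included.
  int size() {
    return records.size();
  }
}
```

The deleted record is still stored, so previous orders can keep referencing it, but it no longer shows up in the active selection.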

Selection

A naive use of Hibernate would map this column to a class attribute. Then, selecting active records would mean a WHERE clause on the column value and deleting a record would mean setting the attribute and calling the update() method. This approach has the merit of working. Yet, it fundamentally couples your code to your implementation. In case you migrate your status column from boolean to timestamp, you’ll have to update your code everywhere it is used.

The first thing you have to do to mitigate the effects of such a migration is to use a filter.

Hibernate3 provides an innovative new approach to handling data with “visibility” rules. A Hibernate filter is a global, named, parameterized filter that can be enabled or disabled for a particular Hibernate session.

Such filters can then be used throughout your code. Since the filtering criterion is thus coded in a single place, updating the database schema has only a small impact on your code. Back to the product example, this is done like so:

@Entity
@FilterDef(name = "activeProducts")
@Filter(name = "activeProducts", condition = "DELETION_DATE IS NULL")
public class Product {

  @Id
  @Column(nullable = false)
  @GeneratedValue(strategy = AUTO)
  private Integer id;

...
}

Note: in the attached source, I also map the DELETION_DATE on an attribute. This is not needed in most cases. In mine, however, it permits me to auto-create the schema with Hibernate.

Now, the following code will filter out logically deleted records:

session.enableFilter("activeProducts");

In order to remove the filter, either use Session.disableFilter() or use a new Session object (remember that factory.getCurrentSession() will probably use the same, so factory.openSession() is in order).

Deletion

The previous step let us factor out the “select active-only records” feature. Logically deleting a product is still coupled to your implementation. Hibernate lets us decouple further: you can overload any CRUD operation on entities! Thus, deletion can be overloaded to use an update of the right column. Just add the following snippet to your entity:

@SQLDelete(sql = "UPDATE PRODUCT SET DELETION_DATE=CURRENT_DATE WHERE ID=?")

Now, calling session.delete() on a Product entity will update the record and produce the following output in the log:

UPDATE PRODUCT SET DELETION_DATE=CURRENT_DATE WHERE ID=?

With CRUD overloading, you can even suppress the ability to select inactive records altogether. I wouldn’t recommend this approach however, since you wouldn’t be able to select inactive records then. IMHO, it’s better to stick to filters since they can be enabled/disabled when needed.

Conclusion

Hibernate lets you loosen the coupling between your code and the database so that you can migrate from physical deletion to logical deletion with very localized changes. In order to do this, Hibernate offers two complementary features: filters and CRUD overloading. These features should be part of any architect’s bag of tricks since they can be lifesavers, as in the cases above.

You can find the sources of this article here in Maven/Eclipse format.

Categories: Java