Understanding servlet state

Managing server-side state is one of the most complicated and error-prone aspects of web application design, and one of the areas where Tapestry provides the most benefit. Generally speaking, Tapestry applications that are functional within a single server will be functional within a cluster with no additional effort. This does not mean that planning and testing for clustering are unnecessary; it means that, when using Tapestry, it is possible to narrow the design and testing focus.

The point of server-side state is to ensure that information about the user acquired during the session is available later in the same session. The canonical example is an application that requires some form of login to access some or all of its content; the identity of the user must be collected at some point (in a login page) and be generally available to other pages.

The other aspect of server-side state concerns failover. Failover is an aspect of highly-available computing where the processing of the application is spread across many servers. A group of servers used in this way is referred to as a cluster. Generally speaking (and this may vary significantly between vendors' implementations) requests from a particular client will be routed to the same server within the cluster.

In the event that the particular server in question fails (crashes unexpectedly, or is otherwise taken out of service), future requests from the client will be routed to a different, surviving server within the cluster. This failover event should occur in such a way that the client is unaware that anything exceptional has occurred with the web application, and this means that any server-side state gathered by the original server must be available to the backup server.

The main mechanism for handling this using the Java Servlet API is the HttpSession. The session can store attributes, much like a Map. Attributes are object values referenced with a string key. In the event of a failover, all such attributes are expected to be available on the new, backup server, to which the client's requests are routed.

Different application servers implement HttpSession replication and failover in different ways; the servlet API specification is deliberately non-specific on how this implementation should take place. Tapestry follows the conventions of the most limited interpretation of the servlet specification; it assumes that attribute replication only occurs when the HttpSession setAttribute() method is invoked [5].

Attribute replication was envisioned as a way to replicate simple, immutable objects such as String or Integer. Attempting to store mutable objects, such as List, Map or some user-defined class, can be problematic. For example, modifying an attribute value after it has been stored into the HttpSession may cause a failover error. Effectively, the backup server sees a snapshot of the object at the time that setAttribute() was invoked; any later change to the object's internal state is not replicated to the other servers in the cluster! This can result in strange and unpredictable behavior following a failover.
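The snapshot behavior described above can be sketched in plain Java. This is only an illustration under stated assumptions: ReplicatedSession is a hypothetical stand-in for a clustered HttpSession (it is not part of the servlet API or of Tapestry), and it uses Java serialization at setAttribute() time to imitate a container that replicates only when setAttribute() is invoked. Real application servers may replicate differently.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

class SessionSnapshotDemo {

    /**
     * Hypothetical stand-in for a clustered HttpSession (an assumption for
     * illustration only). The "backup server" receives a serialized copy of
     * an attribute only when setAttribute() is invoked.
     */
    static class ReplicatedSession {
        private final Map<String, Object> primary = new HashMap<>();
        private final Map<String, byte[]> backup = new HashMap<>();

        void setAttribute(String name, Serializable value) {
            primary.put(name, value);          // live reference on this server
            backup.put(name, serialize(value)); // snapshot sent to the backup
        }

        /** What the original server sees: the live object. */
        Object getAttribute(String name) {
            return primary.get(name);
        }

        /** What the backup server would see following a failover. */
        Object getAttributeAfterFailover(String name) {
            byte[] bytes = backup.get(name);
            return bytes == null ? null : deserialize(bytes);
        }
    }

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReplicatedSession session = new ReplicatedSession();

        ArrayList<String> cart = new ArrayList<>();
        cart.add("book");
        session.setAttribute("cart", cart);

        // Mutating the stored object does NOT trigger replication.
        cart.add("pen");
        System.out.println("primary: " + session.getAttribute("cart"));
        System.out.println("backup:  " + session.getAttributeAfterFailover("cart"));

        // Storing the attribute again refreshes the backup's snapshot.
        session.setAttribute("cart", cart);
        System.out.println("backup after re-set: "
            + session.getAttributeAfterFailover("cart"));
    }
}
```

Run as written, the sketch prints [book, pen] for the primary, [book] for the backup, and [book, pen] for the backup once setAttribute() has been invoked a second time. Under this replication model, a mutable attribute is safe only if it is stored again after every change, which is why immutable values are the simpler choice.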

Tapestry attempts to sort out the issues involving server-side state in such a way that they are invisible to the developer. Most applications will not need to explicitly access the HttpSession at all, but may still have significant amounts of server-side state. The following sections go into more detail about how Tapestry approaches these issues.



[5] This is the replication strategy employed by BEA's WebLogic server.