Pseudo Sessions for JSP, Servlets and HTTP
by Budi Kurniawan
03/01/2001
Session tracking is a method for maintaining the state of a series
of requests from the same user. However, it is not a perfect
solution for many situations, especially if your application needs to
scale. This article discusses pseudo sessions and how they can
overcome the drawbacks of session tracking. At the end of the
article you will find a project implementing the pseudo session
mechanism in a bean that you can use in any JavaServer Pages (JSP)
application.
HTTP is by design a stateless protocol. The implication is that Web
applications do not have information about previous HTTP requests by
the same user. One of the ways to maintain state is to use the session
tracking feature of the servlet or JSP container. The servlet API
specification defines a simple HttpSession interface that allows a
servlet container to use any number of approaches to track a user's
session without involving the developer in the nuances of any one
approach.
The HttpSession interface
In particular, the HttpSession interface provides
methods that store and return standard session properties, such as a
session identifier, and application data, which is stored as
name-value pairs. In short, the HttpSession interface provides a
seamless way to store an object in memory and then retrieve it when
the same user comes back later. The method for storing objects in a
session is setAttribute(String s, Object o), and the
method for retrieving stored objects from a session is
getAttribute(String s).
Also, in the HTTP protocol, there is no explicit termination signal
when a client is no longer active. Not having explicit termination and
not knowing whether or not a user will come back means there will be a
pile of HttpSession objects in memory as new users visit
your Web application. Fortunately, the servlet designers have devised a
mechanism that can be used to indicate when a client is no longer
active: a timeout period. If a user does not come back for a certain
period of time, the user's session expires and the
corresponding HttpSession object is removed from memory. The
default timeout period for sessions is defined by the servlet
container and can be obtained via the
getMaxInactiveInterval method. This timeout can be
changed by using the setMaxInactiveInterval method. The timeout
periods used by these methods are defined in seconds. If the timeout
period for a session is set to -1, the session will never
expire.
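The timeout rule above can be sketched in plain Java. This is only an illustration of the arithmetic, not the container's actual implementation; the class and method names (SessionTimeoutSketch, isExpired) are made up for this example.

```java
final class SessionTimeoutSketch {

    /**
     * Returns true when a session last accessed at lastAccessedMillis should be
     * treated as expired at nowMillis. The interval is given in seconds, as in
     * setMaxInactiveInterval; a negative interval means "never expire".
     */
    static boolean isExpired(long lastAccessedMillis, int maxInactiveSeconds, long nowMillis) {
        if (maxInactiveSeconds < 0) {
            return false; // negative interval: the session never times out
        }
        long idleMillis = nowMillis - lastAccessedMillis;
        return idleMillis > maxInactiveSeconds * 1000L;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long thirtyOneMinutesAgo = now - 31 * 60 * 1000L;
        // Idle for 31 minutes with a 30-minute timeout.
        System.out.println(isExpired(thirtyOneMinutesAgo, 30 * 60, now));
        // Same idle time, but the timeout is disabled with -1.
        System.out.println(isExpired(thirtyOneMinutesAgo, -1, now));
    }
}
```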
The getLastAccessedTime method allows a servlet to
determine the last time the session was accessed before the current
request. The session is considered accessed when a request that is
part of the session is handled by the servlet context.
To get a user's HttpSession object you can use the
getSession method of the HttpServletRequest
object. When you call the method with its create argument as true, the
servlet reference implementation creates a session if necessary. To
properly maintain the session, you must call getSession
before any output is written to the response.
The following code demonstrates the use of the
HttpSession interface to maintain a user's session. For
example, to store a String value "bulbul" associated with
userName and read it back later, you can use the following code in your JSP
page.
<%
// "session" is an implicit object in a JSP page, so it is not declared here.
session.setAttribute("userName", "bulbul");
String userName = (String) session.getAttribute("userName");
%>
One last thing that's worth mentioning about the
HttpSession interface is that a user's session can be
invalidated manually or, depending on where the servlet is running,
automatically. For example, the Java Web Server automatically
invalidates a session when there have been no page requests for some
period of time, 30 minutes by default. To invalidate a session means
to remove the HttpSession object and its values from the system.
Disadvantages of the Session Tracking Mechanism
Session tracking mechanisms come at a price:
- Session objects are stored in memory and consume significant
resources.
- Session tracking relies on cookies. Some users
turn off cookies for various reasons, especially for security
reasons.
- Session tracking uses session identifiers that are
created by the server. In situations where many Web servers and many
JVMs are used, session tracking simply does not work because
servers do not always understand each other's session
identifiers.
To understand the session tracking mechanism, you first must
understand how sessions work in a servlet/JSP container.
How Session Objects Work with Session Identifiers
By default, your JSP application participates in the session
tracking mechanism. Every time a new user requests a JSP page that
uses HttpSession objects, the JSP container sends back a
response plus a special number to the browser. This special number is
called a session identifier and is guaranteed to be unique to that
user. The HttpSession object resides in memory
waiting for its methods to be called again when the same user
returns.
On the client side, the browser keeps the session identifier and
sends it back to the server on the next request. This session
identifier tells the JSP container that this request is not a first
visit by the user and that an HttpSession object has been
created for this user. Instead of creating a new
HttpSession object, the JSP container then looks for an
HttpSession object with the same session identifier and
associates the request with that HttpSession object.
Session identifiers are transmitted between the server and the
browser as cookies. What if the browser doesn't accept cookies? All
subsequent requests to the server will not carry a session identifier.
As a result, the JSP container will think each one is a request from a new
user and will create another HttpSession object. The
previous HttpSession object remains in memory, but the
state information it holds for that user is lost.
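The lookup described above can be sketched with a plain map from session identifier to per-user data. This is a simplified stand-in for what a container does, not its real code; SessionRegistrySketch, resolve, and the use of UUID for identifiers are assumptions made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class SessionRegistrySketch {
    // Session identifier -> attribute map (a stand-in for an HttpSession object).
    private final Map<String, Map<String, Object>> sessions = new HashMap<>();

    /**
     * Resolves the identifier sent by the browser. A null or unknown identifier
     * is treated as a first visit, so a new session is created. Returns the
     * identifier the browser is expected to send back on its next request.
     */
    String resolve(String idFromBrowser) {
        if (idFromBrowser != null && sessions.containsKey(idFromBrowser)) {
            return idFromBrowser; // returning user: reuse the existing session
        }
        String newId = UUID.randomUUID().toString();
        sessions.put(newId, new HashMap<>());
        return newId;
    }

    /** Returns the attribute map for an identifier, or null if none exists. */
    Map<String, Object> attributes(String id) {
        return sessions.get(id);
    }

    /** Number of sessions currently held in memory. */
    int sessionCount() {
        return sessions.size();
    }
}
```

A browser that rejects cookies effectively calls resolve(null) on every request, so each request adds another entry while the earlier ones, and the state stored in them, become unreachable.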
Further, a session identifier can only be recognized by the JSP
container that issues it. If you have copies of your application
installed on more than one machine in a web farm, there must be a way
to guarantee that requests from the same user will be directed to the
server that first handled the user's request.
Pseudo Sessions
The solution to the problems posed by the cookie-based session
tracking mechanism is pseudo sessions, which have the following
properties:
- Objects or values are not stored in memory but stored as text
files. Each text file is associated with a particular user and the
name of the text file serves as the session identifier. Therefore, the
name must be guaranteed unique.
- The text files are stored in a special directory accessible by all
Web servers. Therefore, the pseudo session can be used in a web farm.
- Session identifiers are not sent as cookies. They are encoded into
URLs, which requires the rewriting of all hyperlinks, including the
ACTION attributes of HTML forms.
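Encoding the identifier into a URL can be sketched as below. This is one possible approach, not the article's bean: the parameter name sessionID and the helper name encode are assumptions, and the identifier is assumed to contain only letters and digits so that it needs no escaping.

```java
final class UrlRewriteSketch {
    // Hypothetical query parameter name; the article does not specify one.
    static final String PARAM = "sessionID";

    /**
     * Appends the session identifier to a URL as a query parameter.
     * Any fragment ("#...") is kept at the very end of the result.
     */
    static String encode(String url, String sessionId) {
        String fragment = "";
        int hash = url.indexOf('#');
        if (hash >= 0) {
            fragment = url.substring(hash);
            url = url.substring(0, hash);
        }
        // Use '&' if the URL already has a query string, '?' otherwise.
        char separator = url.indexOf('?') >= 0 ? '&' : '?';
        return url + separator + PARAM + "=" + sessionId + fragment;
    }
}
```

Every hyperlink and form ACTION value in a page would be passed through a helper like this before being written to the response.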
In addition, the implementation of pseudo sessions must take into
account the following points.
- It must not be application specific and must allow for easy
code reuse for other application developers who want to implement
it.
- For security reasons, there must be a way to generate random
numbers for session identifiers.
- There must be a timeout value to invalidate a session. The same
user who comes back after a certain period of time will get a new
session identifier. This will prevent an unauthorized user from using
someone else's session.
- There must be a mechanism to collect expired sessions by deleting
the corresponding text files.
- A user who comes back with an expired session identifier must not
be allowed to use the old session even though his or her session text
file has not been deleted.
- On the other hand, there must be a mechanism to update the session
text file's last-modified time so that the session always stays current
and valid if the user comes back within the session timeout
period.
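Several of the points above (random identifiers, timeout checks, refreshing the last-modified time, and collecting expired files) can be sketched together in plain Java. This is a minimal illustration under stated assumptions, not the code of the bean in Listing 1: all names here are hypothetical, and SecureRandom with a 32-character hexadecimal identifier is just one reasonable choice.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.security.SecureRandom;

final class PseudoSessionFilesSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Generates a 32-character hexadecimal identifier from 16 random bytes. */
    static String newSessionId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder sb = new StringBuilder(32);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    /** Creates the (initially empty) text file whose name is the session identifier. */
    static File create(File dir, String sessionId) {
        File f = new File(dir, sessionId);
        try {
            f.createNewFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return f;
    }

    /** A session is valid only if its file exists and was modified within the timeout. */
    static boolean isValid(File sessionFile, long timeOutMillis, long nowMillis) {
        return sessionFile.isFile()
                && nowMillis - sessionFile.lastModified() <= timeOutMillis;
    }

    /** Refreshes the last-modified time so that an active session stays current. */
    static boolean touch(File sessionFile, long nowMillis) {
        return sessionFile.setLastModified(nowMillis);
    }

    /** Deletes every expired session file in the directory; returns how many were removed. */
    static int deleteExpired(File dir, long timeOutMillis, long nowMillis) {
        File[] files = dir.listFiles();
        if (files == null) {
            return 0; // not a directory, or it could not be read
        }
        int deleted = 0;
        for (File f : files) {
            if (f.isFile() && !isValid(f, timeOutMillis, nowMillis) && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Because isValid checks the timestamp rather than mere existence, a user who returns with an expired identifier is rejected even if the cleanup has not yet deleted the file.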
The Project
The project described here is simply called PseudoSession and is a
very simple implementation of the pseudo session mechanism. For
portability, it is implemented in a JavaBean called
PseudoSessionBean. To use pseudo sessions in a JSP application, you
need to import this bean into your project and follow the
instructions below. The code for the bean is given in Listing 1.
The PseudoSessionBean has the following fields:
public String path;
public long timeOut;
path is the directory path where all session text
files are stored. This must be an area accessible to all web servers
if you are using more than one. The path, however, must not be visible
to users in order to prevent them from accessing the file
directly. One way to do this is to allocate a directory outside the
web root.
timeOut is the time that has to pass after the last
request of a user before the session is invalidated. In the code in
Listing 1, timeOut is set to 20 minutes (1,200,000
milliseconds), a somewhat reasonable value. Any user who comes back
after his or her session expires will get a new session
identifier.
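To make the role of the two fields concrete, here is a rough sketch of how name-value pairs might be kept in a text file under path. It is not the bean from Listing 1: the class name, the method signatures, and the use of java.util.Properties as the text format are all assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

final class PseudoSessionStoreSketch {
    final String path;                    // directory holding the session text files
    final long timeOut = 20 * 60 * 1000L; // 20 minutes = 1,200,000 milliseconds

    PseudoSessionStoreSketch(String path) {
        this.path = path;
    }

    /** Stores a name-value pair in the text file named after the session identifier. */
    void setValue(String sessionId, String name, String value) {
        Properties props = load(sessionId);
        props.setProperty(name, value);
        try (OutputStream out = new FileOutputStream(new File(path, sessionId))) {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the stored value, or null if the name or the session file is absent. */
    String getValue(String sessionId, String name) {
        return load(sessionId).getProperty(name);
    }

    /** Reads the session file if it exists; otherwise returns an empty set of pairs. */
    private Properties load(String sessionId) {
        Properties props = new Properties();
        File file = new File(path, sessionId);
        if (file.isFile()) {
            try (InputStream in = new FileInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }
}
```

Writing the file on every setValue call also updates its last-modified time, which is what a timeout check based on timeOut would compare against.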
The PseudoSessionBean has four methods: getSessionID,
setValue, getValue, and
deleteAllInvalidSessions. These four methods are
explained in detail in the following section.