The mobile environment has a unique set of attributes not found anywhere
else. From intermittently-connected wireless networks to the ever-increasing
diversity of mobile devices, the mobile environment has introduced many new
challenges to building software applications. The impact of these challenges on
server-side software development requires a systematic resolution at the software
infrastructure level. Without such a resolution, application developers will be
forced to produce point solutions, custom code, and patches, creating convoluted
legacy architecture that will result in exorbitant maintenance costs.
J2EE is a server-side technology originally designed to meet the challenges
of Web-based application development. Although it was designed to be flexible
and extensible to meet the future requirements of new technology paradigms, no
augmentation of the J2EE standard has yet occurred to address the challenges of
developing mobile applications. The market pressure to get mobile applications
out to the enterprise workforce, combined with the lack of de facto architectural
leadership from the technology giants in the space, has made point solutions and
custom coding common practice. J2EE technology, however, can be extended to
support and standardize mobile application development.
In this article, I will explore the fundamental architecture of J2EE and
how to capitalize on its flexibility, extensibility, and openness by proposing a
new server-side J2EE component model to drastically simplify, standardize, and
enhance mobile application development.
Understanding Mobile Application Development Challenges
New mobile devices continue to enter the market rapidly, each with a different
configuration. Some devices, such as PDAs (personal
digital assistants) and Pocket PCs, have ample memory, large color displays, and
sound capabilities, while others have black-and-white displays and limited
memory. Some devices are shipped with browsers, while other devices focus on
client-side applications. Even devices with browsers that may support the same
markup language will often interpret the markup language quite differently.
Many mobile devices can support Java through a micro version of the Java
Virtual Machine called KVM (Kilo Virtual Machine). KVM runs Java
applications built for mobile devices -- the so-called J2ME (Java 2, Micro
Edition) applications -- which can offer a richer user interface and off-line
functionality during periods of intermittent wireless connectivity.
In any case, the common characteristic of these applications is their server-side
interaction. They all rely on server-side logic to supply them with the
appropriate data and communicate via Internet standard protocols such as HTTP. This shifts a large chunk of the coordination
burden to the server side; the server-side application logic must be designed to
support each of these devices while fully utilizing all of the device's
features. In the case of browser-based applications (e.g., WML, cHTML), the
content needs to be formatted properly for the mobile device display and
browser. Even in the case of J2ME applications, a significant amount of
server-side intelligence is required. For example, the proper J2ME application
must be provided -- i.e., the right application must be chosen for the
requesting wireless device, and the version must be checked before delivery.
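The selection and version check described above can be sketched as a small server-side catalog. This is only an illustration under stated assumptions: the class `MidletCatalog`, the device model strings, and the integer version scheme are hypothetical and do not come from any J2ME or J2EE API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical catalog mapping a device model to the J2ME build offered to it. */
final class MidletCatalog {
    /** One deliverable build; the fields are illustrative assumptions. */
    static final class Build {
        final String jarName;
        final int version;

        Build(String jarName, int version) {
            this.jarName = jarName;
            this.version = version;
        }
    }

    private final Map<String, Build> latestByDevice = new HashMap<>();

    /** Records the newest build available for a device model. */
    void register(String deviceModel, String jarName, int version) {
        latestByDevice.put(deviceModel, new Build(jarName, version));
    }

    /**
     * Returns the build to deliver, or empty when the device is unknown
     * or already holds the latest version.
     */
    Optional<Build> buildToDeliver(String deviceModel, int installedVersion) {
        Build latest = latestByDevice.get(deviceModel);
        if (latest == null || installedVersion >= latest.version) {
            return Optional.empty();
        }
        return Optional.of(latest);
    }
}
```

A request handler would call `buildToDeliver` with the model and installed version reported by the device, and send the application only when a build is returned.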
The wireless network also implicitly affects the development of the
applications. Limited bandwidth, high latency, and intermittent connectivity
often plague wireless networks. Design efforts to combat network challenges,
such as high latency, may involve delivering the maximum amount of content
supported by each device, reducing unnecessary requests (or chattiness) over the
high-latency wireless link. This so-called "forward-caching design" relies on a
forward-caching software infrastructure resource to intelligently detect the
memory of the device making the request, estimate the compiled size of the content,
paginate the content, deliver it, and transparently maintain the links among the
pages. In the case of J2ME, even greater sophistication is required, since the
software must provide code as well as data. The cache manager
ultimately decides what parts (classes) can be delivered to the mobile device.
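As a rough sketch of the pagination step in such a forward-caching design, the code below packs content items into pages that fit a device's memory budget. The class name, the byte budget, and the fixed per-item markup overhead are assumptions made for illustration; a real infrastructure would estimate the compiled size far more carefully and would also store the pages and maintain the links among them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical paginator: packs items into pages that fit a device's byte budget. */
final class ForwardCachePaginator {
    /**
     * Splits items into pages whose estimated size stays within deviceBudgetBytes.
     * The size estimate is the UTF-8 length of each item plus a fixed markup overhead.
     */
    static List<List<String>> paginate(List<String> items,
                                       int deviceBudgetBytes,
                                       int perItemOverheadBytes) {
        List<List<String>> pages = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int used = 0;
        for (String item : items) {
            int cost = item.getBytes(StandardCharsets.UTF_8).length + perItemOverheadBytes;
            // Start a new page when the next item would exceed the budget.
            if (!current.isEmpty() && used + cost > deviceBudgetBytes) {
                pages.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            // An item larger than the budget still gets a page of its own.
            current.add(item);
            used += cost;
        }
        if (!current.isEmpty()) {
            pages.add(current);
        }
        return pages;
    }
}
```

A device with a large budget receives everything in one page, while a constrained device receives several smaller ones from the same content.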
The mobile environment also creates a new application paradigm that may
involve multi-modal device access, where a user can swap devices and access
methods such as voice and data in the middle of an application. The application
may require non-HTTP messaging models for notifications, location-based
services, and potentially even a new security model. This new application
paradigm affects server-side development by requiring multi-modal session
managers and access to non-HTTP messaging channels.
Mobile Application Development with J2EE -- First-Generation Approaches
J2EE defines a standard for developing multi-tier enterprise applications by
not only basing them on standardized, modular components, but also providing
services (e.g., secure transactions) to those components. In other words, J2EE
gives the developer a framework and a set of services that simplify the
development of Web-based applications. The framework and the associated services
are implemented and packaged as products called J2EE application servers. The
developer extends the application services (e.g., Servlets, JavaServer Pages,
and Enterprise JavaBeans) and writes to a set of APIs that solve a significant
portion of the underlying software challenge for Web-based development. The
J2EE application server allows the
developer to focus on application development while providing a clean separation
between the application server and the application code. In the application
code, the developer is encouraged to keep the presentation layer code as
separate as possible from business logic to reduce the maintenance cost.
Some of the basic building blocks of J2EE are:
Servlets
JavaServer Pages (JSP)
Enterprise JavaBeans (EJB)
Java Message Service (JMS)
Java Database Connectivity (JDBC)
Java Naming and Directory Interface (JNDI)
JavaMail
Extensible Markup Language (XML)
Figure 1: Standard J2EE Architecture
Each of these building blocks comes in a container that offers APIs, which
manage a set of services for the building block. J2EE developers use the
container and application server APIs such as JDBC and JNDI to build their applications. Since J2EE was specifically designed for Web application development, most of
the containers are specific to Web applications. Since out-of-the-box containers
are insufficient for mobile application development, the developer must write
extensive custom code to deliver on the mobile application requirements.
Typically, a sound mobile solution must consist of the following elements:
Future-proof presentation management. The application must interface with
many heterogeneous wireless networks and protocols, and control the
content presentation and formatting for multiple mobile devices, both those
available today and those yet to be released.
Seamless session management. The application must seamlessly initiate, successfully manage, and terminate user sessions while assuming no cookie
support on end-user devices and a sporadically-connected wireless
network.
Robust messaging and notification. The application must communicate with
many different unified messaging resources typically found around a mobile
user (e.g., fax, SMS (Short Message Service), pagers, and voice), using
robust messaging interfaces that enable push/pull functionality in a
synchronous or asynchronous manner.
Open APIs. The application must access existing data access and business
logic layers (i.e., middle tier), back-end data sources (e.g., databases,
ISV business objects), and other software and application server services
through uniform interfaces.
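To make the session management requirement above more concrete, here is a minimal sketch of cookie-less session tracking in which the session ID travels in the URL and the idle timeout is chosen generously enough to survive dropped connections. The class `UrlSessionManager` and the `sid` parameter name are assumptions for illustration. In a servlet container, `HttpServletResponse.encodeURL` performs a comparable rewrite.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cookie-less session tracker keyed by an ID carried in the URL. */
final class UrlSessionManager {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long idleTimeoutMillis;

    UrlSessionManager(long idleTimeoutMillis) {
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    /** Starts a session and returns its ID. */
    String open(long nowMillis) {
        String id = UUID.randomUUID().toString();
        lastSeenMillis.put(id, nowMillis);
        return id;
    }

    /** Refreshes the session and returns true if it exists and has not idled out. */
    boolean touch(String id, long nowMillis) {
        Long last = lastSeenMillis.get(id);
        if (last == null) {
            return false;
        }
        if (nowMillis - last > idleTimeoutMillis) {
            lastSeenMillis.remove(id);   // expired: terminate the session
            return false;
        }
        lastSeenMillis.put(id, nowMillis);
        return true;
    }

    /** Terminates the session explicitly. */
    void close(String id) {
        lastSeenMillis.remove(id);
    }

    /** Appends the session ID to a URL as the (assumed) "sid" query parameter. */
    static String rewrite(String url, String id) {
        return url + (url.indexOf('?') >= 0 ? "&" : "?") + "sid=" + id;
    }
}
```

Every link written into a page would pass through `rewrite`, so a device that reconnects after a gap can resume its session as long as the timeout has not elapsed.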
Meeting each of these requirements demands a significant amount of
customization work. The J2EE architecture's presentation layer, for example, is
primarily HTML-based, which limits the target clients mostly to Web
browsers. Mobile devices and wireless networks, on the other hand, are much more
diverse. Since different devices have different capabilities and may utilize
different markup languages (HTML, WML, HDML, cHTML, VoiceXML), a developer may
choose to build many different presentation layers for the different devices, or
build one presentation layer and apply a transformation for each device. This
problem is commonly called the "Mobile Presentation Challenge." In either case,
device-specific information (i.e., a device profile or device model) is required
in order to choose a presentation layer or a transformation. Developers
typically follow one of these two paths in trying to solve the Mobile
Presentation Challenge.
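A device profile lookup of the kind both paths depend on might look like the following sketch. The registry, the profile fields, and the idea of matching a token inside the User-Agent header are assumptions chosen for illustration, not a standard J2EE facility.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical device-profile registry; field names and tokens are illustrative. */
final class DeviceProfileRegistry {
    static final class Profile {
        final String markup;      // e.g. "wml", "chtml", "html"
        final int maxPageBytes;   // largest page the device is assumed to hold

        Profile(String markup, int maxPageBytes) {
            this.markup = markup;
            this.maxPageBytes = maxPageBytes;
        }
    }

    // Insertion order is kept so that earlier registrations win on overlap.
    private final Map<String, Profile> byUserAgentToken = new LinkedHashMap<>();
    private final Profile fallback;

    DeviceProfileRegistry(Profile fallback) {
        this.fallback = fallback;
    }

    void register(String userAgentToken, Profile profile) {
        byUserAgentToken.put(userAgentToken, profile);
    }

    /** Returns the first profile whose token occurs in the User-Agent value, else the fallback. */
    Profile resolve(String userAgent) {
        if (userAgent != null) {
            for (Map.Entry<String, Profile> e : byUserAgentToken.entrySet()) {
                if (userAgent.contains(e.getKey())) {
                    return e.getValue();
                }
            }
        }
        return fallback;
    }
}
```

The resolved profile then drives the choice of presentation layer or transformation for the request.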
The Basic Content Transformation Problem
The first route often chosen to overcome the Mobile Presentation Challenge is
called "screen scraping" or transcoding; this requires repurposing and reformatting
HTML presentation content. Transcoding involves taking in the dynamic HTML
pages, removing the tags (i.e., scraping the page), storing the data in an
intermediate (meta) format, and retagging the content in a different markup
language. This can be achieved with a combination of servlets and JSPs. It is
not scalable, though, because of the processing time that scraping and
retagging each dynamic page requires. In addition, the tightly interwoven
design of the presentation and application logic increases the ongoing maintenance
cost sharply as the number of supported devices rises.
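The scrape-and-retag cycle can be illustrated with a deliberately naive sketch. It assumes the interesting content sits in HTML paragraph elements and that the target is a single WML card; the class name and the regular expressions are mine, and a production transcoder would need a real HTML parser and proper escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Naive illustration of transcoding: scrape HTML paragraphs, then retag them as WML. */
final class NaiveTranscoder {
    private static final Pattern PARAGRAPH = Pattern.compile(
            "<p(?:\\s[^>]*)?>(.*?)</p>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    /** Step 1: scrape paragraph text out of the HTML into a neutral list. */
    static List<String> scrape(String html) {
        List<String> paragraphs = new ArrayList<>();
        Matcher m = PARAGRAPH.matcher(html);
        while (m.find()) {
            // Drop any inline tags left inside the paragraph.
            String text = TAG.matcher(m.group(1)).replaceAll("").trim();
            if (!text.isEmpty()) {
                paragraphs.add(text);
            }
        }
        return paragraphs;
    }

    /** Step 2: retag the neutral content as a single WML card. */
    static String toWml(String cardId, List<String> paragraphs) {
        StringBuilder wml = new StringBuilder("<wml><card id=\"").append(cardId).append("\">");
        for (String p : paragraphs) {
            wml.append("<p>").append(p).append("</p>");
        }
        return wml.append("</card></wml>").toString();
    }
}
```

Even at this size, the sketch shows why the approach is brittle: any change to the HTML layout silently changes what the scraper extracts.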
The second route involves first defining an XSL (Extensible Stylesheet Language)
library, which is used to transform a single XML presentation layer into a
markup language suitable for each client device.
Although this approach has better separation of application logic from
presentation logic, the XSL presentation logic becomes so complex that it soon
requires its own application logic, written in XSL command sequences. This basic
approach of building style sheets for each device becomes costly to maintain and
upgrade as new devices become available, since the embedded conditional
presentation logic needs reworking each time.
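A minimal sketch of such a style sheet library, using the standard `javax.xml.transform` classes, is shown below. The class `StylesheetLibrary` and the convention of keying style sheets by markup name are assumptions for illustration; checked exceptions are wrapped only to keep the example short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical XSL library: one compiled style sheet per target markup language. */
final class StylesheetLibrary {
    private final Map<String, Templates> byMarkup = new HashMap<>();
    private final TransformerFactory factory = TransformerFactory.newInstance();

    /** Compiles and stores the style sheet used for the given markup (e.g. "wml"). */
    void register(String markup, String xsl) {
        try {
            byMarkup.put(markup, factory.newTemplates(new StreamSource(new StringReader(xsl))));
        } catch (TransformerException e) {
            throw new IllegalStateException("Invalid style sheet for " + markup, e);
        }
    }

    /** Transforms the single XML presentation into the markup for one device class. */
    String render(String markup, String xml) {
        Templates templates = byMarkup.get(markup);
        if (templates == null) {
            throw new IllegalArgumentException("No style sheet for " + markup);
        }
        StringWriter out = new StringWriter();
        try {
            templates.newTransformer()
                     .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed for " + markup, e);
        }
        return out.toString();
    }
}
```

Each new device family means another entry in the library, which is exactly where the maintenance burden described above accumulates.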
A more effective approach might be to model each device by specifying its
features (e.g., display size, supported markup language, etc.), develop a set
of style guides for each device feature, and dynamically assemble the style
sheets based on the features of the device making the request. Although you
would be authoring a significant amount of presentation conversion logic, you
may, in this way, be able to avoid some of the maintenance time of updating a
style sheet library as new devices become available. XSL, while well-positioned
to process text and convert among XML trees, is not a structured programming
language. Thus, XSL is not the ideal choice for creating the complex
presentation conversion logic required to dynamically assemble style sheets.
If this route is chosen, mixing and matching different programming languages
may be required, presenting the familiar burden of ever-increasing complexity
and maintenance costs.
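One way to picture the feature-based approach is to keep the assembly logic in Java and only the template fragments in XSL. The sketch below is an assumption-laden illustration: the feature keys such as "markup=wml" and the class `StylesheetAssembler` are hypothetical, and each fragment is expected to be a complete `xsl:template` element.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical assembler: builds a style sheet from per-feature XSL template fragments. */
final class StylesheetAssembler {
    private static final String HEADER =
            "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">";
    private static final String FOOTER = "</xsl:stylesheet>";

    private final Map<String, String> fragmentByFeature = new LinkedHashMap<>();

    /** Registers the template fragment (a complete xsl:template) for one device feature. */
    void addFragment(String feature, String templateXsl) {
        fragmentByFeature.put(feature, templateXsl);
    }

    /** Concatenates the fragments matching the device's features into one style sheet. */
    String assemble(Collection<String> deviceFeatures) {
        StringBuilder xsl = new StringBuilder(HEADER);
        for (String feature : deviceFeatures) {
            String fragment = fragmentByFeature.get(feature);
            if (fragment != null) {   // features without a fragment are simply ignored
                xsl.append(fragment);
            }
        }
        return xsl.append(FOOTER).toString();
    }
}
```

Supporting a new device then means describing its features rather than writing a new style sheet, although the fragments themselves still have to be authored and kept consistent.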
After the first step of defining the XSL library, the developer builds a
servlet and within it constructs a string buffer of the XML or a DOM (Document
Object Model) for the presentation. The developer must then pass the string
buffer or DOM and the appropriate style sheet (which may or may not be
dynamically assembled based on device features) to an XSLT (XSL Transformations)
processor, which then writes the device-specific markup into the servlet response.
This development is typically intricate enough that in most implementations
there is a significant amount of integration and coding that must occur within
the servlet. For example, you would be required to identify the appropriate
device, identify the relevant features, and assemble the style sheet based on
these features. A more elegant approach simply uses JSP to lay down the XML and
automatically call the right transform, so that you focus more on building
the application and less on building and tweaking the presentation look and
feel.
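The servlet-side flow can be sketched in plain Java with the standard JAXP classes. The servlet API itself is left out so the example stays self-contained: the `Writer` parameter stands in for the writer of the servlet response, and the `page` and `title` element names are hypothetical.

```java
import java.io.StringReader;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical pipeline: build the presentation DOM, then transform it for one device. */
final class PresentationPipeline {
    /**
     * Builds a device-neutral DOM, applies the device's style sheet,
     * and writes the resulting markup to the response writer.
     */
    static void render(String title, String deviceXsl, Writer response) {
        try {
            // Step 1: construct the presentation as a DOM.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element page = doc.createElementNS(null, "page");
            Element titleElement = doc.createElementNS(null, "title");
            titleElement.setTextContent(title);
            page.appendChild(titleElement);
            doc.appendChild(page);

            // Step 2: hand the DOM and the chosen style sheet to the XSLT processor.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(deviceXsl)));

            // Step 3: the device-specific markup goes straight into the response.
            transformer.transform(new DOMSource(doc), new StreamResult(response));
        } catch (Exception e) {
            throw new IllegalStateException("Presentation transform failed", e);
        }
    }
}
```

In a real servlet, the style sheet passed in would be the one selected or assembled for the requesting device, which is where most of the integration code ends up.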
Although challenging on its own, as seen in the two route choices above,
basic mobile presentation is only the beginning of a long, circuitous journey as
far as mobile application challenges are concerned. Unfortunately, neither
approach has the capability to solve the more perplexing problems that one may
face ahead.
Advanced Mobile Presentation -- Content Pagination and
Caching Challenges
Neither of the basic content transformation methods described above
addresses the memory issues of the device or the latency issue of the wireless
link. Consider one scenario: If the content is dynamic, applying a straight
transformation of the XML into a target device's markup language will often result
in a document that is too large for the device to properly display.
One patch is to "page" the data at the data source (i.e., the data access
layer) based on the lowest-common-denominator device to be supported, which will
result in small pages. Unfortunately, this process will lead to burdensome
delays for devices without memory or display constraints, since they will be
making repeated requests for the small pages over high-latency wireless links,
introducing unnecessary chattiness into applications.
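The cost of that chattiness is easy to estimate. The helper below is a back-of-the-envelope sketch: it counts round trips and multiplies by a per-request latency, ignoring transfer time. The class name and every number used with it are illustrative assumptions, not measurements.

```java
/** Back-of-the-envelope estimate of request chattiness over a high-latency link. */
final class ChattinessEstimate {
    /** Number of requests needed to fetch all items at a given page size. */
    static int roundTrips(int totalItems, int itemsPerPage) {
        return (totalItems + itemsPerPage - 1) / itemsPerPage;   // ceiling division
    }

    /** Total time spent waiting on latency alone, ignoring transfer time. */
    static long totalLatencyMillis(int totalItems, int itemsPerPage, long roundTripMillis) {
        return (long) roundTrips(totalItems, itemsPerPage) * roundTripMillis;
    }
}
```

With a hypothetical 120-item result and a 2-second round trip, pages of 5 items cost 24 requests and 48 seconds of latency, whereas a device that could accept 40 items per page would need only 3 requests and 6 seconds.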
Another option inserts logic at the presentation layer based on the memory
attribute of the device and "pages" the resulting XML at this appropriate layer.
This option requires introducing yet another layer of complex application logic
at the presentation layer, violating a basic principle of sound J2EE design and
increasing the maintenance cost. In either case, the paginated content on the
server would have to be stored and appropriate links between the pages would
have to be created. This problem is even more complex than meets the eye,
because some pages may have been served to the mobile devices, and the rest will
be stored on the server.
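The bookkeeping involved can be sketched as a small server-side page store that keeps the paginated content per session and computes the links between pages on demand. The class `PageStore`, the `page` query parameter, and the in-memory map are assumptions for illustration; expiry and clustering concerns are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side store for paginated content and the links between pages. */
final class PageStore {
    /** One page plus its navigation links; a null link means there is no such neighbor. */
    static final class Page {
        final String content;
        final String prevUrl;
        final String nextUrl;

        Page(String content, String prevUrl, String nextUrl) {
            this.content = content;
            this.prevUrl = prevUrl;
            this.nextUrl = nextUrl;
        }
    }

    private final Map<String, List<String>> pagesBySession = new ConcurrentHashMap<>();

    /** Keeps all pages for a session, including those not yet served to the device. */
    void store(String sessionId, List<String> pages) {
        pagesBySession.put(sessionId, new ArrayList<>(pages));
    }

    /** Returns the requested page with its links, or null if it does not exist. */
    Page fetch(String sessionId, int index, String baseUrl) {
        List<String> pages = pagesBySession.get(sessionId);
        if (pages == null || index < 0 || index >= pages.size()) {
            return null;
        }
        String prev = index > 0 ? baseUrl + "?page=" + (index - 1) : null;
        String next = index + 1 < pages.size() ? baseUrl + "?page=" + (index + 1) : null;
        return new Page(pages.get(index), prev, next);
    }

    /** Releases the stored pages when the session ends. */
    void discard(String sessionId) {
        pagesBySession.remove(sessionId);
    }
}
```

The presentation layer would render `prevUrl` and `nextUrl` as links in whatever markup the device expects.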
Note that complexity again increases rather than decreases if the developer
wants to add functionality to the patch that
paginates the content at the application and/or presentation layer. Let's assume
the developer obtains a device profile for the device making the request to
identify the optimal size of the content pages. The developer can use this
information to write custom presentation logic to paginate the content and
reduce the probability of memory overflow. From the device profile, the
developer may also choose to further customize the XML based on other device
features and attributes to improve the user experience beyond optimal content
pages. That is, the developer may personalize the application flow and content
based on how the device accesses the application, inserting ad hoc application
logic into the application and presentation layers in addition to implementing a
proprietary methodology to handle different device attributes.
The maintenance cost of this custom coding approach can be startlingly
high. Developing a custom transformation engine that dynamically assembles
a library based on device features makes adding devices easier, but in itself
can be an extensive development project that still faces scalability and
reliability issues. The picture is further complicated in the common scenario
in which the developer needs to override the automatic transformation for a
specific set of pages and devices. For example, customizing a branding page's
size, shape, and logo color for a certain set of devices is a common
requirement, and the transformation engine should allow such overrides and
give the developer full control over the look and feel.
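An override mechanism of this kind can be as simple as a lookup that is consulted before the automatic choice. In the sketch below, the class `TransformOverrides`, the page and device identifiers, and the use of style sheet names as plain strings are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical override table consulted before the automatic transformation is used. */
final class TransformOverrides {
    private final Map<String, String> styleSheetByPageAndDevice = new HashMap<>();

    private static String key(String pageId, String deviceModel) {
        return pageId + "|" + deviceModel;
    }

    /** Pins a hand-written style sheet to one page on one device model. */
    void override(String pageId, String deviceModel, String styleSheetName) {
        styleSheetByPageAndDevice.put(key(pageId, deviceModel), styleSheetName);
    }

    /** Returns the pinned style sheet if one exists, otherwise the automatic choice. */
    String styleSheetFor(String pageId, String deviceModel, String automaticStyleSheet) {
        return styleSheetByPageAndDevice.getOrDefault(key(pageId, deviceModel), automaticStyleSheet);
    }
}
```

A branding page could then be pinned to a hand-tuned style sheet for a few devices while every other page continues to use the automatically assembled one.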