Using Castor JDO for SQL Mapping
by Jeff Lowery
10/02/2002
Castor is a multifaceted software tool being developed under the auspices of exolab.org, an informal organization involved in the development of open source, enterprise software projects based on Java and XML.
The primary function of Castor is to perform data binding. Data
binding is a process that facilitates the representation of one data model in
another. For example, an XML data model, described by an XML schema document,
can be approximately represented by Java classes. Castor helps by generating
these classes from the XML schema document. Object instances of these classes
are then able to store XML document data, so long as such documents
conform to the XML schema.
Such binding works both ways, of course: object instances of the generated
classes can be easily transformed back into XML documents through a process
known as marshalling. Castor's marshalling engine can introspect
the Java data objects and generate corresponding XML document elements. Such
marshalling can be refined through the use of user-defined mapping files,
which Castor also supports.
While XML data binding is very useful, this article will focus on another
aspect of Castor: data binding of Java objects to tables, columns, and rows
in a SQL database. This functionality falls under the heading of Castor
JDO. The intent of Sun's Java Data Objects (JDO) standard is to transparently persist Java
objects. Although similar in name and intent, Castor JDO was developed largely
independently of Sun's specification. The two technologies have different
feature sets, and should not be assumed to be interoperable.
Castor is able to employ a mapping file as a simple way to bind Java
objects to SQL database tables. Instead of writing complex procedural code in
Java to manage the database queries and updates via JDBC, Castor hides this
complexity by using the mapping file entries and transparently performing
the proper queries and updates in the background. This declarative mechanism
of describing how the objects and the database are linked makes maintenance
easier, because mapping files are relatively easy to understand and can be
changed without recompiling code. It's also much simpler in that it operates
at a higher level of abstraction than functionally equivalent JDBC mechanisms
allow.
Writing a Mapping File
Unlike Castor's XML marshalling -- where there is a default mapping of a
Java object to XML elements and attributes -- no default mapping exists for
binding Java objects to SQL database tables: you must use a mapping file to
enable this functionality. The mapping file contains explicit information on
how Castor should represent a set of Java objects in a relational
database.
Mapping files are written in XML, and can be validated against either a
DTD or an XML schema supplied by the Castor group. This type of validation
will catch syntax errors, but it takes some practice to understand the
concepts behind mapping files and how Castor JDO behaves. If
Castor JDO appears not to be storing data in the database, the likely cause is
a missing entry in an otherwise syntactically valid mapping file.
Before going into the details of mapping, let's first present an overview
of the mapping elements. For every Java class whose instances are to be
stored in the database, a class element is required in the
mapping file. These class elements reside one level below
the mapping root element:
<mapping>
  <description>Optional description of mapping file</description>
  <!-- you can include other mapping files -->
  <include href="other/mapping_file.xml"/>
  <class name="Class1">
    <description>Optional description of this class mapping</description>
    <map-to table="db_table1"/> <!-- table that stores Class1 instances -->
    <field name="field"> <!-- mapping of a class data member -->
      <sql> ... </sql> <!-- maps Java field to database column -->
    </field>
    <field> ... </field>
    ...
  </class>
  <class name="Class2">
    <map-to table="db_table2"/>
    ...
  </class>
</mapping>
For each class to be mapped to a database table (the table attribute of the
map-to child element indicates which table), the data members of
the class to be persisted (referred to as fields by Castor) will
have corresponding field elements. Each field element in turn has a sql child
element, which describes how that field is stored in the database.
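To make the skeleton more concrete, here is a minimal sketch of a complete mapping for a single class. The class myapp.Product, the table prod, and the column names are hypothetical and chosen only for illustration. The element and attribute names (identity, map-to, field, sql) follow Castor's mapping DTD as I understand it, so verify them against the DTD shipped with your Castor build.

```xml
<mapping>
  <!-- myapp.Product is a hypothetical class; identity names its key field -->
  <class name="myapp.Product" identity="id">
    <!-- hypothetical table that holds one row per Product instance -->
    <map-to table="prod"/>
    <!-- Java field "id" (an int) is stored in column "id" -->
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <!-- Java field "name" (a String) is stored in column "name" -->
    <field name="name" type="string">
      <sql name="name" type="char"/>
    </field>
  </class>
</mapping>
```

Any data member of the class that has no field element here is simply not persisted.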
Once a mapping has been established, Castor takes care of the queries and
updates that are needed to fetch and store data as Java objects. It also
handles conversions between SQL and Java datatypes seamlessly.
Overview
A mapping file is written from the point of view of the Java class and
describes how the properties of each object are to be stored in database
tables, columns, and rows. These class properties are referred to as fields
in the mapping file.
The general rules for mapping Java objects to database tables are:
- Each Java class maps to no more than one database table.
- Each Java object must have a unique identifier (which may be
autogenerated).
- A database column must be identified for every field that is to be
stored in the database.
- Fields have the option of being fetched or stored directly (if they are
public), or through their class' get/set methods.
- Classes that exist only as part of a larger composite class should be
indicated as dependent upon that composite class in the mapping file.
- Classes that extend another mapped class should be indicated as such in
the mapping file.
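Several of these rules can be sketched in one mapping file. The example below is an illustration only: the classes myapp.Product and myapp.Computer, their tables, and their columns are hypothetical. It assumes the direct, get-method, set-method, extends, and key-generator attributes behave as described in the Castor mapping documentation, and that the built-in MAX key generator is available in your build.

```xml
<mapping>
  <!-- identity names the unique identifier; MAX asks Castor to generate it -->
  <class name="myapp.Product" identity="id" key-generator="MAX">
    <map-to table="prod"/>
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <!-- a public data member, read and written directly -->
    <field name="name" type="string" direct="true">
      <sql name="name" type="char"/>
    </field>
    <!-- a private data member, reached through named accessor methods -->
    <field name="price" type="float"
           get-method="getPrice" set-method="setPrice">
      <sql name="price" type="numeric"/>
    </field>
  </class>

  <!-- a subclass is declared with extends and maps to its own table -->
  <class name="myapp.Computer" extends="myapp.Product" identity="id">
    <map-to table="computer"/>
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <field name="cpu" type="string">
      <sql name="cpu" type="char"/>
    </field>
  </class>
</mapping>
```

Each class still maps to exactly one table; the subclass repeats the identity field so that its row can be matched to the corresponding row of the parent table.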
Database Persistence
Castor uses the XML-format mapping file to determine:
- Which classes to persist in the database, and in what table.
- Which properties of the class to persist (fields), and in what columns.
Every data member of a primitive Java type (int, boolean,
double, etc.) can be mapped to a table column that is of a comparable SQL
type. Some nonprimitive types (classes), such as java.lang.String,
java.sql.Date, java.lang.Long, etc., can be mapped to a single table column, as
well. All other Java classes must have a class element definition in
the mapping file to be persisted in a database.
Remember, Castor will not introspect a class and apply a set of
default rules to guess the fields and attempt to persist them. This is
different behavior from that of the XML marshaller.
There may be cases where a class instance only exists as a member of a
containing parent object (in other words, it makes an appearance only as
a child of some other object). The mapping file conveniently allows such
dependent objects to be denoted as such. The advantages of recording
dependencies in the mapping file will be explained later.
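As a rough sketch of how such a dependency might be declared, the mapping below marks a hypothetical myapp.ProductDetail class as dependent on a hypothetical myapp.Product class. All class, table, and column names are invented for illustration; the depends, collection, and many-key attributes are used as I understand them from the Castor mapping documentation.

```xml
<mapping>
  <class name="myapp.Product" identity="id">
    <map-to table="prod"/>
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <!-- one Product owns many ProductDetail objects; many-key names the
         column in the detail table that points back to the owning Product -->
    <field name="details" type="myapp.ProductDetail" collection="vector">
      <sql many-key="prod_id"/>
    </field>
  </class>

  <!-- depends declares that a ProductDetail exists only inside a Product -->
  <class name="myapp.ProductDetail" identity="id" depends="myapp.Product">
    <map-to table="prod_detail"/>
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <!-- back-reference to the owning Product, stored as a foreign key -->
    <field name="product" type="myapp.Product">
      <sql name="prod_id"/>
    </field>
  </class>
</mapping>
```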
Fetching Java Objects From a SQL Database
When reconstructing an object from a SQL database, Castor uses the mapping
information to determine where the class' data members are stored in the
database.
It is possible to set up the mapping file so that an object doesn't have
to be pulled from the database in its entirety, but can be lazily fetched.
This means that certain members of the class instance will only be retrieved
when the appropriate get method is called. The result is that large
objects can be fetched incrementally -- avoiding the performance hit of
fetching a load of data from the database that might not be needed by the
program at that time.
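A lazily fetched member might be declared as in the sketch below. The class and column names are hypothetical, and the example assumes that the lazy attribute of the field element is supported for collection-valued fields in your Castor build (in builds of this era it may apply only to members declared as java.util.Collection).

```xml
<mapping>
  <class name="myapp.Product" identity="id">
    <map-to table="prod"/>
    <field name="id" type="integer">
      <sql name="id" type="integer"/>
    </field>
    <!-- the details are not read when the Product is loaded; they are
         fetched only when the collection is actually accessed -->
    <field name="details" type="myapp.ProductDetail"
           collection="collection" lazy="true">
      <sql many-key="prod_id"/>
    </field>
  </class>
  <!-- myapp.ProductDetail needs its own class element, omitted here -->
</mapping>
```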
Describing the Data Model
We're now ready to look at the mechanics of how a set of Java objects
is mapped to a SQL database. For this example, we assume the following
environment:
- JDK 1.4
- Postgres v7.2.2 database, running under a Cygwin Unix shell on Windows
2000
- Castor 0.9.3.19 or later
The main variable here is the database. Not all databases support the
same features, and some support similar features differently. I will point out cases where
the examples show a Postgres-specific way of doing things.