Java RMI: Serialization
Comparing Externalizable to Serializable
Of course, this efficiency comes at a price.
Serializable can frequently be implemented by doing two
things: declaring that a class implements the
Serializable interface and adding a zero-argument
constructor to the class. Furthermore, as an application evolves, the
serialization mechanism automatically adapts. Because the metadata is
automatically extracted from the class definitions, application programmers
often don't have to do anything except recompile the program.
On the other hand,
Externalizable isn't particularly easy to do, isn't very flexible, and requires you to
rewrite your marshalling and demarshalling code whenever you change your class
definitions. However, because it eliminates almost all the reflective calls
used by the serialization mechanism and gives you complete control over the
marshalling and demarshalling algorithms, it can result in dramatic
performance improvements.
To demonstrate this, I have defined the
EfficientMoney class. It has the same fields and
functionality as Money but implements
Externalizable instead of Serializable:
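import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class EfficientMoney extends ValueObject implements Externalizable {
    public static final long serialVersionUID = 1;
    protected int _cents;

    // Reading an Externalizable object back requires a public
    // zero-argument constructor; readExternal( ) then fills in the state.
    public EfficientMoney( ) {
        this(0);
    }

    public EfficientMoney(Integer cents) {
        this(cents.intValue( ));
    }

    public EfficientMoney(int cents) {
        super(cents + " cents.");
        _cents = cents;
    }

    // Read the single int that writeExternal( ) wrote and rebuild the
    // derived string locally rather than transmitting it.
    public void readExternal(ObjectInput in) throws IOException,
            ClassNotFoundException {
        _cents = in.readInt( );
        _stringifiedRepresentation = _cents + " cents.";
    }

    // Only the cents value is written to the stream.
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(_cents);
    }
}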
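As a rough check that a class written this way survives a round trip, the following self-contained sketch serializes an Externalizable object to a byte array and reads it back. The names ExternalizableRoundTrip, CentsHolder, and roundTrip are hypothetical and exist only for this illustration; ValueObject is not reproduced, so this is not the book's class. Reading an Externalizable object back requires a public zero-argument constructor, which the sketch includes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableRoundTrip {

    // Hypothetical stand-in for EfficientMoney, without a superclass.
    public static class CentsHolder implements Externalizable {
        private static final long serialVersionUID = 1L;
        private int cents;
        private String text;

        // Deserialization creates the instance through this constructor,
        // then calls readExternal() to fill in the state.
        public CentsHolder() {
            this(0);
        }

        public CentsHolder(int cents) {
            this.cents = cents;
            this.text = cents + " cents.";
        }

        public int getCents() { return cents; }
        public String getText() { return text; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(cents);            // only the int is written
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            cents = in.readInt();
            text = cents + " cents.";       // derived field is rebuilt
        }
    }

    // Serializes the object to memory and returns the deserialized copy.
    public static CentsHolder roundTrip(CentsHolder original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CentsHolder) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CentsHolder copy = roundTrip(new CentsHolder(1000));
        System.out.println(copy.getCents() + " -> " + copy.getText());
    }
}
```

If the zero-argument constructor is removed, writing still succeeds but reading the object back fails, which is easy to miss in a write-only test.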
We now want to compare Money with
EfficientMoney. We'll do so using the following
application:
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

public class MoneyWriter {
    public static void main(String[] args) {
        writeOne( );
        writeMany( );
    }

    // Serializes a single instance and reports the elapsed time.
    private static void writeOne( ) {
        try {
            System.out.println("Writing one instance");
            Money money = new Money(1000);
            writeObject("C:\\temp\\foo", money);
        }
        catch (Exception e) {
            e.printStackTrace( );   // report failures instead of hiding them
        }
    }

    // Serializes a list of 10,000 instances through one stream.
    private static void writeMany( ) {
        try {
            System.out.println("Writing many instances");
            ArrayList<Money> listOfMoney = new ArrayList<Money>( );
            for (int i = 0; i < 10000; i++) {
                listOfMoney.add(new Money(i * 100));
            }
            writeObject("C:\\temp\\foo2", listOfMoney);
        }
        catch (Exception e) {
            e.printStackTrace( );
        }
    }

    // Times writeObject( ), flush( ), and close( ); opening the file
    // is not part of the measurement.
    private static void writeObject(String filename, Object object)
            throws IOException {
        ObjectOutputStream objectOutputStream =
            new ObjectOutputStream(new FileOutputStream(filename));
        try {
            long startTime = System.currentTimeMillis( );
            objectOutputStream.writeObject(object);
            objectOutputStream.flush( );
            objectOutputStream.close( );
            System.out.println("Time: " + (System.currentTimeMillis( ) - startTime));
        }
        finally {
            objectOutputStream.close( );   // harmless if already closed
        }
    }
}
On my home machine, averaging over 10 trial runs for both
Money and EfficientMoney, I
get the results shown in Table 10-1. (We need to average because the
elapsed time varies with whatever else the computer is doing. The
size of the file is, of course, constant.)
Table 10-1: Testing Money and EfficientMoney

| Class | Number of instances | File size | Elapsed time |
|---|---|---|---|
| Money | 1 | 266 bytes | 60 milliseconds |
| Money | 10,000 | 309 KB | 995 milliseconds |
| EfficientMoney | 1 | 199 bytes | 50 milliseconds |
| EfficientMoney | 10,000 | 130 KB | 907 milliseconds |
These results are fairly impressive. By simply converting a leaf
class in our hierarchy to use externalization, I save 67 bytes and 10
milliseconds when serializing a single instance. In addition, as I pass larger
data sets over the wire, I save more and more bandwidth--on average, 18 bytes
per instance.
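For readers who want to reproduce the size comparison without touching the disk, here is a minimal sketch that serializes into a ByteArrayOutputStream and reports the byte counts. SerialCents and ExternCents are hypothetical stand-ins (not the book's Money and EfficientMoney), so the absolute numbers will differ from Table 10-1; the point is only the direction of the difference when a derived string is rebuilt on read instead of being written.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class SizeComparison {

    // Hypothetical: default serialization writes both fields and their metadata.
    public static class SerialCents implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int cents;
        private final String label;

        public SerialCents(int cents) {
            this.cents = cents;
            this.label = cents + " cents.";
        }
    }

    // Hypothetical: externalization writes only the int and rebuilds the label.
    public static class ExternCents implements Externalizable {
        private static final long serialVersionUID = 1L;
        private int cents;
        private String label;

        public ExternCents() { this(0); }

        public ExternCents(int cents) {
            this.cents = cents;
            this.label = cents + " cents.";
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(cents);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            cents = in.readInt();
            label = cents + " cents.";
        }
    }

    // Returns the number of bytes the serialized form of the object occupies.
    public static int serializedSize(Object object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("One SerialCents: " + serializedSize(new SerialCents(1000)));
        System.out.println("One ExternCents: " + serializedSize(new ExternCents(1000)));

        List<Object> serial = new ArrayList<>();
        List<Object> extern = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            serial.add(new SerialCents(i * 100));
            extern.add(new ExternCents(i * 100));
        }
        System.out.println("10,000 SerialCents: " + serializedSize(serial));
        System.out.println("10,000 ExternCents: " + serializedSize(extern));
    }
}
```

The two class names are deliberately the same length so that the class name itself does not skew the comparison. Elapsed time is left out because in-memory timings on a modern JVM need warm-up and repetition to mean anything.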
TIP: Which numbers should we pay attention
to? The single-instance costs or the 10,000-instance costs? For most
applications, the single-instance cost is the most important one. A typical
remote method call involves sending three or four arguments (usually of
different types) and getting back a single return value. Since RMI clears
the serialization mechanism between calls, a typical remote method call
looks a lot more like serializing 3 or 4 single instances than serializing
10,000 instances of the same class.
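The effect the tip describes can be approximated with a plain ObjectOutputStream. In the sketch below, ObjectOutputStream.reset() stands in for whatever RMI does between calls; that substitution is an assumption made for illustration, and ResetCost, Cents, and sizeOfThree are hypothetical names. After a reset the stream forgets which class descriptions it has already written, so each object pays the full single-instance cost again.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class ResetCost {

    // Hypothetical value class used only to have something to serialize.
    public static class Cents implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int cents;

        public Cents(int cents) {
            this.cents = cents;
        }
    }

    // Writes three instances to one stream and returns the total byte count.
    // With resetBetweenWrites set, the stream state is discarded after each
    // object, so the class description is written again for the next one.
    public static int sizeOfThree(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < 3; i++) {
                    out.writeObject(new Cents(i * 100));
                    if (resetBetweenWrites) {
                        out.reset();
                    }
                }
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Shared stream state: " + sizeOfThree(false) + " bytes");
        System.out.println("Reset after each:    " + sizeOfThree(true) + " bytes");
    }
}
```

The second figure is larger because the class description is repeated for every object, which is why the single-instance numbers are the better guide for typical remote calls.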
If I need more efficiency, I can go further and remove
ValueObject from the hierarchy entirely. The
ReallyEfficientMoney class directly extends
Object and implements Externalizable:
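import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class ReallyEfficientMoney implements Externalizable {
    public static final long serialVersionUID = 1;
    protected int _cents;
    protected String _stringifiedRepresentation;

    // Public zero-argument constructor, required when an Externalizable
    // instance is read back from a stream.
    public ReallyEfficientMoney( ) {
        this(0);
    }

    public ReallyEfficientMoney(Integer cents) {
        this(cents.intValue( ));
    }

    public ReallyEfficientMoney(int cents) {
        _cents = cents;
        _stringifiedRepresentation = _cents + " cents.";
    }

    // Rebuild the string locally instead of reading it from the stream.
    public void readExternal(ObjectInput in) throws IOException,
            ClassNotFoundException {
        _cents = in.readInt( );
        _stringifiedRepresentation = _cents + " cents.";
    }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(_cents);
    }
}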
ReallyEfficientMoney has much better
performance than either Money or
EfficientMoney when a single instance is serialized but
is almost identical to EfficientMoney for large
data sets. Again, averaging over 10 iterations, I record the numbers in Table
10-2.
Table 10-2: Testing ReallyEfficientMoney

| Class | Number of instances | File size | Elapsed time |
|---|---|---|---|
| ReallyEfficientMoney | 1 | 74 bytes | 20 milliseconds |
| ReallyEfficientMoney | 10,000 | 127 KB | 927 milliseconds |
Compared to Money, this is quite
impressive; I've shaved almost 200 bytes of bandwidth and saved 40
milliseconds for the typical remote method call. The downside is that I've had
to abandon my object hierarchy completely to do so; a significant percentage
of the savings resulted from not including
ValueObject in the inheritance chain. Removing
superclasses makes code harder to maintain and forces programmers to implement
the same method many times (ReallyEfficientMoney can't use
ValueObject's implementation of equals( ) and
hashCode( ) anymore). But it does lead to significant performance improvements.
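To make the maintenance cost concrete, here is a sketch of what a class without the shared superclass has to carry on its own. StandaloneMoney is a hypothetical name, and comparing on the cents value is an assumption; the article does not show how ValueObject implements equals( ) and hashCode( ), so this is one plausible replacement rather than a copy of it.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

// Hypothetical class that extends Object directly, so it must supply
// its own equals() and hashCode() instead of inheriting them.
public class StandaloneMoney implements Externalizable {
    private static final long serialVersionUID = 1L;
    private int cents;

    public StandaloneMoney() {
        this(0);
    }

    public StandaloneMoney(int cents) {
        this.cents = cents;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(cents);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        cents = in.readInt();
    }

    // Assumption: two amounts are equal when their cents values match.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof StandaloneMoney)) {
            return false;
        }
        return cents == ((StandaloneMoney) other).cents;
    }

    // Must agree with equals(): equal objects produce equal hash codes.
    @Override
    public int hashCode() {
        return Integer.hashCode(cents);
    }
}
```

Every class pulled out of the hierarchy needs its own version of these two methods, and they have to be kept in step with each other by hand.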
One Final Point
An important point is that you can decide whether to implement
Externalizable or
Serializable on a class-by-class basis. Within the same
application, some of your classes can be
Serializable, and some can be
Externalizable. This makes it easy to evolve your
application in response to actual performance data and shifting requirements.
The following two-part strategy often works well:
- Make all your classes implement Serializable.
- After that, make some of them (the ones you send often
and for which serialization is dramatically inefficient) implement
Externalizable instead.
This gets you most of the convenience of serialization and lets
you use Externalizable to optimize when
appropriate.
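The two interfaces can coexist inside a single object graph, which is what makes this strategy practical. The sketch below is a hypothetical illustration (Account, Balance, and roundTrip are invented names): a Serializable class holds a reference to an Externalizable one, and the whole graph is written and read with ordinary writeObject and readObject calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class MixedGraph {

    // Hypothetical frequently sent class, hand-optimized with Externalizable.
    public static class Balance implements Externalizable {
        private static final long serialVersionUID = 1L;
        private int cents;

        public Balance() { this(0); }
        public Balance(int cents) { this.cents = cents; }
        public int getCents() { return cents; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(cents);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            cents = in.readInt();
        }
    }

    // Hypothetical class left on default serialization for convenience.
    public static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;
        private final Balance balance;

        public Account(String owner, Balance balance) {
            this.owner = owner;
            this.balance = balance;
        }

        public String getOwner() { return owner; }
        public Balance getBalance() { return balance; }
    }

    // Writes the graph to memory and reads a copy back.
    public static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = roundTrip(new Account("alice", new Balance(2500)));
        System.out.println(copy.getOwner() + ": " + copy.getBalance().getCents());
    }
}
```

Callers of Account never see which mechanism Balance uses, so a class can be switched from one interface to the other without changing the code that sends it.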
Experience has shown that, over time, more and more objects will
gradually come to directly extend Object and
implement Externalizable. But that's fine. It
simply means that the code was incrementally improved in response to
performance problems when the application was deployed.