Lisp and Java
by Dan Milstein
03/24/2004
Why learn a new programming language? Among other excellent reasons
(such as good, old-fashioned intellectual curiosity), there's the
opportunity to pick up useful techniques, tricks, and idioms that you can
apply in your day-to-day programming life. At its best, studying a new
language can give you the kind of conceptual shift that casts thorny
problems in a new light. Even if your mainstream language of choice doesn't
provide the special-purpose syntax that you find in a language you're
exploring, you can often find a way to implement the underlying technique
in a useful manner.
In this article, we're going to steal an idea from one of the most
theft-worthy languages out there: Lisp. We're going to pick out one of its
most useful features -- the ability to treat functions as data -- and talk
about how to apply this feature, in a slightly different form, in Java. In the
course of doing so, we'll give a very (very) brief introduction to reading
Lisp code. We'll also develop a small but useful library for JDBC and
collections programming that you are welcome to use, abuse, and extend as
you see fit. We're going to use the Scheme dialect of
Lisp for our discussion, because it expresses the ideas we're interested in
in a particularly clear and elegant way.
Lisp Code and How To Read It
(this (is what)
      (lisp code
            (looks)
            (like (more (or less)))))
A casual observer of Lisp code will notice the dazzling collection of
parentheses, without a heck of a lot else to visually break up the code.
Where in a language with C-descended syntax you get parentheses, curly
braces, commas, colons, semicolons, and a whole horde of other syntactical
signposts to keep you on your way, in Lisp, you get pretty much nothing but
( and its friend ). All other tokens are
separated by white space, and indentation, though extremely important
culturally, has no significance as part of the language itself.
For example, in Lisp, if you want to, say, call a function that doubles a
number, you write:
(double 7)
Side note: when discussing Lisp, it's common to show the result of evaluating an expression as follows:
(double 7)
=> 14
Which can be read as "(double 7) evaluates to 14."
In Java, there are basically two ways you can call a function: as an
instance method or as a class method. For the former, you'd have something
like:
aNumberSeven.double()
For the latter:
aClass.double(7)
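As a runnable sketch of these two call forms: because double is a reserved word in Java, the example below uses the name doubled instead. The classes NumberBox, MathUtil, and CallFormsDemo are hypothetical names made up for illustration, not part of any library.

```java
// Hypothetical wrapper class, used only to illustrate an instance method call.
class NumberBox {
    private final int value;

    NumberBox(int value) {
        this.value = value;
    }

    // Instance method: the receiver (this) is the subject of the verb.
    int doubled() {
        return value + value;
    }
}

// Hypothetical utility class, used only to illustrate a class (static) method call.
class MathUtil {
    static int doubled(int x) {
        return x + x;
    }
}

class CallFormsDemo {
    public static void main(String[] args) {
        NumberBox aNumberSeven = new NumberBox(7);
        System.out.println(aNumberSeven.doubled()); // instance method call
        System.out.println(MathUtil.doubled(7));    // class method call
    }
}
```

Both calls print 14.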
For both of these cases in Java, you learn to see the word just before
the opening parenthesis as "hot" -- it's the verb of the phrase you're reading.
One nice thing about Java's instance method-call syntax is that it makes it
very clear what the subject of that verb is (by matching the English
language's subject-verb-object order). In Lisp, the word immediately
after the opening parenthesis is what you learn to read as the verb (in
most cases), and there is no distinguished subject of that verb. The same
pattern is used even for functions that are infix operators in other languages,
such as plus and minus:
(+ 5 2)
=> 7
Or assignment (which has an undefined value, so we won't show it
evaluating to anything):
(define a 11)
which is more or less equivalent to:
int a = 11;
Note that the Lisp version doesn't declare a type for the variable a
-- in Lisp, a variable can hold a value of any type.
The intense regularity of the syntax can be forbidding. Parenthetical
digression: the payoff comes through the ability to represent Lisp programs
as Lisp data, which means that Lisp programmers can easily write programs
to manipulate other programs. Although that is indeed a trick worth
stealing or even spending a lifetime studying, it would take more of a book
than an article to explore it fully. For such a book, check out Paul
Graham's ANSI Common Lisp or On Lisp (available as a PDF on his web site),
both of which contain many inspiring uses of the mighty Lisp macro. End
parenthetical digression.
Moving on through the rudiments of the language: the basic data structure
in Lisp is the list, which can be created in a variety of ways, one of the
simplest of which is via the list function.
(define lst (list 2 3 5 7 11))
lst => (2 3 5 7 11)
Treating Functions as Data In Lisp
Okay, now that we've got basic Lisp syntax and Lisp data structures
under our belts, we're ready to talk about the idea we're going to steal:
in Lisp, functions are first-class objects, meaning they have all
the "rights and privileges" accorded to other objects in Lisp. You can
name them with variables, pass them into functions, return them from
functions, and store them in data structures. They are no different from
"basic" data. This facility turns out to be enormously powerful.
As a simple example, you can define a function that takes another
function as an argument. Such a super function is known as a
higher-order function. If this is a new concept, the key thing to
understand is that you can, in Lisp, refer to a function without
calling it, much as you can, in Java, refer to a variable without
immediately using its value. In fact, in the Scheme dialect, the syntax
for function and variable reference is identical. To show this, we'll
introduce our first big higher-order function, map. The
map function takes another function and a list as arguments and creates a new
list by applying the function to each element of the original list, like
so:
(define lst (list 2 3 5 7 11))
(map double lst)
=> (4 6 10 14 22)
In Java syntax, that'd look something like:
int[] lst = { 2, 3, 5, 7, 11 };
int double(int x) { return x + x; }
int[] dbls = map(double, lst);
Which looks great, but is, unfortunately, nonsense. In Java, you can't
pick up a method and pass it into another method. There is no simple way
to refer to the method at runtime -- you can only invoke it. You
can't easily pass it to another function, return it from a function, or
store it in a data structure. Through complex reflection tricks, these
things can be done, but doing so is not for the faint of heart. In Lisp, you can
do all of these things quite simply, and, in Lisp culture, you do them all
the time.
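To give a sense of what those reflection tricks involve, here is a minimal sketch that looks up a static method by name with java.lang.reflect.Method and applies it to each element of an array. The class ReflectiveMap and its methods twice and map are hypothetical names chosen for illustration (double itself is a reserved word in Java), and the sketch assumes a Java version with varargs.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;

class ReflectiveMap {
    // The function we would like to pass around.
    public static int twice(int x) {
        return x + x;
    }

    // Applies the reflected static method f to every element of xs.
    static int[] map(Method f, int[] xs)
            throws IllegalAccessException, InvocationTargetException {
        int[] result = new int[xs.length];
        for (int i = 0; i < xs.length; i++) {
            // null receiver because f is static; the argument and result travel as Integer objects.
            result[i] = ((Integer) f.invoke(null, Integer.valueOf(xs[i]))).intValue();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Refer to the method without calling it: look it up by name and parameter types.
        Method twice = ReflectiveMap.class.getMethod("twice", int.class);
        int[] dbls = map(twice, new int[] { 2, 3, 5, 7, 11 });
        System.out.println(Arrays.toString(dbls)); // prints [4, 6, 10, 14, 22]
    }
}
```

The method is found by a string name, so a misspelling only surfaces at run time, and every call goes through boxing and checked exceptions. That overhead is a large part of why this route is rarely taken in everyday code.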
Why First-Class Functions Are Useful
One place where first-class functions come in particularly handy is in
dealing with collections. It's very common to have a collection of stuff
and to want to produce another collection of stuff. You'd like to ignore
the nitty-gritty of the looping code and just keep your mind on the
collection as a whole. A higher-order function such as map lets
you do just that.
An example: if you've done any work with aggregate data in Java, you've
gotten used to seeing your ideas disappear under a haze of iterator code.
In particular, one type of aggregate data that is tough to handle cleanly
is the JDBC ResultSet. In my day-to-day life, I write a lot of
Java code that pulls data out of a database and then displays it on a web
page. For all of you JDBC fans, the following type of code probably looks
all too familiar:
String query = "SELECT first_name, last_name, user_id " +
               "FROM users";
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery(query);
ArrayList users = new ArrayList();
while (rs.next()) {
    String fname = rs.getString("first_name");
    String lname = rs.getString("last_name");
    String uid = rs.getString("user_id");
    users.add(new User(fname, lname, uid));
}
To a Lisp programmer, it would be natural to break the above chunk of
code into two parts: something that generates a list of rows, and a function
that is then mapped over that list. How can we do something similar in Java?
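One possible shape for an answer, sketched here under assumptions rather than taken from any existing library: represent the function as an object implementing a one-method interface, and write a helper that contains the row loop once and for all. The names RowFunction, Rows.map, and Rows.TO_USER are hypothetical, the User class is a minimal stand-in for the one used in the loop above, and the sketch assumes a Java version with generics.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical function object: turns the current row of a ResultSet into a value.
interface RowFunction<T> {
    T apply(ResultSet rs) throws SQLException;
}

// Hypothetical stand-in for the User class used in the JDBC loop.
class User {
    final String firstName;
    final String lastName;
    final String userId;

    User(String firstName, String lastName, String userId) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.userId = userId;
    }
}

class Rows {
    // The iteration lives here, once; callers supply only the per-row function.
    static <T> List<T> map(ResultSet rs, RowFunction<T> f) throws SQLException {
        List<T> results = new ArrayList<T>();
        while (rs.next()) {
            results.add(f.apply(rs));
        }
        return results;
    }

    // A per-row function, written as an anonymous class and stored in a field like any other data.
    static final RowFunction<User> TO_USER = new RowFunction<User>() {
        public User apply(ResultSet rs) throws SQLException {
            return new User(rs.getString("first_name"),
                            rs.getString("last_name"),
                            rs.getString("user_id"));
        }
    };
}
```

With something like this in place, the while loop in the JDBC snippet could shrink to a single call such as Rows.map(rs, Rows.TO_USER), and the same helper could be reused with a different RowFunction for every other query.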