A Perl Hacker's Foray into .NET
by Simon Cozens
March 19, 2002
No, I haven't sold out; I haven't gone over to the dark side; I
haven't been bought. I'm one of the last people to be using
closed-source software by choice. But one of the traits of any
self-respecting hacker is curiosity, and so when he hears about
some cool new technology, he's almost obliged to check it out
and see whether there's anything he can learn from it. So this
particular Perl hacker took a look at Microsoft's .NET
Framework, and, well, Mikey, I think he likes it.
What Is .NET?
When something's as incredibly hyped as Microsoft's .NET
project, it's hard to convince people that there's a real
working technology underneath it. Unfortunately, Microsoft doesn't
do itself any favors by slapping the .NET moniker on
anything it can. So let's clarify what we're
talking about.
The .NET label is applied to anything touching the broad notion of "Web
services" -- from the Passport and Hailstorm automated
privacy-deprivation services and the Web-service-enabled versions
of operating systems and application products to the C#
language and the Common Language Runtime.
But there is an underlying theme, and it goes like this: The .NET
Framework is an environment based on
the Common Language Runtime and (to some extent) the C#
language, for creating portable Web services.
So for our exploration, the components of the .NET Framework
that we care about are the Common Language Runtime and the C#
language. And to nail it down beyond any doubt, these are things
that you can download and use today. They're real, they exist
and they work.
The .NET CLR
Let's begin with the CLR. The CLR is, in essence, a virtual
machine for C# much like the Java VM, but which is specifically
designed to allow a wide variety of languages other than C# to
run on it. Does this ring any bells with Perl programmers? Yes,
it's not entirely dissimilar to the idea of the Parrot VM, the
host VM for Perl 6 but designed to run other languages as well.
But that's more or less where the similarity ends. For starters,
while Parrot is chiefly intended to be run as an interpreted VM
but has a "bolted-on" JIT, CLR is expected to be JITted from the
get-go. Microsoft seems to want to avoid the accusations of
slowness leveled at Java by effectively requiring JIT compilation.
Another "surface" distinction between Parrot and CLR is that the
languages supported by the CLR are primarily statically typed
languages such as C#, J# (a variant of Java) and Visual Basic
.NET. The languages Parrot aims to support are primarily
dynamically typed, allowing run-time compilation, symbolic
variable access (try doing ${"Package::$var"} in
C#...), closures, and other relatively wacky operations.
To address these sorts of features, the Project 7 research
project was set up to provide .NET ports for a
variety of "academic" languages. Unfortunately, it transpires
that this has highlighted some limitations of the CLR, and so
almost all of the implementations have had to modify their
target languages slightly or drop difficult features. For
instance, the work on Mercury
turned up some deficiencies in CLR's Common Type System that
would also affect a Perl implementation. We'll discuss these
deficiencies later when we examine how Perl and the .NET
Framework can interact.
But on the other hand, let's not let this detract from what the
CLR is good at - it can run a variety of different languages
relatively efficiently, and it can share data between
languages. Let's now take a look at C#, the native language of
the CLR, and then see how we can run .NET executables on our
favourite free operating systems.
C#
C# is Microsoft's new language for the .NET Framework. It shares
some features with Java, and in fact looks extremely like Java
at first glance. Here's a piece of C# code:
using System;
class App {
    public static void Main(string[] args) {
        Console.WriteLine("Hello World");
        foreach (String s in args) {
            Console.WriteLine("Command-line argument: " + s);
        }
    }
}
Naturally, the Java-like features are quite obvious to anyone
who's seen much Java - everything's in a class, and there's an
explicitly defined Main function. But what's this? A Perl-like
foreach loop. And that using declaration seems strangely familiar.
Now, don't get me wrong. I'm not trying to claim that C# is some
bastard offspring of Perl and Java, or even that C# really has
that much in common with Perl; it doesn't. But it is a
well-designed language that does have a bunch of
"programmer-friendly" language features that traditionally made
"scripting" languages like Perl or Python faster for rapid code
prototyping.
Here's some more code, which forms part of a game-of-life
benchmarking tool we used to benchmark the CLR against Parrot.
static String generate(String input) {
    int cell, neighbours;
    int len = input.Length;
    String output = "";
    cell = 0;
    do {
        neighbours = 0;
        // Offsets of the eight neighbours on a 15-column field that wraps at the edges
        foreach (int offset in new Int32[] {-16, -15, -14, -1, 1, 14, 15, 16}) {
            int pos = (offset + len + cell) % len;
            if (input.Substring(pos, 1) == "*")
                neighbours++;
        }
        if (input.Substring(cell, 1) == "*") {
            // A live cell survives only with two or three neighbours
            output += (neighbours < 2 || neighbours > 3) ? " " : "*";
        } else {
            // A dead cell comes to life with exactly three neighbours
            output += (neighbours == 3) ? "*" : " ";
        }
    } while (++cell < len);
    return output;
}
This runs one generation of the game
of life, taking an input playing field and building an
output string. What's remarkable about this is that I wrote it
after a day of looking at C# code, with no prior exposure to
Java. C# is certainly easy to pick up.
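For anyone who wants to run it, here is a minimal self-contained sketch built around the same neighbour offsets. The 15-column width is inferred from those offsets rather than stated anywhere above, and the Life class, the Generate name, the use of StringBuilder and the "blinker" test pattern are my own additions for illustration, not part of the original benchmark.

```csharp
using System;
using System.Text;

class Life {
    // Assumed field width: the offsets below only make sense for 15 columns.
    const int Width = 15;

    // One generation on a field that wraps around at the edges.
    // Assumes the field holds at least 16 cells.
    public static String Generate(String input) {
        int len = input.Length;
        int[] offsets = { -16, -15, -14, -1, 1, 14, 15, 16 };
        StringBuilder output = new StringBuilder(len);
        for (int cell = 0; cell < len; cell++) {
            int neighbours = 0;
            foreach (int offset in offsets) {
                if (input[(offset + len + cell) % len] == '*')
                    neighbours++;
            }
            bool alive = input[cell] == '*';
            bool next = alive ? (neighbours == 2 || neighbours == 3)
                              : (neighbours == 3);
            output.Append(next ? '*' : ' ');
        }
        return output.ToString();
    }

    static void Main() {
        // Five empty rows with a horizontal three-cell "blinker" in the middle row.
        char[] cells = new String(' ', Width * 5).ToCharArray();
        cells[36] = cells[37] = cells[38] = '*';
        String field = new String(cells);

        for (int gen = 0; gen < 3; gen++) {
            Console.WriteLine("Generation " + gen + ":");
            for (int row = 0; row < field.Length; row += Width)
                Console.WriteLine("|" + field.Substring(row, Width) + "|");
            field = Generate(field);
        }
    }
}
```

The blinker should flip between horizontal and vertical on each generation, which makes it a handy sanity check before timing anything.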
What can Perl learn from C#? That's an interesting question,
especially as the Perl 6 design project is ongoing. Let's have
a quick look at some of the innovations in C# and how we might
apply them to Perl.
Strong Names
We'll start with an easy one, since Larry has already said that
something like this will be in Perl 6: To avoid
versioning clashes and interface incompatibilities, .NET has the
concept of "strong names." Assemblies -- the .NET equivalent of
Java's jar files -- have metadata containing their
name, version number, public key and a cryptographic signature over a hash of their contents,
meaning you can be sure you're always going to get the
definitions and behavior you'd expect from any third-party code
you run. More generally, assemblies support arbitrary metadata
that you can use to annotate their contents.
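To see what that identity looks like from inside a program, here is a small sketch using the reflection classes in System.Reflection. The StrongNameDemo class and its Describe helper are hypothetical names of my own; what gets printed depends entirely on the runtime and on whether the program itself has been signed, so I haven't shown any output.

```csharp
using System;
using System.Reflection;

class StrongNameDemo {
    // Formats the identity an assembly advertises in its metadata.
    public static String Describe(Assembly asm) {
        AssemblyName name = asm.GetName();
        byte[] token = name.GetPublicKeyToken();
        // An assembly that has not been signed has no public key token.
        String tokenText = (token == null || token.Length == 0)
            ? "(none: not strong-named)"
            : BitConverter.ToString(token);
        return name.Name + ", Version=" + name.Version
             + ", PublicKeyToken=" + tokenText;
    }

    static void Main() {
        // The runtime's own core library, which ships signed.
        Console.WriteLine(Describe(typeof(object).Assembly));
        // This program's assembly, which probably is not.
        Console.WriteLine(Describe(typeof(StrongNameDemo).Assembly));
    }
}
```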
This approach to versioning and metadata in Perl 6 was
highlighted in Larry's State of
the Onion talk this year, and is also the solution used by
JavaScript 2.0, as described by Waldemar Horwat at his LL1
presentation, so it seems to be the way the language world
is going.
Properties
C# supports properties, which are class fields with explicit
get/set methods. This is slightly akin to Perl's tying, but
much, much slicker. Here's an example:
private int MyInt;
public int SomeInt {
    get {
        Console.WriteLine("I was got.\n");
        return MyInt;
    }
    set {
        Console.WriteLine("I was set.\n");
        MyInt = value;
    }
}
Whenever we access SomeInt, the get
accessor is executed, and returns the value of the underlying
MyInt variable; when we write to it, the
corresponding set accessor is called. Here's one
suggested way we could do something similar in Perl 6:
my $myint;
our $SomeInt :get(sub{ print "I was got!\n"; $myint })
:set(sub{ print "I was set!\n"; $myint = $^a });
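Back on the C# side, a complete program makes it easier to see exactly when each accessor fires. This is only a sketch: the Tracked class and its Reads and Writes counters are hypothetical, standing in for the print statements in the C# fragment above so that the result is easy to check.

```csharp
using System;

class Tracked {
    private int myInt;
    // Hypothetical counters, used here instead of printing from the accessors.
    public int Reads, Writes;

    public int SomeInt {
        get { Reads++; return myInt; }
        set { Writes++; myInt = value; }
    }
}

class PropertyDemo {
    static void Main() {
        Tracked t = new Tracked();
        t.SomeInt = 41;                  // runs set
        t.SomeInt++;                     // runs get, then set
        Console.WriteLine(t.SomeInt);    // runs get; prints 42
        Console.WriteLine(t.Reads + " reads, " + t.Writes + " writes");
        // prints "2 reads, 2 writes"
    }
}
```

Note that the caller's code looks like plain field access throughout; only the class author knows there is code behind it.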
C# actually takes this idea slightly further, providing
"indexers", which are essentially tied arrays:
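private String realString;
// C# indexers are unnamed: they are declared as this[...] on the containing class
public String this[int idx] {
    get {
        return realString.Substring(idx, 1);
    }
    set {
        realString = realString.Substring(0, idx) + value + realString.Substring(idx + 1);
    }
}
// substrString is an instance of the class that declares the indexer
substrString[12] = "*"; // substr($string, 12, 1) = "*";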
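As a runnable sketch of the same idea: in C# an indexer has no name of its own and is declared with the this keyword, so it has to live inside a class. The SubstrString wrapper class, its constructor and the ToString override below are my own scaffolding for illustration.

```csharp
using System;

class SubstrString {
    private String realString;

    public SubstrString(String s) { realString = s; }

    // The indexer: reads and writes one character's worth of the string.
    public String this[int idx] {
        get { return realString.Substring(idx, 1); }
        set {
            realString = realString.Substring(0, idx) + value
                       + realString.Substring(idx + 1);
        }
    }

    public override String ToString() { return realString; }
}

class IndexerDemo {
    static void Main() {
        SubstrString s = new SubstrString("Hello, world");
        s[7] = "W";                  // substr($string, 7, 1) = "W";
        Console.WriteLine(s[7]);     // prints W
        Console.WriteLine(s);        // prints Hello, World
    }
}
```

As with Perl's lvalue substr, nothing forces the assigned string to be one character long; the setter simply splices in whatever it is given.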
Object-Value Duality
Within the CLR's Common Type System (CTS), there are two distinct
types (as it were) of types: reference types and value
types. Value types are the simple, honest-to-God values:
integers, floating point numbers, booleans, and so on. Reference
types, on the other hand, are objects, arrays, references and
the like; perhaps surprisingly, strings live on this side too.
Now for the twist: Each value type has an associated reference
type, and you can convert values between them. So, if you've got
an int counter;, then you can "box" it as an object like so:
Object CounterObj = counter. More specifically, int is simply C#'s
alias for System.Int32, so the boxed object is an Int32. This gives
us the flexibility of objects when we need to, for instance,
call methods on them, but the speed of fixed values when we're
doing tight loops on the stack.
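A short sketch of boxing and unboxing in practice; the BoxingDemo class and the variable names are mine. The point to notice is that boxing copies the value, so the boxed object and the original variable go their separate ways afterwards.

```csharp
using System;

class BoxingDemo {
    static void Main() {
        int counter = 42;

        // Box: the value is copied into a new object on the heap.
        Object counterObj = counter;
        Console.WriteLine(counterObj.GetType());     // prints System.Int32

        // Changing the original does not touch the boxed copy.
        counter++;

        // Unbox: an explicit cast copies the value back out.
        int unboxed = (int)counterObj;
        Console.WriteLine(counter + " " + unboxed);  // prints 43 42

        // Methods can also be called on the plain value directly.
        Console.WriteLine(counter.ToString().Length);  // prints 2
    }
}
```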
While Perl is and needs to remain an essentially untyped
language, optional explicit typing definitions combined with
object-value duality could massively up Perl's flexibility as
well as bringing some potential optimizations.