Logic Programming with Perl and Prolog
by Robert Pratte
Now that I have loaded my Prolog database, I need to feed it some more information. I need to take my data, in Dot format, and translate it into something that my Prolog interpreter will understand. There are some modules out there that may be helpful, such as DFA::Simple, but since I can assume that my data will look a certain way--having written it from my other application--I will build my own simple parser. First, I am going to take a look at the data.
The visualization program created the diagram in Figure 1 from the code:
digraph family_tree {
{ jill [ color = pink ]
rob [ color = blue ] } -> { ann [ color = pink ]
joe [ color = blue ] } ;
{ sue [ color = pink ]
dan [ color = blue ] } -> { sara [ color = pink ]
mike [ color = blue ] } ;
{ nan [ color = pink ]
tom [ color = blue ] } -> sue ;
{ nan
jim [ color = blue ] } -> rob ;
{ kate [ color = pink ]
steve [ color = blue ] } -> dan ;
{ lucy [ color = pink ]
chris [ color = blue ] } -> jill ;
}

Figure 1. A family tree from the sample data
There are a few peculiarities worth mentioning here. First, it may seem that the all-lower-case names are a bit strange, but I am already preparing for the convention that data in Prolog is typically lower-case. Also, I inserted an extra space before the semicolons in an effort to make matching them easier. While both of these conventions are easy to code around, they seem to raise extra questions when illustrating a point. Therefore, assume that the above Dot snippet illustrates the range of possible formats in the example. While "real-world examples" may provide a richer set of possibilities, the fact that applications with defined behavior generated this data will limit the edge cases.
Returning to the data, it will be easiest to parse the Dot data using a simple state machine. Previously, I had defined some constants to represent states:
use constant { modInit => 0,
modTag => 1,
modValue => 2 };
Basically, I assume that anything on the left-hand side of the -> is a parent and anything on the right is a child. Additionally, modifiers (in this case only color) begin with a left square-bracket, and males have the blue modifier, whereas females are pink. I know that I have completed a parent-child relationship "block" when I hit the semicolon. Beyond these stipulations, any token that I cannot safely ignore must be a noun, that is, a person's name.
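As a quick illustration of the word-at-a-time view the parser takes, here is a minimal sketch of what split produces for one block. The $block string is copied from the sample data above; the variable names are mine and are not part of the parser.

```perl
use strict;
use warnings;

## One parent-child block from the sample data (illustrative input)
my $block = "{ nan [ color = pink ]\ntom [ color = blue ] } -> sue ;";

## Whitespace is the only delimiter, which is why the sample data
## keeps a space before the semicolon
my @tokens = split( /\s+/, $block );

## Show the tokens in order, separated by a visible marker
print join( ' | ', @tokens ), "\n";
```

Every brace, bracket, arrow, and semicolon arrives as its own token, so the state machine only has to decide what each word means in its current state.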
sub parse_dotFile {
    ##----------------------------------------
    ## Examine data a word at a time
    ##----------------------------------------
    my @dotData = split( /\s+/, shift() );
    my ( $familyBlock, $personName, @prologQry ) = ();
    my $personModPosition = modInit;
    my $relationship = 'parent';
    ## Start at 3 to skip the header tokens: digraph, family_tree, {
    for ( my $idx = 3; $idx < @dotData; $idx++ ) {
        chomp( $dotData[$idx] );
        SWITCH: {
            ## ignore
            if ( $dotData[ $idx ] =~ /[{}=\]]/ ) {
                last SWITCH; }
            ## begin adding attributes
            if ( $dotData[ $idx ] eq '[' ) {
                $personModPosition = modTag;
                last SWITCH; }
            ## switch from parents to children
            if ( $dotData[ $idx ] eq '->' ) {
                $relationship = 'child';
                last SWITCH; }
            ## end of this block
            if ( $dotData[ $idx ] =~ /\;/ ) {
                ##-----------------------------------------
                ## Generate is_parent rules for Prolog
                ##-----------------------------------------
                foreach my $parentInBlock ( @{ $familyBlock->{ parent } } ) {
                    foreach my $childInBlock ( @{ $familyBlock->{ child } } ) {
                        push( @prologQry,
                            "is_parent(${parentInBlock}, ${childInBlock})" );
                    }
                }
                $familyBlock = ();
                $relationship = 'parent';
                last SWITCH; }
            ## I have a noun, need to set something
            else {
                ## I have a modifier tag, next is the value
                if ( $personModPosition == modTag ) {
                    $personModPosition = modValue;
                    last SWITCH;
                } elsif ( $personModPosition == modValue ) {
                    ##--------------------------------------
                    ## Set modifier value and reset
                    ## We currently assume it is color
                    ##--------------------------------------
                    if ( $dotData[ $idx ] eq 'blue' ) {
                        push( @prologQry, "is_male(${personName})" );
                    } else {
                        push( @prologQry, "is_female(${personName})" );
                    }
                    $personModPosition = modInit;
                    $personName = ();
                    last SWITCH;
                } else {
                    ##--------------------------------------
                    ## Grab the name and id as parent or child
                    ##--------------------------------------
                    $personName = $dotData[ $idx ];
                    push( @{ $familyBlock->{ $relationship } }, $personName );
                }
            }
        }
    }
    return( \@prologQry );
}
Rather than simply pushing my new rules into the Prolog interpreter directly, I return a reference to an array that contains the full ruleset. I am doing this so that I can easily dump it to a file for troubleshooting purposes. I can simply write the rules to a file, and consult this file in a Prolog shell.
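Dumping the ruleset for troubleshooting might look something like the sketch below. The $rules array stands in for the value returned by parse_dotFile, and the file name family_rules.pl is a hypothetical choice of mine, not something the application requires.

```perl
use strict;
use warnings;

## Stand-in for the array reference returned by parse_dotFile()
my $rules = [
    'is_female(nan)',
    'is_male(tom)',
    'is_parent(nan, sue)',
    'is_parent(tom, sue)',
];

## Hypothetical file name; any path the Prolog shell can read will do
my $rulesFile = 'family_rules.pl';

open( my $out, '>', $rulesFile ) or die "Cannot write $rulesFile: $!\n";

## A fact in a Prolog source file is a term followed by a period.
## Sorting keeps the clauses for each predicate next to each other.
print {$out} "$_.\n" for sort @$rules;

close( $out ) or die "Cannot close $rulesFile: $!\n";
```

In many Prolog shells the resulting file can then be loaded with something like consult('family_rules.pl'). Grouping the clauses by predicate is a precaution, since some Prolog systems warn when the clauses of one predicate are scattered through a file.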
With a subroutine to parse my Dot file into Prolog rules, I can now push those rules into the interpreter:
##-------------------------------------------
## Read in Dot file containing relations
## and feed it into the Prolog instance
##-------------------------------------------
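open( my $dotFh, '<', 'family_tree.dot' )
    or die "Cannot open family_tree.dot: $!\n";
## Slurp the file into one string; parse_dotFile() reads only its first argument
my $dotText = do { local $/; <$dotFh> };
close( $dotFh );
my $parsedDigraph = parse_dotFile( $dotText );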
foreach ( @$parsedDigraph ) {
$prologDB->do("assert($_).");
}
Now I can easily query my Prolog database using the query method in AI::Prolog:
##-------------------------------------------
## Run the query
##-------------------------------------------
$prologDB->query( "is_cousin(joe, sara)." );
while (my $results = $prologDB->results) { print "@$results\n"; }
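When troubleshooting, it can help to know in advance what answer to expect from the query. The sketch below is an independent cross-check in plain Perl, not a replacement for the Prolog rules. It assumes that two people count as cousins when a parent of one and a different parent of the other share at least one parent; the is_cousin rule loaded into the Prolog database may be defined differently. The function name looks_like_cousins and the data layout are mine.

```perl
use strict;
use warnings;

## is_parent facts expected from the sample data, as [ parent, child ]
my @isParent = (
    [ 'jill', 'ann' ],  [ 'jill', 'joe' ],  [ 'rob', 'ann' ],  [ 'rob', 'joe' ],
    [ 'sue',  'sara' ], [ 'sue',  'mike' ], [ 'dan', 'sara' ], [ 'dan', 'mike' ],
    [ 'nan',  'sue' ],  [ 'tom',  'sue' ],
    [ 'nan',  'rob' ],  [ 'jim',  'rob' ],
    [ 'kate', 'dan' ],  [ 'steve', 'dan' ],
    [ 'lucy', 'jill' ], [ 'chris', 'jill' ],
);

## Map each child to the set of its parents
my %parentsOf;
$parentsOf{ $_->[1] }{ $_->[0] } = 1 for @isParent;

## Assumed definition (see the text above): a parent of $x and a
## different parent of $y have at least one parent in common
sub looks_like_cousins {
    my ( $x, $y ) = @_;
    foreach my $px ( keys %{ $parentsOf{$x} || {} } ) {
        foreach my $py ( keys %{ $parentsOf{$y} || {} } ) {
            next if $px eq $py;    ## a shared parent means siblings
            foreach my $gp ( keys %{ $parentsOf{$px} || {} } ) {
                return 1 if ( $parentsOf{$py} || {} )->{$gp};
            }
        }
    }
    return 0;
}

print "joe/sara: ", looks_like_cousins( 'joe', 'sara' ), "\n";
```

Under that assumption joe and sara come out as cousins, because rob and sue both have nan as a parent, so the Prolog query should be expected to succeed if its rule follows the same idea.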
What Next?
Even though this is a trivial example, I think that it provides an idea of the powerful ways in which Perl can be supplemented with Prolog. Just within the context of evaluating genealogical data (a mainstay of Prolog tutorials and examples), it seems that a Perl/Prolog application that uses genealogical data from open-source genealogical software or websites would be a killer application. The possibilities seem endless: rules based upon Google Maps, mining information from online auctions or news services, or even harvesting information for that new test harness are all tremendous opportunities for the marriage of Perl and Prolog.