Managing Rich Data Structures

by Dave Baker

A Persistent Hash of Hashes

Now I had a hash of hashes, but could I store it into a database? How could I use that database later to pull a particular banner's data from the stored hash? Simple text files won't work for a hash data structure, the way they can work for scalars (one value per file) or lists (one value per line, or multiple values per line separated by a pipe character or other unusual delimiter). By avoiding text files in favor of another kind of data storage, I hoped to avoid having to "open, slurp, close" three times for each banner or have my script perform directory listings to see whether a particular day's newsletter had a banner.

My solution came largely from Recipe 14.6 of Perl Cookbook, 2nd Edition. That recipe's example was not as specific as I needed, though, so I wrote this article to share the details I had to fill in myself. (I later learned that Recipe 11.14 provides most of the missing details.)

Here's how to store the hash of hashes using a DBM file:

#!/usr/local/bin/perl
use strict;
use warnings;
use MLDBM qw( DB_File Storable );   # DB_File holds the data; Storable serializes references
use Fcntl;                          # supplies O_CREAT and O_RDWR

my $db = '/www/cgi-bin/databases/ad_data.db';
my %data_for_ad_on;

# Bind the hash to the DBM file, creating the file if it doesn't exist yet.
tie %data_for_ad_on, 'MLDBM', $db, O_CREAT|O_RDWR, 0644
    or die "Trouble opening $db, stopped: $!";

[here, paste the multi-line %data_for_ad_on statement listed earlier]

And that's it! When the script completes, it will have assigned the value of %data_for_ad_on and then saved it to a new DBM file named /www/cgi-bin/databases/ad_data.db (or any file you like, if your script has permission to write to the specified directory).

The secret is in the tie function. It associates a particular hash (%data_for_ad_on) with a class and a file. The class that works for complex data--data that contains references--is the MLDBM module available from the CPAN.
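To make the round trip concrete, here is a minimal sketch that ties a hash to a DBM file, stores one nested record, unties it, and then ties the same file again to pull that banner's data back out. The temporary directory, the date key, and the example.com values are placeholders chosen for illustration, not data from my real database; it assumes MLDBM and DB_File are installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl;                          # O_CREAT, O_RDWR, O_RDONLY
use File::Temp qw( tempdir );
use MLDBM qw( DB_File Storable );

# Hypothetical scratch location so the sketch doesn't touch real data.
my $dir = tempdir( CLEANUP => 1 );
my $db  = "$dir/ad_data.db";

# First pass: tie, store one nested record (placeholder values), untie.
tie my %data_for_ad_on, 'MLDBM', $db, O_CREAT|O_RDWR, 0644
    or die "Trouble opening $db, stopped: $!";
$data_for_ad_on{'2005_12_09'} = {
    url      => 'http://www.example.com/',
    gif_URL  => 'http://www.example.com/banner.gif',
    headline => 'Example headline',
};
untie %data_for_ad_on;

# Second pass: tie read-only and pull one banner's data back out.
tie my %stored, 'MLDBM', $db, O_RDONLY, 0644
    or die "Trouble opening $db, stopped: $!";
my $banner = $stored{'2005_12_09'};
print "Headline: $banner->{headline}\n";
untie %stored;
```

The second tie uses O_RDONLY because a script that only displays a banner has no reason to ask for write access.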

The Fcntl module supplies the O_CREAT and O_RDWR constants. Combined as O_CREAT|O_RDWR, they tell tie to create the database file if it doesn't yet exist and to open it for both reading and writing.
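The flags are ordinary integers combined with a bitwise OR, and they mean the same thing to any call that accepts them. The sketch below shows this with the core sysopen function rather than tie; the scratch directory and the file name are made up for the demonstration.

```perl
use strict;
use warnings;
use Fcntl;                          # exports O_CREAT, O_RDWR, and friends
use File::Temp qw( tempdir );

my $dir  = tempdir( CLEANUP => 1 );   # hypothetical scratch directory
my $path = "$dir/flags_demo.txt";     # hypothetical file name

# Each constant is an integer; | combines them into a single value.
my $flags = O_CREAT | O_RDWR;
printf "O_CREAT=%d O_RDWR=%d combined=%d\n", O_CREAT, O_RDWR, $flags;

# O_RDWR alone fails because the file doesn't exist yet ...
my $opened_without_creat = sysopen( my $fh1, $path, O_RDWR ) ? 1 : 0;

# ... whereas O_CREAT|O_RDWR creates it and opens it for reading and writing.
sysopen( my $fh2, $path, $flags, 0644 )
    or die "Can't open $path: $!";
print {$fh2} "hello\n";
close $fh2;

print "Opened without O_CREAT? ", ( $opened_without_creat ? "yes" : "no" ), "\n";
print "File exists now? ",        ( -e $path ? "yes" : "no" ), "\n";
```

The numeric values printed on the first line vary by platform, which is why scripts use the named constants instead of hard-coded numbers.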

The DB_File and Storable parameters passed to the MLDBM module select how the data is stored on disk: DB_File is the underlying DBM implementation that holds the key/value pairs, and Storable handles the behind-the-scenes conversion of the references into strings.
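To see what that conversion looks like, here is a small sketch that uses Storable directly, roughly what MLDBM does on our behalf each time a value is written or read. The record contents are placeholders.

```perl
use strict;
use warnings;
use Storable qw( freeze thaw );

# A nested record like one value in the hash of hashes (placeholder data).
my $record = {
    url      => 'http://www.example.com/',
    headline => 'Example headline',
};

# A DBM file can hold only flat strings, so the reference must be flattened.
my $frozen = freeze($record);
print "Frozen value is a plain string? ", ( ref($frozen) ? "no" : "yes" ), "\n";

# Reading reverses the process and yields a fresh copy of the structure.
my $copy = thaw($frozen);
print "Headline after round trip: $copy->{headline}\n";
print "Same reference as before? ", ( $copy == $record ? "yes" : "no" ), "\n";
```

The last line is the important one: what comes back is a copy, not the original structure, which matters when it is time to update part of a stored record.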

If you have hundreds of text files like I did, undoubtedly you don't look forward to hand-coding that information into a hash as shown above. In fact, it's not hard to write a short script to convert your data into the needed hash (and then MLDBM) format. Here's how I did it:

#!/usr/local/bin/perl -T

use warnings;
use strict;

use Fcntl qw( :flock O_CREAT O_RDWR );
use MLDBM qw( DB_File Storable );

my %data_for_ad_on;
my $dbm_filename = '/www/cgi-bin/databases/data_for_ad_on.db';

tie %data_for_ad_on, 'MLDBM', $dbm_filename, O_CREAT|O_RDWR, 0644
    or die "Can't open $dbm_filename: $!";

my $data_dir = '/www/cgi-bin/databases/ad';

opendir my $dh, $data_dir
    or die "Can't open $data_dir, stopped: $!";
my @files = grep { /\d\d\d\d_\d\d_\d\d\.txt$/ } readdir $dh;
closedir $dh;

# Because Perl's tie mechanism doesn't let us modify parts of an MLDBM value
# directly, we have to get, change and set pieces of the stored structure
# through a temporary variable ($entry).

foreach my $file (@files) {

    if ($file =~ /^url_(\d\d\d\d_\d\d_\d\d)\.txt$/ ) {
        my $date  = $1;    # Copy now: a successful s/// below resets $1
        my $entry = $data_for_ad_on{$date};                # Get

        open (my $fh, '<', "$data_dir/$file")
            or die "Couldn't open $data_dir/$file: $!";
        flock $fh, LOCK_SH
            or die "Can't flock $data_dir/$file: $!";
        my $url = do { local $/; <$fh> };    # Slurp the whole file
        $url =~ s/^\s+//;   # So long, leading whitespace
        $url =~ s/\s+$//;   # So long, trailing whitespace
        close $fh;

        $entry->{url}          = $url;                     # Change
        $data_for_ad_on{$date} = $entry;                   # Set
        print "Just set target URL for $date\n";
    }
    elsif ($file =~ /^gif_(\d\d\d\d_\d\d_\d\d)\.txt$/ ) {
        my $date  = $1;    # Copy now: a successful s/// below resets $1
        my $entry = $data_for_ad_on{$date};                # Get

        open (my $fh, '<', "$data_dir/$file")
            or die "Couldn't open $data_dir/$file: $!";
        flock $fh, LOCK_SH
            or die "Can't flock $data_dir/$file: $!";
        my $gif = do { local $/; <$fh> };    # Slurp the whole file
        $gif =~ s/^\s+//;   # So long, leading whitespace
        $gif =~ s/\s+$//;   # So long, trailing whitespace
        close $fh;

        $entry->{gif_URL}      = $gif;                     # Change
        $data_for_ad_on{$date} = $entry;                   # Set
        print "Just set location URL of banner for $date\n";
    }
    elsif ($file =~ /^headline_(\d\d\d\d_\d\d_\d\d)\.txt$/ ) {
        my $date  = $1;    # Copy now: a successful s/// below resets $1
        my $entry = $data_for_ad_on{$date};                # Get

        open (my $fh, '<', "$data_dir/$file")
            or die "Couldn't open $data_dir/$file: $!";
        flock $fh, LOCK_SH
            or die "Can't flock $data_dir/$file: $!";
        my $headline = do { local $/; <$fh> };    # Slurp the whole file
        $headline =~ s/^\s+//;   # So long, leading whitespace
        $headline =~ s/\s+$//;   # So long, trailing whitespace
        close $fh;

        $entry->{headline}     = $headline;                # Change
        $data_for_ad_on{$date} = $entry;                   # Set
        print "Just set headline for $date\n";
    }
}

After the script runs, it will have converted all of the data into entries in the master hash that is tied to our database. Each "Set" assignment hands the updated entry to the database, and when the script exits, Perl unties the hash and closes the database file automatically.
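The Get, Change, and Set comments in the script deserve a closer look. The sketch below uses a toy tied-hash class, ToyFrozenHash, which is a made-up stand-in written only for this demonstration and not part of MLDBM. It freezes values when they are stored and thaws them when they are fetched, which is enough to show why assigning directly to a nested key gets lost while the three-step approach sticks.

```perl
use strict;
use warnings;

# Toy stand-in for MLDBM (an assumption for illustration, not the real module):
# values are frozen on STORE and thawed on FETCH, as with a DBM-backed hash.
package ToyFrozenHash;
use Storable qw( freeze thaw );

sub TIEHASH { my ($class) = @_; return bless {}, $class }
sub STORE   { my ( $self, $key, $value ) = @_; $self->{$key} = freeze($value) }
sub FETCH   { my ( $self, $key ) = @_;
              return defined $self->{$key} ? thaw( $self->{$key} ) : undef }

package main;

tie my %data_for_ad_on, 'ToyFrozenHash';
$data_for_ad_on{'2005_12_09'} = { url => 'http://www.example.com/' };

# Direct nested assignment changes only a temporary thawed copy.
$data_for_ad_on{'2005_12_09'}{headline} = 'Lost headline';
my $after_direct = exists $data_for_ad_on{'2005_12_09'}{headline} ? 1 : 0;

# Get, change, set: the whole record goes back through STORE.
my $entry = $data_for_ad_on{'2005_12_09'};     # Get
$entry->{headline} = 'Kept headline';          # Change
$data_for_ad_on{'2005_12_09'} = $entry;        # Set
my $after_set = $data_for_ad_on{'2005_12_09'}{headline};

print "Headline stored by direct assignment? ", ( $after_direct ? "yes" : "no" ), "\n";
print "Headline stored by get/change/set: $after_set\n";
```

Only an assignment to a top-level key calls STORE, so that is the only kind of assignment the database ever sees.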

First, the code opens the directory where I've stored all of my small text files. It then puts into the @files array all those filenames that contain a particular date-type sequence and end in ".txt". This picks up my url_2005_12_09.txt, gif_2005_12_09.txt, and headline_2005_12_09.txt files for December 9, 2005, for example, and the three similar files for each other date that has data files in the directory.
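Here is the filename filter on its own, run against a made-up directory listing instead of a real readdir call. The second pattern is an illustrative variation that pulls the kind of file and the date out of each name in a single match.

```perl
use strict;
use warnings;

# Hypothetical directory listing; only the first three names follow the
# date-plus-.txt naming convention.
my @listing = qw(
    url_2005_12_09.txt
    gif_2005_12_09.txt
    headline_2005_12_09.txt
    notes.txt
    url_2005_12_09.bak
    .
    ..
);

# Keep names containing YYYY_MM_DD immediately before a final ".txt".
my @files = grep { /\d\d\d\d_\d\d_\d\d\.txt$/ } @listing;
print "$_\n" for @files;

# Split a kept name into its kind (url, gif, headline) and its date.
for my $file (@files) {
    if ( $file =~ /^(url|gif|headline)_(\d\d\d\d_\d\d_\d\d)\.txt$/ ) {
        print "kind=$1 date=$2\n";
    }
}
```

Note that readdir also returns the "." and ".." entries; the filter drops them along with anything else that lacks the date pattern.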
