Looking for Closure

Unix Power Tools

by Tim O'Reilly
02/24/2000

A common problem in text processing is making sure that items that need to occur in pairs actually do so.

Most UNIX text editors include support for making sure that elements of C syntax such as parentheses and braces are closed properly. There's much less support for making sure that textual documents, such as troff source files, have the proper structure. For example, tables must start with a .TS macro, and end with .TE. HTML documents that start a list with <UL> need a closing </UL>.

UNIX provides a number of tools that can help you tackle this problem. Here's an awk script (run with gawk), written by Dale Dougherty, that makes sure .TS and .TE macros come in pairs:

#! /usr/local/bin/gawk -f
BEGIN {
    inTable = 0     # nonzero while inside a .TS/.TE pair
    TSlineno = 0    # line number of the most recent .TS
    TElineno = 0    # line number of the most recent .TE
    prevFile = ""
}
# on the first line of each new file, check for a table left unclosed
# in the previous file
FILENAME != prevFile {
    if (inTable)
        printf("found .TS at File %s: %d without .TE before end of file\n",
            prevFile, TSlineno)
    inTable = 0
    prevFile = FILENAME
}
# match .TS and see if we are already in a table
/^\.TS/ {
    if (inTable) {
        printf("%s: nested starts, File %s: line %d and %d\n",
            $0, FILENAME, TSlineno, FNR)
    }
    inTable = 1
    TSlineno = FNR
}
# match .TE and see if there was a .TS to close
/^\.TE/ {
    if (! inTable)
        printf("%s: too many ends, File %s: line %d and %d\n",
            $0, FILENAME, TElineno, FNR)
    else
        inTable = 0
    TElineno = FNR
}
# this catches end of input
END {
    if (inTable)
        printf("found .TS at File %s: %d without .TE before end of file\n",
            FILENAME, TSlineno)
}

You can adapt this type of script to check any construct that has a start and a finish.
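
As a rough sketch of that kind of adaptation, the shell function below wraps the same in-or-out bookkeeping in awk and takes the start and end patterns as arguments. The name check_pairs, the START_RE and END_RE environment variables, and the wording of the messages are all made up for illustration; none of them come from Dale's script. It assumes each delimiter appears on a line of its own, as troff macros do, so it is a line-oriented check and not a real HTML parser.

```shell
#!/bin/sh
# check_pairs START_REGEX END_REGEX file...
# Hypothetical helper (not from the original script): reports start/end
# delimiters that do not pair up. Patterns are awk regular expressions.
check_pairs() {
    start_re=$1
    end_re=$2
    shift 2
    # Pass the patterns through the environment so that backslashes
    # (as in '^\.TS') reach awk untouched.
    START_RE="$start_re" END_RE="$end_re" awk '
        # First line of each file: flag a start left open in the previous file.
        FNR == 1 {
            if (inside)
                printf "%s:%d: start without end before end of file\n", prev, openline
            inside = 0
            prev = FILENAME
        }
        # A start while already inside means two starts in a row.
        $0 ~ ENVIRON["START_RE"] {
            if (inside)
                printf "%s:%d: nested start (previous start at line %d)\n", FILENAME, FNR, openline
            inside = 1
            openline = FNR
        }
        # An end while not inside has nothing to close.
        $0 ~ ENVIRON["END_RE"] {
            if (!inside)
                printf "%s:%d: end without start\n", FILENAME, FNR
            inside = 0
        }
        # End of input: flag a start left open in the last file.
        END {
            if (inside)
                printf "%s:%d: start without end before end of file\n", prev, openline
        }
    ' "$@"
}
```

With this in place, check_pairs '^\.TS' '^\.TE' ch01 ch02 would look for unpaired troff tables, and check_pairs '<UL>' '</UL>' index.html would do the same for unordered lists (the file names are placeholders). Because the start pattern is tested before the end pattern, a line that opens and closes a construct is treated as balanced, but a line that closes one construct and then opens another is not handled correctly.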

A more complete syntax-checking program could be written with the help of a lexical analyzer generator like lex. lex is normally used by experienced C programmers, but it can be used profitably by someone who has mastered awk and is just beginning with C, since it combines an awk-like pattern-matching process, using regular expression syntax, with actions written in the more powerful and flexible C language. (See O'Reilly & Associates' lex & yacc.)

And of course, this kind of problem could be very easily tackled in perl.

