Building a Simple Search Engine with PHP

by Daniel Solin
10/24/2002

A little while ago, I was working on an intranet site for a mid-sized company. As the site grew in both size and popularity, the client asked me to add a search feature. Since one of the rules of the intranet was that all logic code had to be written in-house, using an existing open source engine was not an option.

Within a day, the engine was essentially complete, and the result turned out better than expected. With PHP, MySQL, and a few simple techniques, small projects like this are easy to build. This article presents a cut-down version of that search engine. I hope it encourages you to develop an engine that suits your particular needs, with exactly the features you want.

Database Design and Logic

We'll use MySQL as the database backend to store our search data. It's possible to shell out to Unix commands such as grep and find, but that would mean running the search engine on the machine hosting the files. It would also make it harder to index pages served from a database. We'll tackle the database first.

The database for the search engine consists of three tables: page, word, and occurrence. page holds all indexed web pages, and word holds all of the words found on the indexed pages. The rows in occurrence link words to the pages that contain them: each row represents one occurrence of one particular word on one particular page. The SQL for creating these tables is shown below.

-- One row per indexed web page.
CREATE TABLE page (
   page_id int(10) unsigned NOT NULL auto_increment,
   page_url varchar(200) NOT NULL default '',
   PRIMARY KEY (page_id)
) ENGINE=MyISAM;

-- One row per distinct word found on the indexed pages.
CREATE TABLE word (
   word_id int(10) unsigned NOT NULL auto_increment,
   word_word varchar(50) NOT NULL default '',
   PRIMARY KEY (word_id)
) ENGINE=MyISAM;

-- One row per occurrence of a word on a page.
-- ENGINE= replaces the older TYPE= keyword, which current MySQL versions reject.
CREATE TABLE occurrence (
   occurrence_id int(10) unsigned NOT NULL auto_increment,
   word_id int(10) unsigned NOT NULL default '0',
   page_id int(10) unsigned NOT NULL default '0',
   PRIMARY KEY (occurrence_id)
) ENGINE=MyISAM;

While page and word hold actual data, occurrence acts only as a reference table. By joining occurrence with page and word, we can determine which pages contain a word, as well as how many times the word occurs. Before that, though, we need some data.
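To make the join concrete before we move on, here is a minimal sketch of how such a lookup might look. It is not the code from the original engine. To keep it runnable without a MySQL server, it assumes PDO with an in-memory SQLite database as a stand-in, so the table definitions are rewritten in SQLite's dialect. The sample URLs, the sample words, and the function name searchWord are hypothetical. The SELECT itself is plain SQL and should carry over to MySQL unchanged.

```php
<?php
// Assumption: an in-memory SQLite database stands in for MySQL here.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// The same three tables as above, simplified for SQLite.
$db->exec('CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_url TEXT NOT NULL)');
$db->exec('CREATE TABLE word (word_id INTEGER PRIMARY KEY, word_word TEXT NOT NULL)');
$db->exec('CREATE TABLE occurrence (
    occurrence_id INTEGER PRIMARY KEY,
    word_id INTEGER NOT NULL,
    page_id INTEGER NOT NULL)');

// Hypothetical sample data: two pages and two words.
$db->exec("INSERT INTO page (page_id, page_url) VALUES
    (1, 'http://intranet.example/a.html'),
    (2, 'http://intranet.example/b.html')");
$db->exec("INSERT INTO word (word_id, word_word) VALUES (1, 'php'), (2, 'mysql')");

// One row per occurrence: 'php' appears three times on page 1 and once
// on page 2; 'mysql' appears once on page 1.
$db->exec('INSERT INTO occurrence (word_id, page_id) VALUES
    (1, 1), (1, 1), (1, 1), (1, 2), (2, 1)');

// Returns the pages containing $keyword, most occurrences first.
function searchWord(PDO $db, string $keyword): array
{
    $stmt = $db->prepare(
        'SELECT p.page_url AS url, COUNT(*) AS occurrences
           FROM word w
           JOIN occurrence o ON o.word_id = w.word_id
           JOIN page p ON p.page_id = o.page_id
          WHERE w.word_word = :keyword
          GROUP BY p.page_id, p.page_url
          ORDER BY occurrences DESC'
    );
    $stmt->execute([':keyword' => $keyword]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$results = searchWord($db, 'php');
foreach ($results as $row) {
    echo $row['url'], ': ', $row['occurrences'], "\n";
}
```

With this sample data, a search for php lists a.html first with a count of 3, followed by b.html with a count of 1. Because every occurrence is its own row, COUNT(*) grouped by page doubles as a simple relevance score. Binding the keyword as a parameter instead of concatenating it into the SQL string keeps user input out of the query text.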
