Adding SALT to HTML

by Simon Tang
May 14, 2003

Wireless applications are limited by their small device screens and cumbersome input methods. Consequently, many users are frustrated in their attempts to use these devices. Speech can help overcome these problems. It is the most natural way for humans to communicate. Speech technologies enable us to communicate with applications by using our voice. However, listening is slower than reading and callers have to remember all the information presented to them. Since our short-term memory is only capable of handling about 7 chunks of information, speech applications must be carefully designed.

Both wireless and speech applications have their benefits but also their limitations. Multimodal technologies attempt to leverage their respective strengths while mitigating their weaknesses. Using multimodal technologies, users can interact with applications in a variety of ways. They can provide input through speech, keyboard, keypad, touch-screen or mouse and receive output in the form of audio, video, text, or graphics.

The SALT Forum

The SALT Forum is a group of vendors that is creating multimodal specifications. It was formed in 2001 by Cisco, Comverse, Intel, Microsoft, Philips and SpeechWorks. They created the first version of the Speech Application Language Tags (SALT) specification as a standard for developing multimodal applications. In July 2002, the SALT specification was contributed to the W3C's Multimodal Interaction Activity (MMI). The W3C MMI has published a number of related drafts, which are available for public review.

Objectives of SALT

The main objective of SALT is to create a royalty-free, platform-independent standard for creating multimodal applications. A whitepaper published by the SALT Forum further defines six design principles of SALT.

  1. Clean integration of speech with web pages
    There is a lot of knowledge, skill, and investment in the existing web-based infrastructure. SALT relies on this investment by specifying a small set of XML elements to add speech capabilities to existing markup languages.
  2. Separation of the speech interface from business logic and data
    SALT does not alter the processing logic of the existing markup languages. It defines an independent set of elements that can be used cohesively with the existing technology.
  3. Power and flexibility of programming model
    DOM events and scripting are used to integrate SALT with existing pages. The scripting programming model provides the flexibility to add speech processing logic.
  4. Reuse existing standards for grammar, speech output, and semantic results
    Instead of reinventing the wheel, SALT reuses many existing standards.
  5. Support a range of devices
    One of the main objectives of SALT is the ability to extend many of the existing markup languages such as HTML, XHTML, cHTML, and WML. It is not restricted to any particular type of device.
  6. Minimal cost of authoring across modes and devices
    The first five principles above result in minimizing the cost of developing, deploying and executing SALT applications.

A number of vendors, including HeyAnita, Intervoice, MayWeHelp.com, Microsoft, Philips, SandCherry and Kirusa, SpeechWorks, and VoiceWeb Solutions, have announced products, tools, and platforms that support SALT. There is also an open source project, OpenSALT, in the works to develop a SALT 1.0 compliant browser. Detailed information can be found at the SALT Forum's implementation page.

Microsoft .NET Speech SDK

Before diving into experimenting with HTML and SALT, we need to set up the appropriate development environment. I am going to use Microsoft's .NET Speech SDK 1.0. The SDK Beta 2 was released on October 30, 2002. It consists of the following components (a detailed description can be found in the Microsoft .NET Speech SDK and Platform Overview whitepaper):

  • Developer tools (for Visual Studio .NET) - Grammar Editor, Prompt Editor, ASP.NET Speech Control Editor and the Speech Debugging Console.
  • ASP.NET Speech Controls (for Visual Studio .NET)
  • Sample SALT applications
  • Documentation and tutorial on building SALT applications
  • Client add-on for Internet Explorer and Pocket IE, which can be used to run speech-enabled web pages.

The SDK can be downloaded or ordered by mail from the Microsoft Speech Technology site. Make sure that you meet the following requirements before beginning the installation.

  • Windows 2000 [Server] SP3, or Windows XP Pro SP1
  • Internet Information Server (IIS)
  • Internet Explorer 6.0 or later
  • .NET Framework 1.0 SP2 (install .NET Framework 1.0 first, then the service pack)
  • Visual Studio .NET (optional; needed only if you use the developer tools)

Windows XP Home Edition is not supported because IIS is not available for it. .NET Framework 1.0 and its SP2 are separate installations and must be installed one after the other. Both can be downloaded from the Microsoft .NET Framework site. Make sure you do not install .NET Framework 1.1 Beta, as the .NET Speech SDK 1.0 will not work with it.

If you do not have Visual Studio .NET installed, or if you are not planning to use the developer tools, you will need to disable the Visual Studio .NET Speech Tools through the Custom Setup option.

Figure 1. .NET Speech SDK Installation

Once the installation is completed, you will find Microsoft .NET Speech SDK Beta 2 and Microsoft Internet Explorer Speech Add-in in your Programs menu.

The installation was not without problems. After the installation completed, I ran into an error with the text-to-speech (TTS) engine. It returned an error code of "-3" and gave the reason "Internal SAPI/Prompt Engine error". After plowing through the documentation, I came across a resolution in the SDK's readme file. All I had to do was change the default voice to one that comes from Microsoft. There are a number of other "Known Issues" listed in the documentation, which you should familiarize yourself with.

Adding Speech to HTML

I am going to show how we can SALT-enable a simple HTML application by hand. The best place to start is by looking at some simple HTML code.

I created a directory called salt in the default document root directory, c:\Inetpub\wwwroot\salt\ and placed the following HTML file there:

    <html>
    <head>
      <title>My First HTML Application</title>
    </head>
    <body>
      <h3>This is my first HTML application!</h3>
    </body>
    </html>

Unsurprisingly, this yields the following page:

Figure 2. Simple HTML page

Now, let's add a SALT element to it. We want it to speak the sentence back to us through text-to-speech (TTS). We will use <prompt>, one of the top-level elements of SALT.

    1. <html xmlns:salt="http://www.saltforum.org/2002/SALT">
    2. <head>
    3.   <title>My First Multimodal Application</title>
    4. </head>
    5. <body onload="RunIt()">
    6.   <h3>This is my first Multimodal application!</h3>
    7.   <salt:prompt id="first">
    8.     This is my first Multimodal application!
    9.   </salt:prompt>
   10. </body>
   11. <script language="javascript">
   12.   function RunIt() {
   13.     first.Start();
   14.   }
   15. </script>
   16. </html>

In line 1, we added the SALT namespace. Lines 7-9 contain the <prompt> element. It can be used for speech synthesis or to play back a recorded audio file. The attribute id="first" gives us a reference to the <prompt> element, which we use in the JavaScript.

SALT relies on a scripting language to tie together events and logic between its elements and HTML elements. In our case, the function RunIt() is invoked when the page is loaded. All it does is start the prompt, which plays the sentence "This is my first Multimodal application!" through the text-to-speech engine. So far, so good. When I tried to run the page, however, I did not hear anything. Instead I got the following:

Figure 3. Unexpected result from HTML + SALT page

Clicking on IE's warning icon was no help. It turns out that the speech add-on for IE has to be enabled explicitly; otherwise, IE ignores all the SALT elements. All I needed to do was add two elements (numbered lines 2 and 3):

    1. <html xmlns:salt="http://www.saltforum.org/2002/SALT">
    2.   <object id="k-tags"
           CLASSID="clsid:DCF68E5B-84A1-4047-98A4-0A72276D19CC"
           VIEWASTEXT></object>
    3.   <?import namespace="salt"
           implementation="#k-tags"/>
    4. <head>
    5.   <title>My First Multimodal Application</title>
    6. </head>
    7. <body onload="RunIt()">
    8.   <h3>This is my first Multimodal application!</h3>
    9.   <salt:prompt id="first">
   10.     This is my first Multimodal application!
   11.   </salt:prompt>
   12. </body>
   13. <script language="javascript">
   14.   function RunIt() {
   15.     first.Start();
   16.   }
   17. </script>
   18. </html>

Now, running the application again, you should get the desired behavior. The text is displayed and spoken.
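
The silent failure I ran into earlier suggests guarding the call to Start(). The following is only a sketch under stated assumptions: startPromptSafely is a hypothetical helper of my own, not part of SALT or the SDK, and it assumes that calling Start() on an element the browser does not recognize as a SALT prompt raises a script error that try/catch can intercept. The window.status fallback is likewise just one possible way to tell the user that speech is unavailable.

```javascript
// Hypothetical helper (not part of SALT or the .NET Speech SDK).
// Tries to start a prompt and reports whether the call succeeded,
// instead of letting a script error interrupt the page.
function startPromptSafely(prompt) {
  try {
    prompt.Start();   // same call that RunIt() makes in the listing above
    return true;
  } catch (e) {
    // Assumption: a missing or disabled speech add-in surfaces here
    // as a script error because Start() is not available.
    return false;
  }
}

// Drop-in replacement for RunIt(): falls back to a status-bar message.
function RunIt() {
  if (!startPromptSafely(first)) {
    window.status = "Speech output is not available in this browser.";
  }
}
```

With this in place, the page still displays its text when the prompt cannot be started, and the user gets a hint about why nothing was spoken.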

If you prefer to use a recorded audio file instead of the mechanical TTS voice, you just need to replace lines 9-11 with:

    <salt:prompt id="first">
      <salt:content href="hello.wav"/>
    </salt:prompt>

The <content> element specifies the URL of the audio file.
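
To make the two forms of <prompt> concrete, here is a small sketch that produces either one as a string. The function names buildPrompt and escapeXml and the shape of the source argument are my own assumptions for illustration; nothing like them is defined by SALT or the SDK. The sketch only generates markup text, for example for use in a server-side page template, and makes no claim about inserting SALT elements into a page that is already loaded.

```javascript
// Hypothetical helpers for illustration only.

// Escape characters that are special in XML text and attribute values.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build <salt:prompt> markup.
// source.href -> recorded audio via <salt:content href="..."/>
// source.text -> inline text for the text-to-speech engine
function buildPrompt(id, source) {
  var body = source.href
    ? '<salt:content href="' + escapeXml(source.href) + '"/>'
    : escapeXml(source.text);
  return '<salt:prompt id="' + escapeXml(id) + '">' + body + '</salt:prompt>';
}
```

Calling buildPrompt("first", { href: "hello.wav" }) yields the recorded-audio form shown above on a single line, while buildPrompt("first", { text: "..." }) yields the TTS form.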

Summary

In this article I introduced multimodal XML technology and specifically SALT. Using Microsoft's .NET Speech SDK, you should now be able to add SALT elements to HTML web pages. Good luck with your further investigations with SALT.
