So far, this series has concentrated on configuration options available via
the graphical interface of IIS Virtual Directory Management for SQL Server.
However, newer versions of SQLXML, in addition to the features we have already
covered, provide another type of functionality, accessible through scripting,
that allows bulk loading of XML data into SQL databases. Such functionality
will be the topic of this article. For this, you should install the most recent
release of SQLXML (version 3.0 currently at Service Pack 1 level, available for
download from the
Microsoft Web Site).
Mechanisms for modifying SQL databases using XML formatted data, such as the
previously described updategrams,
are not suitable for large amounts of data, because they require the entire XML
document to be loaded into memory prior to initiating the insert operation.
Bulk loading of XML data (equivalent to bulk loading using standard SQL server
methods as Bulk Copy Program or BULK INSERT) provides a more efficient method
of dealing with such situations.
XML-based bulk load operation is implemented as the XML Bulk Load COM
component. You can take advantage of this component through use of programming
(including .NET applications) or scripting (e.g. via VBScript, which will be
the method presented here). The XML Bulk Load component uses data in XML format
stored in an XML document (or an XML fragment) and an annotated mapped schema
in XDR (XML Data Reduced) or XSD (XML Schema Definition) format (which provides
XML data description and verification).
As you might recall from our earlier discussions, XDR schema has been
supported (as the de-facto standard) since the release of SQL Server 2000 (and
accompanying SQLXML 1.0). Once the XML Schema Definition standard had been
ratified (and recommended) by the World Wide Web Consortium, Microsoft included
it in subsequent releases of SQLXML (2.0 and 3.0). If you intend to work with
XSD schemas, you can convert XDR schema examples presented in this article to
XSD format using CVTSCHEMA.EXE included with SQLXML 3.0 (you can find this file
in Program Files\SQLXML 3.0\bin folder).
To better understand the bulk load mechanism, let's take a look at a fairly
simple sample of code that inserts XML-formatted data into the Shippers table
of the Northwind database.
Set oXMLBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
oXMLBulkLoad.ConnectionString = "PROVIDER=SQLOLEDB;SERVER=YourSQLServer;" & _
"DATABASE=Northwind;INTEGRATED SECURITY=sspi;"
oXMLBulkLoad.Execute "C:\XMLData\Shippers.xdr", "C:\XMLData\Shippers.xml"
Set oXMLBulkLoad = Nothing
As you can see, first we instantiate an object (in our example, called oXMLBulkLoad)
of type SQLXMLBulkLoad - this class is represented in the HKEY_CLASSES_ROOT
portion of the registry by SQLXMLBulkLoad.SQLXMLBulkLoad entry (referred to as ProgID),
hence this is the parameter used by the CreateObject method. Next, we specify
the connection string property of the object we just created. The connection
string contains the provider name (SQLOLEDB), server name (make sure you
replace the value "YourSQLServer" with the name of your SQL server),
database name (Northwind in this case) and type of authentication used for the
connection (Windows integrated). The actual bulk insert takes place when the
Execute method of the oXMLBulkLoad object is invoked. The method takes two
parameters - full paths of the files containing a schema - in XDR or XSD format
- and corresponding XML document, respectively. If you save the script as BulkLoadShippers.vbs,
you could execute it by running
cscript BulkLoadShippers.vbs
at the Command Prompt or simply double-clicking the file
from Windows Explorer.
However, before you do this, you need to ensure that you have properly
formatted schema and XML-document files, available in the locations specified
in the script. Our XML document will have the following format:
<?xml version="1.0" encoding="utf-8" ?>
<Shippers>
<Shipper ShipperID="4" CompanyName="Speedy Gonzalez" Phone="(503) 555-9934" />
<Shipper ShipperID="5" CompanyName="The RoadRunner" Phone="(503) 555-3456" />
</Shippers>
which corresponds to the following XDR schema:
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
xmlns:dt="urn:schemas-microsoft-com:datatypes"
xmlns:sql="urn:schemas-microsoft-com:xml-sql">
<ElementType name="Shipper" sql:relation="Shippers">
<AttributeType name="ShipperID" required="yes" dt:type="int"/>
<AttributeType name="CompanyName" required="yes" dt:type="string"/>
<AttributeType name="Phone" required="no" dt:type="string"/>
<attribute type="ShipperID"/>
<attribute type="CompanyName"/>
<attribute type="Phone"/>
</ElementType>
</Schema>
Our XDR schema is rather straightforward and can be created without major
effort, but constructing more elaborate XDR schemas tends to be a mundane and
time consuming task (except for very simple data structures). Fortunately, you
can simplify this process by using the XML View Mapper utility, freely downloadable
from the
Microsoft Web site.
After the installation, the utility is available from All Programs ->
Microsoft SQL Server XML Tools menu. The first time you launch it, you will be
prompted to create a new project. Once you specify its name and location, you
will be presented with the XML View Mapper interface. The interface is divided
into two main windowpanes - Map Editor with three subpanes on the left and
Project Editor on the right. From here, you can generate XDR schemas based on
SQL database tables and add annotations to them. Start by clicking on the
leftmost subpane of Map Editor (labeled "Click to import an SQL
schema") and fill out entries in the Data Link Properties dialog box that
define a connection to the target server and database. Clicking on the OK
button in this initial dialog box, will bring up the next one, titled New
Database Tables, from which you can select tables and views for which you
intend to create mapping schema. This will bring back the Map Editor window,
this time with the tables you selected showing in the leftmost subpane. Next,
select the "Generate XDR module..." option from the Tools ->
Utilities menu. This will automatically generate an item under XDR node in the
Project Editor, corresponding to the tables displayed in the Map Editor
leftmost subpane. To generate a mapping schema for this XDR module, drag it to
the rightmost subpane of the Map Editor. The mapping will be displayed as a new
item under Map Modules node in the Project Editor. You can export XDR schema
using the "Export XDR Schema" option from the Tools menu.
The steps described above introduced you to the most basic features of XML
Map Viewer. Note, though, that the tool is much more powerful. For example, you
can generate XDR mapping schemas representing multi-table relationships (by
dragging and dropping respective columns between SQL Module and XDR Module subpanes
of the Map Editor) or run tests by submitting XPath queries against the
mappings. You can also generate XDR schema from an XML document or convert
schema from DTD (Document Type Definition - schema definition mechanism used in
early versions of XML) to XDR format.
Now that we are familiar with simpler ways of generating XDR schema, let's
go back to our original script and review the different options available when
bulk importing XML data into SQL Server databases. The SQLXMLBulkLoad object we
created has a number of properties that can be used to modify the default bulk
load behavior. Among the most relevant are:
CheckConstraints - when set to False, prevents check for
constraints during the data load. With large amounts of data, this might
significantly speed up the load process.
ErrorLogFile - when set to a valid file system path, it causes
creation of an XML-based error log that can be analyzed to determine reason for
bulk load failures.
ForceTableLock - when set to True, locks the entire table for the
duration of the bulk insert, which typically speeds up the load process.
KeepIdentity - when set to False, SQL Server will create identity
value during data import. Otherwise (also by default), values specified in the XML
document, will be used for this purpose.
KeepNulls - when set to True, forces SQL server to insert a NULL
value for unmapped columns (ones for which values are not included in XML
document), even if default, non-NULL is defined for them.
SchemaGen - when set to True, causes creation of tables (if they
do not already exist) during the import process. You can also drop existing
tables by setting SGDropTables property to True.
To take advantage of any of these properties, you need to set its value
(typically to either True or False) prior to invoking the Execute method). For
example, if you want to ignore check for constraints during bulk loading, you
would use the following code:
Set oXMLBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad")
oXMLBulkLoad.ConnectionString = "PROVIDER=SQLOLEDB;SERVER=YourSQLServer;" & _
"DATABASE=Northwind;INTEGRATED SECURITY=sspi;"
oXMLBulkLoad.CheckConstraints = False
oXMLBulkLoad.Execute "C:\XMLData\Shippers.xdr", "C:\XMLData\Shippers.xml"
Set oXMLBulkLoad = Nothing
In this article, I presented an overview of bulk load
capabilities of COM object included with the SQLXML version 3.0. In the next
article of this series, we will look at another method of modifying SQL Server databases,
called diffgrams, which has been introduced after the release of SQL Server
2000.
»
See All Articles by Columnist Marcin Policht