Open XML standard for industrial and commercial catalogs

-- Draft --
Jean-Marc Vanel

2001-06-26

Introduction

The aim of this report is to define OX-SICC, an open standard for industrial catalogs, using state-of-the-art XML standards and other open standards, with these guidelines:

Review of existing DTD/XML schema for catalogs

Requirements for XML formats for industrial catalogs

Advantages of using W3C's XML Schema for catalogs

Outline of the solution

A category is simply an XML Schema for the corresponding items. A catalog is a simple container element (<catalog>) for the items. The hierarchy of categories is defined by an XML Schema <extension> of the parent category. Validation of items with respect to the pertinent category is just standard XML Schema validation. The unique identifier of a category is the URI of the corresponding schema. The key of an item within the catalog document is given by its <supplier_part_number> element, while the URI unique identifier of the item is composed in the standard way from the catalog document URI and the local key, e.g.:

http://www.IndustrySuppliers.com/catalogs/imperator/2001#xpointer(//item[supplier_part_number='122'])
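
As an illustration, the composition of an item URI from the catalog document URI and the local key can be sketched in Python. The function name item_uri is our own choice, not part of the standard, and the sketch assumes that keys never contain a single quote, so that no XPath escaping is needed.

```python
def item_uri(catalog_uri: str, part_number: str) -> str:
    """Build an item URI from the catalog document URI and the item's local key."""
    # Assumption: the key is embedded in a single-quoted XPath literal,
    # so a key containing a single quote is rejected rather than escaped.
    if "'" in part_number:
        raise ValueError("part number must not contain a single quote")
    return f"{catalog_uri}#xpointer(//item[supplier_part_number='{part_number}'])"


if __name__ == "__main__":
    print(item_uri("http://www.IndustrySuppliers.com/catalogs/imperator/2001", "122"))
```

Run as a script, this prints the example URI given above.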

The examples given in the Annexes are short extracts from the real Schemas. We provide hereafter hints for the following features:

URI naming schemes

Recall that these URIs are universally unique identifiers, not necessarily retrievable on the Web. They can be used to refer to the objects in many contexts.
This applies to companies and organizations, categories, and items.
For companies and organizations, the address of the main Web home page can be used, without a trailing slash, e.g.:
http://www.ibm.com

As we said, the unique identifier of a category is the URI of the corresponding schema. The naming scheme will be composed of:

Example:
http://www.IndustrySuppliers.com/catalog/2001/01/gasket

For items, a URI will be made using:

Example:
http://www.IndustrySuppliers.com/catalogs/imperator/2001#xpointer(//item[supplier_part_number='122'])

Issues

Compatibility with existing standards

TO COMPLETE <<<<<<<<<<<<<

Annex 1 - Schema for base category

Note that we use <xsd:unique> to require that the combination of the <supplier> and <supplier_part_number> sub-elements be unique among the <item> elements of a catalog.

The actual schema will contain more details, but this is a short (but valid) sample to show the essentials.

<xsd:schema targetNamespace="http://www.IndustrySuppliers.com/catalog/2001/01"
  xmlns:isc="http://www.IndustrySuppliers.com/catalog/2001/01"
  xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
  elementFormDefault="qualified"
  attributeFormDefault="qualified"
>
  <xsd:complexType name="category">
    <xsd:annotation>
      <xsd:documentation>Template for common and mandatory information about a catalog item.
      </xsd:documentation>
    </xsd:annotation>
    <xsd:all>
      <xsd:element name="supplier" type="xsd:string" />
      <xsd:element name="supplier_part_number" type="xsd:string"/>
      <xsd:element name="description" type="xsd:string"/>
      <xsd:element name="long_description" type="xsd:string" />
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="catalog">
    <xsd:sequence>
      <xsd:element name="item" type="isc:category" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="catalog" type="isc:catalog" >
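   <xsd:unique name="supplier_part_number">
    <!-- Child elements are namespace-qualified (elementFormDefault="qualified"),
         so each XPath step needs the isc: prefix to match them. -->
    <xsd:selector xpath="isc:item"/>
    <xsd:field xpath="isc:supplier"/>
    <xsd:field xpath="isc:supplier_part_number"/>
   </xsd:unique>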
  </xsd:element>

</xsd:schema>
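
To make the intent of the uniqueness constraint concrete, here is a minimal Python sketch that performs a comparable check outside of a schema validator. It is not a substitute for XML Schema validation. The function name duplicate_keys and the second supplier name "Other" are hypothetical, introduced only for this illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ISC = "http://www.IndustrySuppliers.com/catalog/2001/01"
NS = {"isc": ISC}


def duplicate_keys(catalog_xml: str) -> list:
    """Return the (supplier, supplier_part_number) pairs that occur more than once."""
    root = ET.fromstring(catalog_xml)
    counts = Counter()
    for item in root.findall("isc:item", NS):
        supplier = item.findtext("isc:supplier", namespaces=NS)
        part = item.findtext("isc:supplier_part_number", namespaces=NS)
        # Like xsd:unique, skip items for which one of the fields is absent.
        if supplier is None or part is None:
            continue
        counts[(supplier, part)] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical sample: the first two items share the same key pair.
SAMPLE = f"""<catalog xmlns="{ISC}">
  <item><supplier>Imperator</supplier><supplier_part_number>122</supplier_part_number></item>
  <item><supplier>Imperator</supplier><supplier_part_number>122</supplier_part_number></item>
  <item><supplier>Other</supplier><supplier_part_number>122</supplier_part_number></item>
</catalog>"""

if __name__ == "__main__":
    print(duplicate_keys(SAMPLE))
```

The third item is not reported, because the key is the pair of supplier and part number, not the part number alone.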

Annex 2 - Schema for derived category

<xsd:schema targetNamespace="http://www.IndustrySuppliers.com/catalog/2001/01/gasket"
   xmlns    ="http://www.IndustrySuppliers.com/catalog/2001/01/gasket"
   xmlns:xsd="http://www.w3.org/2000/10/XMLSchema"
   xmlns:isc="http://www.IndustrySuppliers.com/catalog/2001/01"
   elementFormDefault="qualified"
>
  <xsd:import namespace="http://www.IndustrySuppliers.com/catalog/2001/01"
                     schemaLocation="catalog-base.xsd" />

  <xsd:element name="gasket" type="gasket">
    <xsd:annotation>
      <xsd:documentation>
        IndustrySuppliers.com's definition of a gasket item
        belonging to an industrial catalog.</xsd:documentation>
    </xsd:annotation>
  </xsd:element>

  <xsd:complexType name="gasket">
    <xsd:complexContent>
      <xsd:extension base="isc:category">
        <xsd:all>
          <xsd:element name="diameter" type="xsd:float" minOccurs='1' />
        </xsd:all>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>

Annex 3 - Sample XML instance of industrial catalog

Note that we have here an instance of a base catalog, which can contain any type of item. We could have derived a special type of catalog (e.g. gasket:catalog) which would contain only gaskets.

<?xml version='1.0' encoding='ISO-8859-1'?>
<catalog
xmlns       ='http://www.IndustrySuppliers.com/catalog/2001/01'
xmlns:gasket='http://www.IndustrySuppliers.com/catalog/2001/01/gasket'
xsi:schemaLocation='http://www.IndustrySuppliers.com/catalog/2001/01/gasket
                    gasket.xsd'
xmlns:xsi='http://www.w3.org/2000/10/XMLSchema-instance'
>
  <item xsi:type='gasket:gasket'>
    <supplier>Imperator</supplier>
    <supplier_part_number>122</supplier_part_number>
    <long_description>Very good indeed!</long_description>
    <description>Very good!</description>
    <gasket:diameter>1.22</gasket:diameter>
  </item>
</catalog>
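
The following Python sketch shows how a consuming application might recover, for each item of such an instance, its local key and the URI of its category from the xsi:type attribute. It is only an illustration under simplifying assumptions: the function name item_categories is ours, prefix declarations are treated as global to the document (scoping is ignored), and an item without xsi:type is assumed to belong to the base category.

```python
import io
import xml.etree.ElementTree as ET

ISC = "http://www.IndustrySuppliers.com/catalog/2001/01"
XSI = "http://www.w3.org/2000/10/XMLSchema-instance"


def item_categories(catalog_xml: str) -> list:
    """Return (supplier_part_number, category URI) for every item of the catalog."""
    prefixes = {}
    root = None
    stream = io.BytesIO(catalog_xml.encode("utf-8"))
    # "start-ns" events expose the prefix declarations, which the tree itself drops.
    for event, payload in ET.iterparse(stream, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri
        elif root is None:
            root = payload  # first "start" event is the document element
    result = []
    for item in root.iter(f"{{{ISC}}}item"):
        key = item.findtext(f"{{{ISC}}}supplier_part_number")
        qname = item.get(f"{{{XSI}}}type")
        if qname is None:
            category = ISC  # assumption: untyped items belong to the base category
        else:
            prefix, _, _local = qname.rpartition(":")
            category = prefixes[prefix]
        result.append((key, category))
    return result


SAMPLE = """<catalog
 xmlns='http://www.IndustrySuppliers.com/catalog/2001/01'
 xmlns:gasket='http://www.IndustrySuppliers.com/catalog/2001/01/gasket'
 xmlns:xsi='http://www.w3.org/2000/10/XMLSchema-instance'>
  <item xsi:type='gasket:gasket'>
    <supplier>Imperator</supplier>
    <supplier_part_number>122</supplier_part_number>
    <gasket:diameter>1.22</gasket:diameter>
  </item>
</catalog>"""

if __name__ == "__main__":
    print(item_categories(SAMPLE))
```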

Annex 4 - Glossary and acronyms

Word / acronym | Definition | URL
attribute | see descriptor |
descriptor | also called attribute |
DOM | Document Object Model | http://www.w3.org/DOM/
Dublin Core | basic metadata specification; metadata in the sense of "data about data", that is, information such as Author, Title, Date. This is different from the other meaning, the specification of the structure of data, such as a database schema. | http://purl.org/DC/
Schematron | XSLT-based technique to check XPath-based rules and report anomalies; its strength is its ability to enforce rules involving comparison of different elements. | http://www.ascc.net/xml/resource/schematron/ (tutorial: http://www.zvon.org/xxl/SchematronTutorial/General/contents.html)
UML | Unified Modeling Language |
URI | Uniform Resource Identifier | http://www.w3.org/Addressing/
W3C | World Wide Web Consortium | http://www.w3.org
XPointer | |
XSL, XSLT | Extensible Stylesheet Language (Transformations) |
XML Schema | World Wide Web Consortium's schema language: manages the "basic" validation, that is, structure of elements and sub-elements, constraints on the content of elements, uniqueness and key references. |

Annex 5 - References

See also hyperlinks in Annex 4 - Glossary and acronyms.
Industrial catalogs management - software specification, J.M. Vanel

XML Schema specification:
http://www.w3.org/TR/xmlschema-0/
http://www.w3.org/TR/xmlschema-1/
http://www.w3.org/TR/xmlschema-2/
XML Schema Tutorial

Command-line tool for validating with XML Schema: XSV
Current Status of XSV: Coverage, Known Bugs

Schematron specification:
http://www.ascc.net/xml/resource/schematron/
Schematron tutorial:
http://www.zvon.org/index.php?nav_id=2
TO COMPLETE <<<<<<<<<<<<<

Annex 6 - Expressing validity rules inside the XML Schema

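Here is an example showing how one can restrict the supplier_part_number to follow the pattern: 3 digits, dash, 4 digits. This fragment can be put inside the <xsd:element name="supplier_part_number"> element, in place of its type="xsd:string" attribute, since an element declaration cannot carry both a type attribute and an inline type definition.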

        <xsd:simpleType>
         <xsd:restriction base="xsd:string">
          <xsd:length value="8"/>
          <xsd:pattern value="\d{3}-\d{4}"/>
         </xsd:restriction>
        </xsd:simpleType>
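
For readers who want to check sample keys without a schema validator, the same rule can be sketched in Python. This is only an approximation of the restriction above; the function name is_valid_part_number is ours. Note that an XML Schema pattern must match the whole value, hence the use of fullmatch.

```python
import re

# Same expression as the xsd:pattern facet: 3 digits, dash, 4 digits.
PART_NUMBER = re.compile(r"\d{3}-\d{4}")


def is_valid_part_number(value: str) -> bool:
    """Mimic the xsd:length (8) and xsd:pattern facets of the restriction."""
    # The length test is redundant with the pattern, as it is in the schema.
    return len(value) == 8 and PART_NUMBER.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("123-4567", "122", "123-45678"):
        print(candidate, is_valid_part_number(candidate))
```

With this rule in force, a key such as 122, used in the sample instance of Annex 3, would be rejected.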