XML database

There are two major classes of XML database. Some XML databases simply map XML to and from relational data stored in a SQL database, which effectively makes them XML middleware. Native XML databases, the most common class, store XML either as text or in an internal format designed for faster processing. Most native XML databases also support indexing XML, which improves query performance.
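The mapping approach can be illustrated with a minimal sketch, assuming a single hypothetical `book` table in an in-memory SQLite database. The element names, column names, and the `shred` and `publish` helpers are invented for illustration and are not taken from any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and column names are illustrative.
XML_IN = """<books>
  <book id="1"><title>XML Basics</title><year>2001</year></book>
  <book id="2"><title>Query Languages</title><year>2003</year></book>
</books>"""


def shred(conn, xml_text):
    """Map each <book> element to one row of a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS book "
        "(id INTEGER PRIMARY KEY, title TEXT, year INTEGER)"
    )
    for book in ET.fromstring(xml_text).findall("book"):
        conn.execute(
            "INSERT INTO book (id, title, year) VALUES (?, ?, ?)",
            (int(book.get("id")), book.findtext("title"), int(book.findtext("year"))),
        )


def publish(conn):
    """Rebuild an XML document from the rows, ordered by primary key."""
    root = ET.Element("books")
    rows = conn.execute("SELECT id, title, year FROM book ORDER BY id")
    for row_id, title, year in rows:
        book = ET.SubElement(root, "book", id=str(row_id))
        ET.SubElement(book, "title").text = title
        ET.SubElement(book, "year").text = str(year)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
shred(conn, XML_IN)
xml_out = publish(conn)
```

In this sketch the rebuilt document carries the same data but not the original whitespace, and sibling order survives only because the rows are sorted by key. That is one reason the native definition below treats the document itself, including document order, as the unit of storage.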

The formal definition, as previously given by the XML:DB consortium, states that a Native XML Database...

Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models are the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0.

Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.

Is not required to have any particular underlying physical storage model. For example, it can be built on a relational, hierarchical, or object-oriented database, or use a proprietary storage format such as indexed, compressed files.

Additionally, many XML databases provide a logical model of grouping documents, called 'Collections'. Many collections can be created and managed at one time. In some implementations, collections can also be laid out in a hierarchical fashion, much in the same way that an operating system's directory structure works.
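The directory-like layout of collections can be sketched as a small tree structure. The `Collection` class, its methods, and the path names below are hypothetical and do not correspond to any specific database's API.

```python
class Collection:
    """A named group of documents that may contain child collections."""

    def __init__(self, name):
        self.name = name
        self.children = {}   # child collection name -> Collection
        self.documents = {}  # document name -> XML text

    def child(self, path):
        """Return the collection at a slash-separated path, creating it if needed."""
        node = self
        for part in filter(None, path.split("/")):
            if part not in node.children:
                node.children[part] = Collection(part)
            node = node.children[part]
        return node

    def walk(self, prefix=""):
        """Yield the full path of every document at or below this collection."""
        base = f"{prefix}/{self.name}"
        for doc_name in sorted(self.documents):
            yield f"{base}/{doc_name}"
        for name in sorted(self.children):
            yield from self.children[name].walk(base)


root = Collection("db")
root.child("orders/current").documents["a.xml"] = "<order id='1'/>"
root.child("orders/archive").documents["b.xml"] = "<order id='2'/>"
root.child("customers").documents["c.xml"] = "<customer/>"
paths = list(root.walk())
```

Here `paths` lists every stored document by its full collection path, much as a recursive directory listing would.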

All XML databases now support at least one query syntax. At a minimum, nearly all of them support XPath for querying documents or collections of documents. XPath is a simple path language that identifies the nodes matching a particular set of criteria.
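As a rough illustration of XPath-style node selection, the sketch below uses Python's standard `xml.etree.ElementTree` module, which implements only a limited subset of XPath. The sample document is invented for the example; an actual XML database would evaluate such paths against stored documents or whole collections.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("""<library>
  <book lang="en"><title>XML Basics</title><year>2001</year></book>
  <book lang="de"><title>XML Grundlagen</title><year>2002</year></book>
  <book lang="en"><title>Query Languages</title><year>2003</year></book>
</library>""")

# Select the <title> of every <book> whose lang attribute is "en".
english_titles = [t.text for t in doc.findall("./book[@lang='en']/title")]

# Select every <book>, at any depth, whose <year> child has the text "2002".
from_2002 = [b.findtext("title") for b in doc.findall(".//book[year='2002']")]
```

Each path names a location in the tree, and the bracketed predicate narrows the match to nodes that satisfy a condition.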

In addition to XPath, many XML databases support XSLT as a method of transforming documents or query results as they are retrieved from the database. XSLT is a declarative language written using an XML grammar. Its purpose is to define template rules, selected by XPath patterns, that transform documents in part or in whole into other formats, including text, XML, and HTML, or PDF by way of XSL-FO.

Eventually, most XML databases will support XQuery for querying. XQuery includes XPath as its node selection method, but extends XPath to provide transformational scaffolding. Its syntax is sometimes referred to as FLWOR (pronounced 'flower', and written FLWR in early drafts) because a query may include the following clauses: 'For', 'Let', 'Where', 'Order by' and 'Return'.
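Python's standard library has no XQuery engine, so the sketch below is only an analogue: it shows an XQuery expression in a comment and then spells out the same for, let, where, and return steps as ordinary Python. The sample document and the `hit` element name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("""<library>
  <book><title>XML Basics</title><year>2001</year></book>
  <book><title>XML Grundlagen</title><year>2002</year></book>
  <book><title>Query Languages</title><year>2003</year></book>
</library>""")

# Roughly the XQuery expression:
#   for $b in /library/book
#   let $y := xs:integer($b/year)
#   where $y > 2001
#   return <hit>{ $b/title/text() }</hit>
hits = []
for b in doc.findall("book"):            # for: iterate over the selected nodes
    y = int(b.findtext("year"))          # let: bind a derived value
    if y > 2001:                         # where: filter the bindings
        hit = ET.Element("hit")          # return: construct a result element
        hit.text = b.findtext("title")
        hits.append(ET.tostring(hit, encoding="unicode"))
```

The node selection is still a path expression; the surrounding clauses are what let a query build new XML from the nodes it selects.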

Some XML databases support an API called the XML:DB API (or XAPI) as a form of implementation-independent access to the XML datastore. For XML databases, XAPI plays a role analogous to that of ODBC for relational databases.