BDE 4.14.0 Production release
balxml_validatingreader

Detailed Description

Outline

Purpose

Provide a common reader protocol for parsing and validating XML.

Classes

See also
balxml_reader

Description

This component provides an abstract class, balxml::ValidatingReader: an XML reader that validates data against DTDs and/or XML Schemas (XSD). balxml::ValidatingReader inherits from the balxml::Reader interface and is therefore fully compliant with it. In addition, balxml::ValidatingReader provides methods to control validation. The enableValidation method specifies whether the reader should perform validation: setting validationFlag to false produces a non-validating reader, and setting it to true forces the reader to validate the input XML data against XSD schemas.

Schema Location and obtaining Schemas

In validating mode, the reader must be able to obtain external XSD schemas. balxml::ValidatingReader requires that all schema sources be represented as bsl::streambuf objects. According to the W3C standard, information about external XSD schemas can be defined in three places:

Example:

<purchaseReport
xmlns="http://www.example.com/Report"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.example.com/Report
http://www.example.com/Report.xsd"
period="P3M" periodEnding="1999-12-31">
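
The value of xsi:schemaLocation in the example above is a whitespace-separated list of namespace and location pairs. The following is a minimal sketch of splitting such a value, assuming only standard C++; splitSchemaLocation is a hypothetical helper for illustration and is not part of balxml:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of balxml): split the value of an
// 'xsi:schemaLocation' attribute into (namespace, location) pairs.
std::vector<std::pair<std::string, std::string> >
splitSchemaLocation(const std::string& value)
{
    std::vector<std::pair<std::string, std::string> > pairs;
    std::istringstream tokens(value);
    std::string ns;
    std::string location;

    // Tokens are separated by arbitrary whitespace, including newlines.
    while (tokens >> ns) {
        if (!(tokens >> location)) {
            break;  // odd number of tokens: drop the dangling namespace
        }
        pairs.push_back(std::make_pair(ns, location));
    }
    return pairs;
}
```

For the attribute value shown in the example, this yields a single pair whose first member is the namespace http://www.example.com/Report and whose second member is the location hint http://www.example.com/Report.xsd.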

In all of these cases, given the URI reference that identifies a schema and an optional namespace, the processor (parser) must obtain a bsl::streambuf object for the schema. For this purpose, the balxml::ValidatingReader interface defines a two-level schema resolution process:

  1. The reader (parser) must look up the schema in its internal cache. If the schema is found, it must be used.
  2. Otherwise, the reader must use the associated resolver to obtain the schema (see balxml::Reader::XmlResolverFunctor).

Both the schema cache and the resolver should be set up before the open method is called.
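
The two-level lookup described above can be sketched in standard C++ as follows. This is an illustration only: SchemaSource, setResolver, and find are hypothetical names, and the resolver type is simplified to a function taking a location and returning a raw stream buffer pointer, which may differ from the actual balxml::Reader::XmlResolverFunctor signature.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for a reader's schema lookup; not the balxml API.
class SchemaSource {
  public:
    // Simplified resolver: return a stream buffer for the specified
    // location, or 0 if the location is unknown.
    typedef std::function<std::streambuf *(const std::string&)> Resolver;

    void addSchema(const std::string& location, std::streambuf *schema)
    {
        assert(schema);
        d_cache[location] = schema;
    }

    void setResolver(const Resolver& resolver) { d_resolver = resolver; }

    // Level 1: consult the cache.  Level 2: fall back to the resolver.
    std::streambuf *find(const std::string& location) const
    {
        std::map<std::string, std::streambuf *>::const_iterator it =
                                                    d_cache.find(location);
        if (it != d_cache.end()) {
            return it->second;
        }
        return d_resolver ? d_resolver(location) : 0;
    }

  private:
    std::map<std::string, std::streambuf *> d_cache;     // level 1
    Resolver                                d_resolver;  // level 2
};
```

In this sketch a cached schema always takes precedence, so the resolver is consulted only for locations that were never added to the cache.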

Schema Cache

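balxml::ValidatingReader provides two abstract methods to maintain the schema cache: addSchema, which adds a schema (supplied as a bsl::streambuf) to the cache under a given location, and removeSchemas, which removes all schemas from the cache.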

Thread Safety

This component does not provide any functions that present a thread safety issue, since the balxml::ValidatingReader class is abstract and cannot be instantiated. There is no guarantee that any specific derived class will provide a thread-safe implementation.

Usage

This section illustrates intended use of this component.

Example 1: Basic Usage

In this example, we will create a validating parser that parses a document and validates it against a schema.

#include <a_xercesc_reader.h>
#include <iostream>
#include <sstream>

The following string describes an XSD schema for the documents we are going to parse:

const char TEST_XSD_STRING[] =
"<?xml version='1.0' encoding='UTF-8'?>"
"<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'"
" xmlns='http://bloomberg.com/schemas/directory'"
" targetNamespace='http://bloomberg.com/schemas/directory'"
" elementFormDefault='qualified'"
" attributeFormDefault='qualified' >"
" "
"<xsd:complexType name='entryType'>"
" <xsd:sequence>"
" <xsd:element name='name' type='xsd:string'/>"
" <xsd:element name='phone'>"
" <xsd:complexType>"
" <xsd:simpleContent>"
" <xsd:extension base='xsd:string'>"
" <xsd:attribute name='phonetype' type='xsd:string'/>"
" </xsd:extension>"
" </xsd:simpleContent>"
" </xsd:complexType>"
" </xsd:element>"
" <xsd:element name='address' type='xsd:string'/>"
" </xsd:sequence>"
"</xsd:complexType>"
" "
"<xsd:element name='directory-entry' type='entryType'/>"
"</xsd:schema>";

The following string describes valid XML that conforms to the schema. The top-level element contains one XML namespace attribute, with one embedded entry describing a user:

const char TEST_GOOD_XML_STRING[] =
"<?xml version='1.0' encoding='UTF-8'?>\n"
"<directory-entry xmlns:dir='http://bloomberg.com/schemas/directory'\n"
" xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'\n"
" xsi:schemaLocation='http://bloomberg.com/schemas/directory \n"
" aaa.xsd' >\n"
" <name>John Smith</name>\n"
" <phone dir:phonetype='cell'>212-318-2000</phone>\n"
" <address/>\n"
"</directory-entry>\n";

The following string describes invalid XML. More specifically, the XML document is well-formed, but does not conform to our schema because it omits the required address element:

const char TEST_BAD_XML_STRING[] =
"<?xml version='1.0' encoding='UTF-8'?>\n"
"<directory-entry xmlns:dir='http://bloomberg.com/schemas/directory'\n"
" xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'\n"
" xsi:schemaLocation='http://bloomberg.com/schemas/directory \n"
" aaa.xsd' >\n"
" <name>John Smith</name>\n"
" <phone dir:phonetype='cell'>212-318-2000</phone>\n"
"</directory-entry>\n";

Now we define a parse method for parsing an XML document and validating against an XSD schema:

int parse(balxml::ValidatingReader *reader,
const char *xmlData,
const char *xsdSchema)
{

In order to read the XML, we first need a balxml::NamespaceRegistry object, a balxml::PrefixStack object, and a reader object whose type is a concrete implementation of balxml::ValidatingReader (here supplied by the caller).

balxml::NamespaceRegistry namespaces;
balxml::PrefixStack prefixStack(&namespaces);
ASSERT(!reader->isOpen());

The reader uses a balxml::PrefixStack to manage namespace prefixes, so we need to set it before we call open.

reader->setPrefixStack(&prefixStack);
ASSERT(reader->prefixStack() == &prefixStack);

Set up validation

reader->removeSchemas();
reader->enableValidation(true);
ASSERT(reader->validationFlag());
bsl::istringstream schemaStream(xsdSchema);
reader->addSchema("aaa.xsd", schemaStream.rdbuf());

Now we call the open method to set up the reader for parsing, using the data contained in the XML string.

int rc = reader->open(xmlData, bsl::strlen(xmlData), 0, "UTF-8");
ASSERT(rc == 0);

Confirm that the reader has opened properly

ASSERT(reader->isOpen());

Do actual document reading

while(1) {
rc = reader->advanceToNextNode ();
if (rc != 0) {
break;
}

process current node here

}

Cleanup and close the reader.

reader->close();
ASSERT(!reader->isOpen());
reader->setPrefixStack(0);
ASSERT(reader->prefixStack() == 0);
return rc;
}

The main program parses the XML strings using a_xercesc::Reader, a concrete implementation of balxml::ValidatingReader

int usageExample()
{
a_xercesc::Reader reader;
int rc = parse(&reader, TEST_GOOD_XML_STRING, TEST_XSD_STRING);

Normal end of data

ASSERT(rc==1);
rc = parse(&reader, TEST_BAD_XML_STRING, TEST_XSD_STRING);

Parser error - document validation failed

ASSERT(rc==-1);
return 0;
}
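
The usage example relies on three kinds of return code from advanceToNextNode: 0 while nodes are still being read, 1 at the normal end of data, and -1 on a validation failure. The following is a minimal sketch that makes this convention explicit. ParseOutcome and classifyAdvanceResult are hypothetical names, and treating every value other than 0 or 1 as an error is an assumption, not documented behavior:

```cpp
#include <cassert>

// Hypothetical classification of the codes returned by the read loop.
enum ParseOutcome {
    e_MORE_NODES,   // rc == 0: a node was read; keep advancing
    e_END_OF_DATA,  // rc == 1: normal end of data
    e_ERROR         // anything else (assumption): parse/validation error
};

ParseOutcome classifyAdvanceResult(int rc)
{
    if (0 == rc) {
        return e_MORE_NODES;
    }
    if (1 == rc) {
        return e_END_OF_DATA;
    }
    return e_ERROR;
}
```

With such a helper, a caller of parse could compare classifyAdvanceResult(rc) against e_END_OF_DATA instead of testing for the literal value 1.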