XML Parsers

XML parsers expose standard APIs for accessing and manipulating parsed XML data. The two most popular parser APIs are DOM (Document Object Model) and SAX (Simple API for XML).

SAX and DOM offer complementary paradigms for accessing the data contained in XML documents. DOM allows random access to any part of a parsed XML document, but to use the DOM APIs the parsed document must be held in working memory. SAX, by contrast, provides no storage and presents the data as a linear stream of events. With SAX, if you want to refer back to anything seen earlier, you have to implement the storage mechanism yourself. For example, with DOM an application can import an XML document, modify it in arbitrary order, and write it back at any time. With SAX, you cannot edit arbitrarily, since there is no stored document to edit. You have to edit by filtering the stream as it flows and write the result out immediately.
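
The DOM side of this contrast can be sketched in Java with the JAXP classes that ship with the JDK. This is a minimal illustration, not a recommended implementation: the class name DomEditSketch, the sample document, and the particular edits are all assumptions made for the example. It loads a document into memory, touches the last element before the first one, and serializes the result afterwards.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomEditSketch {
    // Hypothetical sample document, used only for illustration.
    static final String SAMPLE = "<library>"
            + "<book id=\"b1\"><title>First</title></book>"
            + "<book id=\"b2\"><title>Second</title></book>"
            + "</library>";

    static String editAndSerialize(String xml) {
        try {
            // The whole document is parsed into an in-memory tree first.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList books = doc.getElementsByTagName("book");

            // Random access: edit the last book first...
            Element last = (Element) books.item(books.getLength() - 1);
            last.setAttribute("checked", "yes");

            // ...then go back and edit the first one.
            Element first = (Element) books.item(0);
            first.getElementsByTagName("title").item(0)
                    .setTextContent("First (revised)");

            // Write the modified tree back out with an identity transform.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("DOM round trip failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(editAndSerialize(SAMPLE));
    }
}
```

Because the tree stays in memory, the order of the two edits does not matter. A SAX-based version would have to apply each change at the moment the corresponding element streams past.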

DOM

DOM is described in detail on our DOM tutorial page.

Event-Oriented Paradigm: SAX

SAX (Simple API for XML) is a simple, event-based API for XML parsers. The benefit of an event-based API is that the parser does not need to create and maintain an internal representation of the parsed XML document. This makes it possible to parse XML documents that are much larger than the available system memory, which is particularly important for small devices such as PDAs and mobile phones. Because it requires no storage behind its API, SAX is complementary to DOM.
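
As a rough illustration of the event-based style, the following Java sketch counts elements with a given name using the SAX classes bundled with the JDK. The class name SaxCountSketch and the method countElements are hypothetical names chosen for this example. The only state the application keeps is a single counter, so memory use does not grow with the size of the document.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountSketch {
    // Counts elements whose qualified name equals the given name.
    static int countElements(String xml, String name) {
        final int[] count = {0};

        // The parser calls back into this handler as it reads the input;
        // no tree is built and nothing seen earlier is retained.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/><c><b/></c></a>", "b"));
    }
}
```

The example reads from a string for brevity. For a genuinely large document the same handler could be passed to a parse call that reads from a file or network stream.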

SAX parser Java example.

SAX reports events for the following structural information in an XML document:

  • The start and end of the document
  • The document type declaration (DOCTYPE)
  • The start and end of elements
  • Attributes of each element
  • Character data
  • Unparsed entity declarations
  • Notation declarations
  • Processing instructions
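
Most of the events listed above map onto callbacks of the standard org.xml.sax.helpers.DefaultHandler class. The sketch below records them in a list so the order of delivery can be inspected. The class name SaxEventLogSketch, the sample document, and the log format are assumptions made for this example. The start and end of the document type declaration are not logged here, because SAX2 reports them through the separate LexicalHandler extension interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventLogSketch {
    // Hypothetical sample with a notation, an unparsed entity,
    // an attribute, a processing instruction, and character data.
    static final String SAMPLE = "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE note [\n"
            + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
            + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
            + "]>\n"
            + "<note id=\"n1\"><?render fast?>Hello</note>";

    static List<String> logEvents(String xml) {
        List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startDocument() { log.add("startDocument"); }
            @Override public void endDocument() { log.add("endDocument"); }

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Attributes arrive together with the start-of-element event.
                StringBuilder sb = new StringBuilder("startElement " + qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    sb.append(' ').append(atts.getQName(i))
                      .append('=').append(atts.getValue(i));
                }
                log.add(sb.toString());
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("endElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver text in several chunks; skip blank ones.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("characters " + text);
                }
            }

            @Override
            public void processingInstruction(String target, String data) {
                log.add("processingInstruction " + target + " " + data);
            }

            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                log.add("notationDecl " + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                log.add("unparsedEntityDecl " + name + " notation=" + notationName);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return log;
    }

    public static void main(String[] args) {
        logEvents(SAMPLE).forEach(System.out::println);
    }
}
```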
