Jul 29, 2014

Java and XML tutorial with StAX parser - reading

Q. What is a StAX Parser and when will you use it?
A. StAX (Streaming API for XML) is a Java API designed for parsing XML streams, just like the SAX API, but
  • StAX is a "pull" API, where your code asks the parser for the next event, whereas SAX is a "push" API, where the parser calls back into your handler.
  • StAX can do both XML reading and writing, whereas SAX can only do XML reading.
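
To make the push versus pull distinction concrete, here is a minimal sketch that collects the element names of the same XML both ways. The class name PushVsPull and its two helper methods are made up for this illustration; the SAX and StAX calls themselves are the standard JDK ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.events.XMLEvent;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class, for illustration only.
class PushVsPull {

    // SAX (push): the parser drives and calls back into our handler.
    static List<String> namesWithSax(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parsing failed", e);
        }
        return names;
    }

    // StAX (pull): our loop drives and asks the parser for the next event.
    static List<String> namesWithStax(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";
        System.out.println(namesWithSax(xml));
        System.out.println(namesWithStax(xml));
    }
}
```

Both helpers return the same names; the difference is who owns the loop. With StAX the loop is yours, so you can stop early or skip ahead whenever you like.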

Q. Why use StAX when there is DOM?
A. DOM parsing creates in-memory objects representing the entire tree of an XML document. Once in memory, the DOM tree can be navigated and queried in any order, which gives developers maximum flexibility. However, this flexibility comes at the cost of a large memory footprint and significant processor requirements, because the entire document must be held in memory as objects for the duration of the processing. This may not be an issue when working with small documents, but memory and processor requirements can escalate quickly as documents grow.
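
For comparison, here is a small sketch of the DOM style: the whole document is parsed into a tree first and then read in arbitrary order. The class name DomLookup and the describe method are invented for this example; the XML is the Employee document used later in this post.

```java
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical class, for illustration only.
class DomLookup {

    static String describe(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The complete tree is built in memory before we can read anything.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // Random access: read <age> first, then go "back" to <name>.
            String age = root.getElementsByTagName("age").item(0).getTextContent();
            Element name = (Element) root.getElementsByTagName("name").item(0);

            return name.getTextContent() + " (" + name.getAttribute("type") + "), " + age;
        } catch (Exception e) {
            throw new IllegalStateException("DOM parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";
        System.out.println(describe(xml)); // Peter (first), 25
    }
}
```

Reading the age before the name is trivial here because everything is already in memory, which is exactly what becomes expensive for large documents.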

StAX (Streaming API for XML) uses a streaming model, where XML data is read and parsed serially at application runtime as a sequence of events such as StartElement, Characters, and EndElement, often in real time and without the contents being precisely known beforehand. Stream-based parsers can start generating output immediately, and elements can be discarded and garbage collected as soon as they have been used. This gives a smaller memory footprint, reduced processor requirements, and higher performance in certain scenarios. The trade-off for this memory and processing efficiency is that a streaming parser can only see one location in the document at a time.

Here is how the events are sequentially processed:
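
As a rough illustration of that sequence, the sketch below walks the Employee XML used later in this post and records one label per event. The class name EventTrace and the label format are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical class, for illustration only.
class EventTrace {

    // Returns one human-readable label per event, in the order the parser delivers them.
    static List<String> trace(String xml) {
        List<String> labels = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                labels.add(label(reader.nextEvent()));
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return labels;
    }

    private static String label(XMLEvent event) {
        switch (event.getEventType()) {
            case XMLStreamConstants.START_DOCUMENT:
                return "StartDocument";
            case XMLStreamConstants.START_ELEMENT:
                return "StartElement " + event.asStartElement().getName().getLocalPart();
            case XMLStreamConstants.CHARACTERS:
                return "Characters " + event.asCharacters().getData();
            case XMLStreamConstants.END_ELEMENT:
                return "EndElement " + event.asEndElement().getName().getLocalPart();
            case XMLStreamConstants.END_DOCUMENT:
                return "EndDocument";
            default:
                return "Other(" + event.getEventType() + ")";
        }
    }

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";
        for (String line : trace(xml)) {
            System.out.println(line);
        }
    }
}
```

For the Employee document, the element events arrive strictly in document order: the start of Employee, then name with its characters and end, then age with its characters and end, and finally the end of Employee.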


Q. What is the major advantage of StAX over SAX?
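A. The major advantage of StAX over SAX is that the pull model allows sub-parsing of the XML input. Because your code decides when to read the next event, it can hand the reader to another method that consumes just one part of the document. For each element you can extract the element name, then the attributes, and then the characters (i.e. the content).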
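
Here is a minimal sketch of what sub-parsing can look like. It assumes a made-up Employees wrapper element that holds several Employee elements; the class SubParsing and the helper readEmployee are invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical class, for illustration only.
class SubParsing {

    // Outer loop: only looks for <Employee> and delegates the rest.
    static List<String> readAll(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && "Employee".equals(event.asStartElement().getName().getLocalPart())) {
                    result.add(readEmployee(reader));
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return result;
    }

    // Sub-parser: consumes events up to and including the closing </Employee>.
    private static String readEmployee(XMLEventReader reader) throws XMLStreamException {
        String name = null;
        String age = null;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                String tag = event.asStartElement().getName().getLocalPart();
                if ("name".equals(tag)) {
                    name = reader.getElementText(); // also consumes </name>
                } else if ("age".equals(tag)) {
                    age = reader.getElementText(); // also consumes </age>
                }
            } else if (event.isEndElement()
                    && "Employee".equals(event.asEndElement().getName().getLocalPart())) {
                break;
            }
        }
        return name + ":" + age;
    }

    public static void main(String[] args) {
        String xml = "<Employees>"
                + "<Employee><name>Peter</name><age>25</age></Employee>"
                + "<Employee><name>Jane</name><age>31</age></Employee>"
                + "</Employees>";
        System.out.println(readAll(xml)); // [Peter:25, Jane:31]
    }
}
```

With SAX the same split is harder, because the parser keeps pushing events into a single handler and you have to track which part of the document you are in yourself.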


Here is a simple code example that reads an Employee from XML:


package com.xml;

import java.io.StringReader;

import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxProcessing {

 public static void main(String[] args) {
  XMLInputFactory factory = XMLInputFactory.newInstance();
  String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";

  XMLEventReader eventReader;

  try {
   eventReader = factory.createXMLEventReader(new StringReader(xml));
   Employee emp = null;

   while (eventReader.hasNext()) {
    // read node
    XMLEvent event = eventReader.nextEvent();

    if (event.isStartElement()) {
     StartElement startElement = event.asStartElement();
     // root node
     if (startElement.getName().getLocalPart().equalsIgnoreCase("Employee")) {
      System.out.println(startElement.getName().getLocalPart());
      System.out.println("Before adding a new node ----------------");
      emp = new Employee();
     }
     if (startElement.getName().getLocalPart().equalsIgnoreCase("name")) {
      // get attribute values; getAttributeByName already matches on the name,
      // so a null check is all that is needed
      Attribute attr = startElement.getAttributeByName(new QName("type"));
      if (attr != null) {
          emp.setType(attr.getValue());
      }
      // element data
      event = eventReader.nextEvent();
      emp.setName(event.asCharacters().getData());

     } else if (startElement.getName().getLocalPart().equalsIgnoreCase("age")) {
      // element data
      event = eventReader.nextEvent();
      // Integer.parseInt avoids the deprecated Integer(String) constructor
      emp.setAge(Integer.parseInt(event.asCharacters().getData().trim()));
     }

    }

    if (event.isEndElement()) {
     EndElement endElement = event.asEndElement();
     if (endElement.getName().getLocalPart().equalsIgnoreCase("Employee")) {
      System.out.println(emp);
     }
    }

   }

  } catch (XMLStreamException e) {
   // malformed XML or a failure while reading the stream ends up here
   e.printStackTrace();
  }

 }
}


package com.xml;

import java.math.BigDecimal;

public class Employee {
 private String name;
 private String type;
 private int age;
 private BigDecimal salary;
 public String getName() {
  return name;
 }
 public void setName(String name) {
  this.name = name;
 }
 public String getType() {
  return type;
 }
 public void setType(String type) {
  this.type = type;
 }
 public int getAge() {
  return age;
 }
 public void setAge(int age) {
  this.age = age;
 }
 public BigDecimal getSalary() {
  return salary;
 }
 public void setSalary(BigDecimal salary) {
  this.salary = salary;
 }
 
 @Override
 public String toString() {
  StringBuilder sb = new StringBuilder();
  sb.append("name=" + name);
  sb.append("\n");
  if(type != null) {
   sb.append("type=" + type + "\n");
  }
  sb.append("age=" + age);
  return sb.toString();
 }
}



Output

Employee
Before adding a new node ----------------
name=Peter
type=first
age=25



In the next post, I will cover writing the Employee object back to XML.

With StAX you have two approaches: cursor based (XMLStreamReader) and iterator based (XMLEventReader, which the example above uses).
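
For contrast with the iterator-based example above, here is a rough sketch of the cursor-based style reading the same Employee XML. The class name CursorRead and the returned summary string are invented for this illustration; it does not use the Employee class so that it stays self-contained.

```java
import java.io.StringReader;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class, for illustration only.
class CursorRead {

    static String read(String xml) {
        String name = null;
        String type = null;
        int age = 0;
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // The reader itself is the cursor: next() moves it and returns an int event type.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("name".equals(tag)) {
                        // Attributes must be read while the cursor is still on the start tag.
                        type = reader.getAttributeValue(null, "type");
                        name = reader.getElementText();
                    } else if ("age".equals(tag)) {
                        age = Integer.parseInt(reader.getElementText().trim());
                    }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX parsing failed", e);
        }
        return "name=" + name + ", type=" + type + ", age=" + age;
    }

    public static void main(String[] args) {
        String xml = "<Employee><name type=\"first\">Peter</name><age>25</age></Employee>";
        System.out.println(read(xml)); // name=Peter, type=first, age=25
    }
}
```

The cursor API does not create an event object per step, so it tends to be the lighter of the two, while the iterator API gives you event objects that you can keep or pass around.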

1 Comment:

Anonymous said...

Thanks for this stax related information

7:07 PM, August 05, 2014  
