5. Conclusion

In this paper we introduced the concepts of the Virtual Token Descriptor (VTD) and the location cache, both of which are designed to enable a "non-extractive" XML processing model. We also described the processing model in detail and showed how to navigate the element hierarchy represented by the combination of VTD tokens and the location cache. To achieve most of DOM's functionality without incurring its resource overhead, the processing model makes extensive use of 64-bit integers, which avoids the per-object overhead associated with most object-based hierarchies. The benchmark results suggest that we have met most of our design goals. However, we would like to acknowledge the "work-in-progress" status of the work presented in this paper.

Several limitations of our processing model are also worth mentioning. First, because VTD uses 64-bit integers with fixed-size fields to encode offset values, documents that are very large (larger than 1 GB) or very deep may require reallocating bits among the fields, or even adding another 32 bits to each VTD record, to meet the actual processing requirement. Second, the current implementation does not resolve entities other than the five built-in ones (&amp;, &lt;, &gt;, &apos;, &quot;). Third, our reference implementation supports neither DTD nor Schema validation. Last, Java arrays are indexed by 32-bit signed integers, so a single byte array can hold at most about 2 GB, which is also the largest document the processing model can handle. As a workaround, we might need to use chunk-based byte buffers to overcome this limit.
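To make the first limitation concrete, the following minimal sketch packs a token's offset, length, and depth into one 64-bit integer using fixed-width fields. The class name `PackedRecordSketch` and the field widths (30, 20, and 8 bits) are assumptions chosen for illustration, with the 30-bit offset picked only because it matches the 1 GB figure mentioned above; this is not presented as the actual VTD record layout.

```java
// Illustrative only: a hypothetical fixed-width packing of a token record
// into a single long. Field widths are assumptions, not the real VTD layout.
public final class PackedRecordSketch {
    // Hypothetical field widths.
    static final int OFFSET_BITS = 30; // 2^30 bytes = 1 GB of addressable offsets
    static final int LENGTH_BITS = 20;
    static final int DEPTH_BITS = 8;

    static final long OFFSET_MASK = (1L << OFFSET_BITS) - 1;
    static final long LENGTH_MASK = (1L << LENGTH_BITS) - 1;
    static final long DEPTH_MASK = (1L << DEPTH_BITS) - 1;

    static final int LENGTH_SHIFT = OFFSET_BITS;
    static final int DEPTH_SHIFT = OFFSET_BITS + LENGTH_BITS;

    // Packs the three fields into one long, rejecting values that overflow
    // their field. This is where an oversized or very deep document fails.
    static long pack(long offset, long length, long depth) {
        if (offset < 0 || offset > OFFSET_MASK) {
            throw new IllegalArgumentException("offset does not fit in " + OFFSET_BITS + " bits");
        }
        if (length < 0 || length > LENGTH_MASK) {
            throw new IllegalArgumentException("length does not fit in " + LENGTH_BITS + " bits");
        }
        if (depth < 0 || depth > DEPTH_MASK) {
            throw new IllegalArgumentException("depth does not fit in " + DEPTH_BITS + " bits");
        }
        return offset | (length << LENGTH_SHIFT) | (depth << DEPTH_SHIFT);
    }

    static int offset(long record) {
        return (int) (record & OFFSET_MASK);
    }

    static int length(long record) {
        return (int) ((record >>> LENGTH_SHIFT) & LENGTH_MASK);
    }

    static int depth(long record) {
        return (int) ((record >>> DEPTH_SHIFT) & DEPTH_MASK);
    }

    public static void main(String[] args) {
        long rec = pack(1024, 7, 3);
        // prints "1024 7 3"
        System.out.println(offset(rec) + " " + length(rec) + " " + depth(rec));
    }
}
```

With these assumed widths, an offset of 2^30 or more is rejected by `pack`, which is the kind of case where one would have to reallocate bits among the fields or widen the record.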
