Bhavik Patel

Reputation: 1074

Java VTD-Parser Logic

I implemented a VTD parser in Java. It parsed an XML file of around 500 MB without trouble, and I was able to write the data to an Excel file. I understand that a DOM parser first builds a tree of nodes and then reads the data from it, and that SAX is an event-based parser. But what makes VTD parse the file so easily and efficiently? I searched and found many implementation examples, but never an explanation of the logic. I tried the link below to get the idea, but did not get a clear picture: VTD_Parser

Could anyone briefly explain the idea?

Upvotes: 1

Views: 164

Answers (1)

Sharon Ben Asher

Reputation: 14383

According to the Wikipedia page on the subject, VTD-XML (Virtual Token Descriptor for XML) uses non-extractive parsing. That means it does not extract the data out of the document into a separate in-memory data structure; instead it builds a data structure of pointers (each an offset and a length) into the original document. This approach is clearly very memory efficient, but I believe it comes at a cost in performance, since the inevitable IO operation is done when the data is requested (though caching can help a lot here).
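To make the offset-and-length idea concrete, here is a minimal sketch in Java. It is not the VTD-XML API and not how the library is actually implemented: the class `OffsetTokenSketch`, the `Token` type, and the methods `indexText` and `text` are all hypothetical names made up for illustration, and the "parser" only understands plain tags and text (no attributes, comments, CDATA or entities). It just shows that indexing records positions, and that a string is only built when a value is actually requested.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of non-extractive parsing; NOT the VTD-XML API. */
final class OffsetTokenSketch {

    /** A token holds no text, only a position in the original document bytes. */
    static final class Token {
        final int offset;
        final int length;

        Token(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Scans the document once and records where each text run sits, without copying it. */
    static List<Token> indexText(byte[] doc) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < doc.length) {
            if (doc[i] == '<') {
                // Skip over the tag, including its closing '>'.
                while (i < doc.length && doc[i] != '>') {
                    i++;
                }
                i++;
            } else {
                // Text run: remember only its start and length.
                int start = i;
                while (i < doc.length && doc[i] != '<') {
                    i++;
                }
                tokens.add(new Token(start, i - start));
            }
        }
        return tokens;
    }

    /** Materializes a token as a String only when the caller asks for it. */
    static String text(byte[] doc, Token t) {
        return new String(doc, t.offset, t.length, StandardCharsets.UTF_8);
    }
}
```

For the input `<a><b>hi</b><c>there</c></a>`, `indexText` stores two small tokens, and `text(doc, tokens.get(1))` builds the string for the second one on demand. The index stays small compared to a DOM tree because each entry is a pair of integers rather than a node object with copied strings.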

It seems to me that this kind of processing is most useful when the input is very big and the requested data is very small (a data-mining kind of scenario).

Upvotes: 1
