Java Mailing List Archive

http://www.junlu.com/

[jdom-interest] skipping a huge text node

Tobias Thierer

2006-06-20

Hi,

I am trying to parse a very large XML document, 99% of which consists of one
huge text node:

<sequence>ACGGAAAT[...]</sequence>

which is too large to fit into memory. So instead of having the parser
return the whole String, I'd like to get just the length of the string and
its offset in the XML file, so that whenever I want to access part of the
sequence, I can seek to the correct position and read just the substring
that I am interested in.
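A minimal sketch of the offset-and-length idea described above, written without any XML parser. It is not a JDOM feature. The class and method names (SequenceLocator, Span, locate, read) are hypothetical, and the sketch assumes that the file uses a single-byte, ASCII-compatible encoding, that the open tag appears literally as <sequence> with no attributes, and that the text contains no entity references, comments or CDATA sections.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: finds where an element's text starts in the file and how long it is.
final class SequenceLocator {

    /** Byte offset of the first text byte after the open tag, and the text length in bytes. */
    static final class Span {
        final long offset;
        final long length;
        Span(long offset, long length) { this.offset = offset; this.length = length; }
    }

    /** Streams through the file once; only the read buffer is ever held in memory. */
    static Span locate(Path file, String tagName) throws IOException {
        byte[] open = ("<" + tagName + ">").getBytes(StandardCharsets.US_ASCII);
        byte[] close = ("</" + tagName + ">").getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            byte[] pattern = open;   // first look for the open tag, then for the close tag
            long pos = 0;            // number of bytes consumed so far
            long start = -1;         // offset of the text, once the open tag has been seen
            int matched = 0;         // how many bytes of the current pattern have matched
            int b;
            while ((b = in.read()) != -1) {
                pos++;
                if (b == pattern[matched]) {
                    matched++;
                    if (matched == pattern.length) {
                        if (start < 0) {
                            start = pos;
                            pattern = close;
                            matched = 0;
                        } else {
                            return new Span(start, pos - close.length - start);
                        }
                    }
                } else {
                    // '<' occurs only at index 0 of either pattern, so a failed
                    // match can only restart on a '<' byte.
                    matched = (b == pattern[0]) ? 1 : 0;
                }
            }
        }
        throw new IOException("element <" + tagName + "> not found or not closed");
    }

    /** Seeks into the file and reads count bytes of the text, starting at index from. */
    static String read(Path file, Span span, long from, int count) throws IOException {
        if (from < 0 || count < 0 || from + count > span.length) {
            throw new IndexOutOfBoundsException("range outside sequence");
        }
        byte[] buf = new byte[count];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(span.offset + from);
            raf.readFully(buf);
        }
        return new String(buf, StandardCharsets.US_ASCII);
    }
}
```

Calling locate(path, "sequence") once and keeping the returned Span would then let read(path, span, from, count) fetch any slice without loading the rest of the node.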

Is it somehow possible to tell jdom to consume the text node and report
its offset in the file and its length, rather than storing it in memory?

I've looked at jdom-contrib, which provides an ElementListener interface, but
its elementMatched() method is only called *after* the element (including
the close tag) has been fully read. Classes like SAXBuilder only seem to
handle events that come from the parser, whereas what I want to do is
change the events that the parser reports.
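For comparison, a minimal sketch of handling the parser events directly with plain SAX instead of going through SAXBuilder. The SAX contract allows a parser to deliver contiguous text in several characters() calls, so a handler can count the characters of the node without storing them. The class name TextLengthCounter and its count() helper are hypothetical. Note the limitation: this yields only the length in characters. The standard SAX Locator reports line and column numbers, not byte offsets, so the position in the file is not obtained this way.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts the text characters inside one named element
// and discards them instead of building a String.
final class TextLengthCounter extends DefaultHandler {
    private final String target;   // qualified name of the element to measure
    private boolean inside;        // true while the parser is within that element
    private long length;           // characters seen so far

    TextLengthCounter(String target) { this.target = target; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals(target)) inside = true;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals(target)) inside = false;
    }

    @Override
    public void characters(char[] ch, int start, int len) {
        // May be called many times for one text node; only the count is kept.
        if (inside) length += len;
    }

    long length() { return length; }

    /** Parses the source with the JDK's default SAX parser and returns the text length. */
    static long count(InputSource source, String elementName) throws Exception {
        TextLengthCounter handler = new TextLengthCounter(elementName);
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return handler.length();
    }
}
```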

Is there any way to do this with jdom(-contrib)? If not, do you know of
any other XML parser that would let me do it?

Cheers,

Tobias
