Java Mailing List Archive

[jdom-interest] [xml-dev] Cannot close an XML file used for parsing

Jack Bush


Hi Everyone,


I have added the additional I/O statements to the finally clauses as follows, but the problem still persists:



// reading data (html) from the webpage and save it in html format.

try {



catch { …. }

finally {









// convert the html webpage format to xml format

try {



catch { …. }

finally {








Below is a short listing of the new XML file:


  <?xml version="1.0" encoding="iso-8859-1" ?>
- <html>
- <head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta name="keywords" content="California, cities, towns, villages, list, zipcodes, postal codes, united states, ca" />
  <meta name="description" content="Cities, towns and suburbs in California, United States (CA) starting with A" />
  <title>Cities and Towns in California starting with A – ABC Company</title>
  <link rel="stylesheet" href="" type="text/css" media="screen" />

- <body>
  <a name="top" />
- <div id="container">
- <div id="header">
  <div id="postmark" />
- <a href="" class="imglink">
  <img id="logoimg" src="" width="192" height="33" alt="Zipcodes America Logo" />

  <hr />

- <div id="nav">
- <ul>
- <li>
  <a href="" title="Home Page">Home</a>

- <li>

  (zipcode or suburb)
- <div class="hide">
  <form method="post" action="" />        // line 23

  <input type="text" name="q" class="searchbox" alt="Search query" />
  <br />
  <input type="submit" value="find!" class="searchbutton" alt="Perform search" />
  <div class="hide" />



What I find interesting is that the above XML file can be parsed by the same parseData() from another class without any problem. As a result, I have come to the following conclusions so far:

( i ) Some kind of file locking is preventing saxBuilder from parsing the XML file at that time.

( ii ) light_html2xml does not appear to have converted the original HTML to XML correctly, and somehow this is picked up by the parser in the same class, but not by the same parser in another class.

( iii ) I would like to use another conversion tool such as TagSoup in place of light_html2xml to determine where this issue is coming from. Would anyone be able to help me come up with a few lines of conversion statements using TagSoup, since I am not familiar with this tool?

( iv ) light_html2xml is good in that it strips out all namespaces, DTDs, entity resolvers, etc. and returns only what I need. JTidy converts correctly but includes the namespace, DTD and entity resolver, which makes parsing difficult.


Many thanks again,
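
One way to narrow down conclusions (i) and (ii) might be to check the file on disk directly with the JDK's own SAX parser, independent of JDOM and of either class. The sketch below is only an illustration: the class name WellFormedCheck and the method firstError are hypothetical and not part of the code in this thread. If the file at the path handed to parseData() turns out to be well-formed here, the conversion is probably fine and the path or the timing becomes the suspect. It may also be worth confirming which path is actually passed, since the readData() quoted further down returns fileInHtml.getAbsolutePath().

```java
import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not from the original thread.
public class WellFormedCheck {

    /**
     * Returns null if the file is well-formed XML, otherwise a short
     * description (line number and parser message) of the first fatal error.
     */
    static String firstError(File file) throws Exception {
        try {
            // DefaultHandler ignores all content and rethrows fatal errors.
            SAXParserFactory.newInstance().newSAXParser().parse(file, new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java WellFormedCheck <file>");
            return;
        }
        File file = new File(args[0]);
        String error = firstError(file);
        System.out.println(file.getAbsolutePath() + ": "
                + (error == null ? "well-formed" : error));
    }
}
```

Running it once against C:\Temp\ABC.xml and once against C:\Temp\ABC.html should show which of the two files produces the "form" error.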



From: Sheila M. Morrissey <>
To: Jack Bush <>
Sent: Wednesday, 29 October, 2008 12:52:06 AM
Subject: RE: [xml-dev] Cannot close an XML file used for parsing

Jack – did you try fosOutHtml.getFD().sync() after the flush?
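
For reference, a minimal sketch of where that call would sit, assuming a plain FileOutputStream as in the original readData(). The class name FlushSyncClose and the method writeAndSync are made up for the illustration. getFD().sync() asks the operating system to push its buffers for that file to the device; it has to be called before close(), because the descriptor is no longer valid afterwards.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical helper, not from the original thread.
public class FlushSyncClose {

    /** Writes data to target, then flushes, syncs and closes the stream. */
    static void writeAndSync(File target, byte[] data) throws IOException {
        FileOutputStream out = new FileOutputStream(target);
        try {
            out.write(data);
            out.flush();         // no-op for a bare FileOutputStream, needed once it is wrapped in a buffer
            out.getFD().sync();  // force OS buffers for this file out to the device
        } finally {
            out.close();         // always release the file handle
        }
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("abc", ".html");
        writeAndSync(target, "<html></html>".getBytes("UTF-8"));
        System.out.println("wrote " + target.length() + " bytes to " + target);
    }
}
```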




From: Jack Bush []
Sent: Tuesday, October 28, 2008 8:41 AM
To: Robert Koberg
Subject: Re: [xml-dev] Cannot close an XML file used for parsing


Hi Robert,


Thanks for responding to this post.


I have added your suggestion but the issue still persists. Nevertheless, I do believe that this is caused by the new XML file not having been closed properly.


There is no problem with light-html2xml method which has worked in the past.


Any more suggestions to try out?






From: Robert Koberg <>
To: Jack Bush <>
Sent: Tuesday, 28 October, 2008 9:42:21 AM
Subject: Re: [xml-dev] Cannot close an XML file used for parsing

close the stream or reader in a finally block to avoid leaving it open 
if an error occurs.

}finally {
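
The body of the snippet above did not survive in the archive. As a rough sketch of the general pattern only (not a reconstruction of Robert's code), the download step could close both streams in a finally block as shown below. The names CopyWithFinally, copy and closeQuietly are hypothetical, and the style sticks to what JDK 1.6 supports.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not from the original thread.
public class CopyWithFinally {

    /** Copies everything from in to target; both streams are closed even if an error occurs. */
    static void copy(InputStream in, File target) throws IOException {
        OutputStream out = null;
        try {
            out = new BufferedOutputStream(new FileOutputStream(target));
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.flush();  // surface any write error before the quiet close below
        } finally {
            closeQuietly(out);
            closeQuietly(in);
        }
    }

    /** Closes c if it is non-null, ignoring any IOException from close(). */
    static void closeQuietly(Closeable c) {
        if (c != null) {
            try {
                c.close();
            } catch (IOException ignored) {
                // nothing useful can be done at this point
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File target = File.createTempFile("abc", ".html");
        copy(new ByteArrayInputStream("<html></html>".getBytes("UTF-8")), target);
        System.out.println("copied " + target.length() + " bytes");
    }
}
```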

On Oct 27, 2008, at 6:03 PM, Jack Bush wrote:

> Hi All,
> I appear to have difficulty closing (and possibly flushing first) an 
> XML file that is subsequently parsed without success. The 
> error generated is:
> org.jdom.input.JDOMParseException: Error on line 23: The element 
> type "form" must be terminated by the matching end-tag "</form>".
> Below are the code snippets of readData() to retrieve (HTML) data 
> from a website, save it to a file, then convert to XML format before 
> returning the new filename:
> public String readData() {
>    try {
>          URL url  = new URL("");
>          URLConnection connection = url.openConnection();
>          InputStream isInHtml = url.openStream();  // throws an 
> IOException
>          disInHtml = new DataInputStream(new 
> BufferedInputStream(isInHtml));
>          System.out.flush();
>          FileOutputStream fosOutHtml = null;
>          fosOutHtml = new FileOutputStream("C:\\Temp\\ABC.html");
>          int oneChar, count=0;
>          while ((oneChar = disInHtml.read()) != -1)
>              fosOutHtml.write(oneChar);
>          isInHtml.close();
>          disInHtml.close();
>          fosOutHtml.flush();    // optional
>          fosOutHtml.close();
>          .....
>    }
>    try {
>          File fileInHtml = new File("C:\\Temp\\ABC.html");
>          FileReader frInHtml = new FileReader(fileInHtml);
>          BufferedReader brInHtml = new BufferedReader(frInHtml);
>          String string = "";
>          while (brInHtml.ready())
>              string += brInHtml.readLine() + "\n";
>          fwOutXml  = new FileWriter("C:\\Temp\\ABC.xml");
>          pwOutXml  = new PrintWriter(fwOutXml);
>          light_html2xml html2xml = new light_html2xml();
>          pwOutXml.print(html2xml.Html2Xml(string));
>          System.out.flush();    // optional
>          fwOutXml.flush();      // optional
>          fwOutXml.close();
>          pwOutXml.flush();      // optional
>          pwOutXml.close();
>          return fileInHtml.getAbsolutePath();
>          ....
>    }
> }
> // parseData reads the XML file using the name returned by readData()
> public void parseData(String XMLFilename)
> {
>    try
>    {
>        FileReader frInXml = new FileReader(FileName);
>        BufferedReader brInXml = new BufferedReader(frInXml);
>        SAXBuilder saxBuilder = new 
> SAXBuilder("org.apache.xerces.parsers.SAXParser"); // 
> JDOMParseException generated.
>        ....
> }
> This code worked when it was all in a single method, but I 
> have since added some structure by splitting it into a number of methods.
> This issue has arisen in the past, when I was able to close the 
> XML file prior to reading it again. However, I don't have a 
> solution for it this time round.
> I am running JDK 1.6.0_10, Netbeans 6.1, JDOM 1.1 on Windows XP 
> platform.
> Any assistance would be appreciated.
> Many thanks,
> Jack
