b46 XML: CDATA sections are parsed incorrectly

uncle_alice
Joined: 2003-06-16

When loading an XML file containing the fragment[pre] [/pre]
I now get the error shown below. I don't know precisely when this behavior changed, but I know it worked right under b40. It looks like the parser is treating the first instance of "]]" as the end of the CDATA section, then getting out of sync because the next character isn't ">". I tested this by changing the third bracket to something else, like so:[pre] [/pre]and I got the exact same error. I'll submit a bug report if necessary, but I'm not sure which category I should select: SDK, JAXP, or something else. Anyone know?

Here's the relevant portion of the stack trace:

org.xml.sax.SAXParseException: The element type "brackets" must be terminated by the matching end-tag "</brackets>".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:388)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1419)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1763)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2944)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:664)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:158)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:524)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:844)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:774)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:148)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1255)
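
For anyone who wants to try this locally, here is a minimal sketch of a reproduction. The XML string is a hypothetical stand-in, not the original fragment: it simply puts "]]" inside a CDATA section, followed by a character other than ">". The class name `CdataRepro` and the helper `parseText` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CdataRepro {
    // Parses the document with the default JAXP SAX parser and returns
    // all character data it reports, concatenated.
    static String parseText(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    // The parser may deliver text in several chunks.
                    text.append(ch, start, length);
                }
            });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical test document: "]]" appears before the real "]]>".
        String xml = "<brackets><![CDATA[a]]b]]></brackets>";
        System.out.println(parseText(xml));
    }
}
```

A parser that handles CDATA correctly should report the text a]]b. A build with the problem described above would presumably throw a SAXParseException instead.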

timbell
Joined: 2003-06-10

This is now Bug-ID 6313289

You can monitor this bug on the Java Bug Database at http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6313289.

It may take a day or two before your bug shows up in this external database.

If you are a member of the Sun Developer Network (SDN), consider logging in and clicking "Watch this bug" on the left side of the report page. Adding this report to your Bug Watch list means that you will receive an email notification when this bug is updated. The SDN membership also enables voting for this bug.

The Sun Developer Network (http://developers.sun.com) is a free service that Sun offers. To join, visit https://softwarereg.sun.com/registration/developer/en_US/new_user.

Regards,
Tim

bhaktimehta
Joined: 2004-03-18

Yes, please file a bug under JAXP. I will also forward your thread to the JAXP team so they can look into this issue.

Thanks,
Bhakti

uncle_alice
Joined: 2003-06-16

Thanks, Bhakti. I'm working on the report now.

-Alan

uncle_alice
Joined: 2003-06-16

Okay, the review ID is 509206.

My theory about the cause of the bug was almost exactly the opposite of the reality. It seems they used to scan for the next occurrence of "]]", then check separately for the following ">", with some hackery to deal with intervening ']' characters. Then they switched to looking for the full "]]>" delimiter, but there's a bug in the method that does that. The method works fine when looking for a two-character delimiter, but it's unreliable with anything longer.
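
To make the two strategies concrete, here is a rough sketch of each. This is not the Xerces code, and it does not reproduce the actual bug; the class and method names are invented for illustration. The first method looks for "]]" and then checks for ">" separately, skipping over runs of extra ']' characters. The second looks for the full "]]>" delimiter in one step, using `String.indexOf`.

```java
public class CdataEndScan {
    // Two-step approach: find "]]", then check whether ">" follows.
    // Extra ']' characters are skipped so that "]]]>" still ends at
    // the last two brackets before the '>'.
    static int findEndTwoStep(String s) {
        int from = 0;
        while (true) {
            int i = s.indexOf("]]", from);
            if (i < 0) {
                return -1; // no "]]" left, so no terminator
            }
            int j = i + 2;
            while (j < s.length() && s.charAt(j) == ']') {
                j++; // step over intervening ']' characters
            }
            if (j < s.length() && s.charAt(j) == '>') {
                return j - 2; // index where the "]]>" terminator starts
            }
            from = j; // false alarm, keep scanning after this run
        }
    }

    // One-step approach: search for the whole three-character delimiter.
    static int findEndFullDelimiter(String s) {
        return s.indexOf("]]>");
    }

    public static void main(String[] args) {
        String[] samples = { "a]]b]]>", "x]]]>", "no end]]" };
        for (String s : samples) {
            System.out.println(s + " -> " + findEndTwoStep(s)
                    + " / " + findEndFullDelimiter(s));
        }
    }
}
```

Both methods should agree on every input. A delimiter search that is only reliable for two-character delimiters would diverge from them on inputs like these.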

BTW, what version of Xerces is included now? When I try to find out by running [pre]>java com.sun.org.apache.xerces.internal.impl.Version[/pre]it says 2.6.2, but that can't be right, because it says the same thing under JDK 1.5.
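
A related check that uses only public API is to ask JAXP which parser class it actually hands out. This is only a sketch: `ParserInfo` and `describeParser` are made-up names, and the implementation version comes from the package metadata, so it may well be null. It identifies the implementation class, not necessarily the Xerces release it was derived from.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class ParserInfo {
    // Reports the concrete SAXParser class that JAXP resolves to, plus
    // whatever implementation version its package metadata declares.
    static String describeParser() throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        Class<?> impl = parser.getClass();
        Package pkg = impl.getPackage();
        // May be null if the package carries no version metadata.
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return impl.getName() + " (implementation version: " + version + ")";
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describeParser());
    }
}
```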

uncle_alice
Joined: 2003-06-16

I did some more research, and it looks like the Xerces version [i]is[/i] still 2.6.2, with some changes (including this one) added by the JAXP team. I wasted a fair amount of time looking through changelogs and source code at the Xerces website before I figured that out. JAXP is frustratingly opaque compared to the JDK as a whole; it doesn't even have a proper JSR associated with it. Are there any plans to (for example) create a JAXP project here at java.net?