Strange error with Xerces

1 reply [Last post]
falopio
Offline
Joined: 2010-05-10
Points: 0

Hi all:

First of all, I'm sorry for my poor English. I'm trying to improve it, so please be patient.

I have a problem with Apache Xerces that appears to be a common one.

I've inherited an application that parses an XML document and performs some operations on the received data (it is a web service).

To read XML with the Xerces SAX parser you extend a class called DefaultHandler, which has a method called characters that is called when the parser finds text inside an element.

That method does nothing by default (you must override it to do whatever you need), so in my application it was implemented this way:

public void characters(char buf[], int offset, int len) throws SAXException {
    String s = new String(buf, offset, len);
    tempValue = s.trim();
}

In the production environment we have a problem:

When parsing a big XML document, at a certain point, when it has to read a date (01/01/2001), it reads 01 and then, in the next call, /01/2001.

Debugging, I've seen that buf has a length of 2048 and at that point offset is 2046, so the maximum available length is 2. In the next call offset is 0 and len is 8, and it reads the remaining characters of the date.

So data that should be read and processed once is read and processed twice (and badly, because the value is corrupted when it is split).

The method is called from:

org.apache.xerces.parsers.AbstractSAXParser.characters(AbstractSAXParser.java:483)
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(XMLDocumentFragmentScannerImpl.java:911)
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1477)
org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:348)
org.apache.xerces.parsers.DTDConfiguration.parse(DTDConfiguration.java:539)
org.apache.xerces.parsers.DTDConfiguration.parse(DTDConfiguration.java:595)
org.apache.xerces.parsers.XMLParser.parse(XMLParser.java:152)
org.apache.xerces.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1125)
javax.xml.parsers.SAXParser.parse(SAXParser.java:345)

I don't know how to solve this problem while changing the minimum amount of code :(

Any suggestions?

falopio
Offline
Joined: 2010-05-10
Points: 0

OMG!

It's not a bug, it's a feature!

Well, it appears that a call to characters only means that text inside an element is being read. It can be the full text of the element or just a part of it, for example when the parser's buffer is about to fill up.

It is the programmer's responsibility to check whether the element's text has been fully read or not.

So you must accumulate the text in a buffer across calls to the characters method and only use it once the element ends.
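
For anyone who finds this later, here is a minimal sketch of that buffering idea. The class name BufferingHandler and the values list are made up for illustration and are not from my real application. It also assumes that each element contains either text or child elements, not both mixed together.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the complete text of each element,
// no matter how many characters() calls the parser splits it into.
class BufferingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // A new element starts: discard whatever was collected before it.
        text.setLength(0);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // This may be only a piece of the element's text, so just append it.
        text.append(buf, offset, len);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the text known to be complete.
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
        text.setLength(0);
    }
}
```

In my case the smallest change would be to append in characters instead of assigning tempValue, and to do the trim() and the assignment in endElement.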

I don't understand why this is the programmer's responsibility.

There must be a good reason, but I cannot see it.

Message was edited by: falopio