Problems with kxml2 and html entities

No replies
arruda82
Joined: 2009-07-20

I'm developing an app that reads text from a remote XML file using kXML 2. The file
contains HTML entities (numeric character references) such as "& # 2 3 3 ;"
(without the spaces) for "é".

The problem is that these entities come out broken, even though I have
manually set the encoding to UTF-8 (the same encoding as the XML being read):

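import org.kxml2.io.KXmlParser;
import org.kxml2.kdom.Document;

// data is the InputStream of the remote XML file
KXmlParser parser = new KXmlParser();
parser.setInput(data, "utf-8");
Document dom = new Document();
dom.setEncoding("utf-8");
dom.parse(parser);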

The parsed text is truncated where the first entity is found, for example:

XML Text: Andr& # 2 3 3 ; (without the spaces) Arruda
Expected text parsed: André Arruda
Resulting text: Andr
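
To make the expected result concrete, here is a minimal sketch in plain Java (no kXML involved) of the decoding I expect for a decimal reference. The class name CharRefDemo and the helper decodeDecimalRefs are made up for illustration only; this is not how kXML does it internally, just the mapping from the reference to the character.

```java
public class CharRefDemo {
    // Hypothetical helper: replaces decimal references of the form &#NNN;
    // with the character that has that code point. Anything else is copied as-is.
    static String decodeDecimalRefs(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int end = s.indexOf(';', i);
            if (s.startsWith("&#", i) && end > i + 2) {
                try {
                    int codePoint = Integer.parseInt(s.substring(i + 2, end));
                    out.appendCodePoint(codePoint);
                    i = end + 1;
                    continue;
                } catch (IllegalArgumentException e) {
                    // Not a valid decimal reference (for example a hex form);
                    // fall through and copy the characters verbatim.
                }
            }
            out.append(s.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 233 is the decimal code point of "é" (U+00E9).
        System.out.println(decodeDecimalRefs("Andr&#233; Arruda"));
    }
}
```

In other words, the reference should turn into a single character in the middle of the text, and the text after it (" Arruda") should still be there.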

I've checked the output on System.out, in three different emulators, and on the device
itself; all of them show the same issue.

I can't change the XML contents (not my site) and the file has all the required
XML headers.

Any ideas?

Message was edited by: arruda82