Convert HTML to XML

Joined: 2008-05-17

Hi all,
I have a problem: I need to find some information in HTML text, and I need to be able to determine which nodes that text is in. I was thinking of doing something like this: first convert the HTML to XML, and then search the text in the XML (for example with XPath). But I have run into the following problem: the XML can contain entities (like or anything else :) ), and the parser doesn't recognize their definitions. How can I resolve this? Or should I not try to convert HTML to XML at all, and just parse the HTML with the available libraries? Thanks for any suggestions.
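To make the entity problem concrete, here is a minimal sketch in Java (9 or later) using only the JDK's XML and XPath APIs. It assumes the markup is already well-formed apart from HTML named entities, which is usually not true of real-world HTML. The class name `EntityDemo`, its helper methods, and the three-entry entity table are hypothetical and only for illustration; a real table would need the full HTML entity list. The idea is to rewrite named entities as numeric character references, which an XML parser accepts without any DTD, and then run the XPath query.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EntityDemo {
    // Hypothetical, deliberately tiny table: HTML entity name -> Unicode code point.
    private static final Map<String, Integer> HTML_ENTITIES =
            Map.of("nbsp", 160, "copy", 169, "eacute", 233);

    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Rewrites known HTML entities as numeric references (&nbsp; -> &#160;).
    // Names missing from the table (including XML's own &amp;, &lt;, ...) are left untouched.
    static String replaceHtmlEntities(String markup) {
        Matcher m = NAMED_ENTITY.matcher(markup);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Integer codePoint = HTML_ENTITIES.get(m.group(1));
            String replacement = (codePoint != null) ? "&#" + codePoint + ";" : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Parses a string as XML; throws if the input is not well-formed
    // or references an undeclared entity.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Returns the names of all elements that have a direct text child containing `needle`.
    static List<String> elementsContaining(Document doc, String needle) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Pass the search text as an XPath variable so quotes in it cannot break the expression.
        xpath.setXPathVariableResolver(name -> needle);
        NodeList nodes = (NodeList) xpath.evaluate(
                "//*[text()[contains(., $needle)]]", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><p>Price:&nbsp;10</p><div><span>other</span></div></body></html>";
        Document doc = parse(replaceHtmlEntities(page));
        System.out.println(elementsContaining(doc, "Price"));
    }
}
```

Passing `page` to `parse` without the replacement step fails, because XML predefines only `&amp;`, `&lt;`, `&gt;`, `&apos;` and `&quot;`. For HTML that is not well-formed, a dedicated HTML parser is likely the more robust route.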

Joined: 2008-07-04

I have written a Java function that converts HTML to well-formed XML, so that XPath queries can be used to extract information from HTML pages. It is an open-source project: