Posted by fabriziogiudici
on June 23, 2010 at 5:29 PM PDT
blueBill Mobile has reached more than 200 downloads, with more than 100 active installs. I'm getting some positive feedback and, more importantly, I haven't received any further problem reports, so I presume that the application is reasonably stable (even though communicating with users is still a problem). Time to move on to the next set of features...
Dealing with media crosses the border with blueMarine and blueOcean code, but that's not what I'm going to talk about now. The problem I faced was with managing the catalogs. xeno-canto only exposes HTML pages, so I needed to do some HTML scraping to extract the data. There's always some new magic involved when you deal with a new web site that is not XHTML compliant, but in the end everything is doable: with a simple FilterStream and JTidy the pages can be converted into XML and then processed with XPath.
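To make the last step concrete, here is a minimal JSE sketch of running an XPath expression over an already tidied page with plain JAXP. The JTidy conversion is left out, and the sample markup, the class name `XPathScrapingSketch` and the XPath expression are made up for illustration; they are not the real xeno-canto page structure.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathScrapingSketch {
    // Hypothetical, already well-formed page fragment (what a tidy step might produce).
    static final String SAMPLE =
        "<html><body><table>"
      + "<tr><td>Parus major</td><td><a href='a.mp3'>play</a></td></tr>"
      + "<tr><td>Parus major</td><td><a href='b.mp3'>play</a></td></tr>"
      + "</table></body></html>";

    /** Returns the href of every link found inside a table cell, in document order. */
    static List<String> extractMediaLinks(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//tr/td/a/@href", doc, XPathConstants.NODESET);
            List<String> links = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                links.add(nodes.item(i).getNodeValue());
            }
            return links;
        } catch (Exception e) {
            // Parsing and XPath errors are wrapped to keep the sketch short.
            throw new IllegalArgumentException("Cannot process the page", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractMediaLinks(SAMPLE)); // prints [a.mp3, b.mp3]
    }
}
```

This relies on `javax.xml.xpath`, which is exactly the package that is missing on the Android version I'm targeting.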
Here comes the problem. Google didn't provide XPath support until the latest release, 2.2, which is totally useless for me, as I'm targeting 1.5. I tried various combinations of JAXP, Jaxen and Dom4J, but every combination had some problem (either triggering a specific JAXP bug on Android when the tests were fine with JSE, or failing directly in JSE). Maybe it's because of my inexperience with Jaxen and Dom4J, but some XPath expressions that work with JAXP and other tools didn't work with Jaxen. I soon got tired of that, thinking that it's incredible that on a modern mobile platform in 2010 you can have problems when you try to handle XML at a high level.
In the end, I resorted to abandoning XML. With a few hours of TDD, I've prototyped a simple marshaller/unmarshaller for RDF+JSON, that is, a serialization of RDF triples over JSON. At least for my current needs it wasn't hard: probably less work than fixing the XML troubles, and certainly more fun. In this way I've brought forward the use of RDF, which was initially planned for a later iteration. Basically, I'm pre-processing the xeno-canto pages, converting them to RDF+JSON and storing a separate data file for each bird species. In the future, I'd like to have a real RDF store running on Android, even though I fear that it could be a hard task, given the troubles Android shows with XML. A further step will probably be to have blueBill Server provide the translated RDF+JSON on the fly (it could do that right now, but I don't know how scalable it is and how reliable my provider is, so I'll need lots of testing first).
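To give an idea of what serializing triples over JSON can look like, here is a small sketch of the marshalling half only. It assumes the subject → predicate → list of object values layout used by the Talis RDF/JSON convention; it is not the actual blueBill code, and the class `RdfJsonSketch`, the `Triple` type and the example URIs are hypothetical. The unmarshaller and features such as language tags, datatypes and blank nodes are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

class RdfJsonSketch {
    /** A triple whose object is either a URI or a plain literal. */
    static final class Triple {
        final String subject, predicate, object;
        final boolean literal;

        Triple(String subject, String predicate, String object, boolean literal) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.literal = literal;
        }
    }

    /** Groups triples by subject and predicate and writes them as a JSON object. */
    static String marshal(List<Triple> triples) {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, Map<String, List<Triple>>> bySubject = new LinkedHashMap<>();
        for (Triple t : triples) {
            bySubject.computeIfAbsent(t.subject, k -> new LinkedHashMap<>())
                     .computeIfAbsent(t.predicate, k -> new ArrayList<>())
                     .add(t);
        }
        StringJoiner subjects = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Map<String, List<Triple>>> s : bySubject.entrySet()) {
            StringJoiner predicates = new StringJoiner(",", "{", "}");
            for (Map.Entry<String, List<Triple>> p : s.getValue().entrySet()) {
                StringJoiner objects = new StringJoiner(",", "[", "]");
                for (Triple t : p.getValue()) {
                    objects.add("{\"value\":" + quote(t.object)
                        + ",\"type\":\"" + (t.literal ? "literal" : "uri") + "\"}");
                }
                predicates.add(quote(p.getKey()) + ":" + objects);
            }
            subjects.add(quote(s.getKey()) + ":" + predicates);
        }
        return subjects.toString();
    }

    /** Wraps a string in double quotes, escaping what JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        // Made-up URIs, for illustration only.
        System.out.println(marshal(Arrays.asList(
            new Triple("http://example.org/bird/parus-major",
                       "http://purl.org/dc/elements/1.1/title", "Parus major", true),
            new Triple("http://example.org/bird/parus-major",
                       "http://example.org/vocab/recording",
                       "http://example.org/media/a.mp3", false))));
    }
}
```

The whole thing needs nothing beyond strings and collections, which is the point: no XML stack is involved on the device.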
*** FOLLOW UP
I've been pointed to the fact that the parsing inconsistencies among different parsers were related to the handling of namespaces. Actually, I've found a document that describes the different scenarios. Once everything about namespaces had been properly configured, I achieved consistent parsing on JSE with different combinations of JAXP, Dom4J and Jaxen. Thus the Dom4J+Jaxen pair could be OK for use on Android, leaving out the buggy or incomplete JAXP. Since I'm going the RDF+JSON way, I haven't actually tested it on Android, though.
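For reference, this is a sketch of one common namespace configuration on the JAXP side, not necessarily the exact one I ended up with: the parser is made namespace aware and the XPath engine gets a `NamespaceContext` that binds a prefix to the XHTML namespace. The class name `NamespaceAwareXPathSketch`, the prefix `h` and the sample markup are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceAwareXPathSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hypothetical XHTML fragment: every element lives in the default XHTML namespace.
    static final String SAMPLE =
        "<html xmlns='" + XHTML_NS + "'><body><table>"
      + "<tr><td>Parus major</td></tr>"
      + "</table></body></html>";

    // Binds the arbitrary prefix "h" to the XHTML namespace for XPath expressions.
    static final NamespaceContext XHTML_CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "h".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String uri) {
            return XHTML_NS.equals(uri) ? "h" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String uri) {
            String prefix = getPrefix(uri);
            return (prefix == null
                ? Collections.<String>emptyList()
                : Collections.singletonList(prefix)).iterator();
        }
    };

    /** Counts the nodes matched by the expression in a namespace-aware parse of xml. */
    static int count(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP factories are not namespace aware by default
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(XHTML_CONTEXT);
            NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(count(SAMPLE, "//h:td")); // prefixed name matches the XHTML cell
        System.out.println(count(SAMPLE, "//td"));   // unprefixed name means "no namespace"
    }
}
```

In a namespace-aware setup an unprefixed name in XPath 1.0 only matches elements in no namespace, so `//td` finds nothing here while `//h:td` finds the cell. A mismatch of this kind between parser and XPath settings is the sort of thing that can make the same expression behave differently across libraries.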