Compression of XML?
I am not sure this is exactly what Fast Infoset (FI) was intended for, but I have been tasked with using it to compress XML so that large documents can be sent over a low-bandwidth connection. While doing so I have run into some issues and would like to find out what I am doing wrong. Any assistance is greatly appreciated.
First, I do NOT get the same XML document back when I create an FI document (using the SAXDocumentSerializer) and then recreate the XML document from the generated FI document. My original XML document starts as follows:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
After compressing and recreating the XML document, the declaration has lost its standalone attribute:
<?xml version="1.0" encoding="UTF-8"?>
Second, I have an XML document that is laid out in a human-readable format (i.e. with CR/LF line breaks and indentation). How do I get this whitespace compressed?
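To make the whitespace question concrete, here is a minimal sketch of one approach I could imagine: dropping whitespace-only text before it reaches the serializer. It uses only the JDK's SAX classes. The class name WhitespaceStrippingFilter and the helper textEvents are made up for illustration, and the assumption (not verified here) is that the SAXDocumentSerializer would be set as the filter's ContentHandler in place of the collecting handler. Note that this would also drop whitespace that matters in mixed content, such as the space between two inline elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: forwards all SAX events unchanged, except that
// text runs consisting only of whitespace (indentation, CR/LF) are dropped.
class WhitespaceStrippingFilter extends XMLFilterImpl {
    // Parsers may split text into several characters() calls, so buffer it
    // and decide at the next element boundary.
    private final StringBuilder pending = new StringBuilder();

    WhitespaceStrippingFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Whitespace the parser already knows to be ignorable is dropped.
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    // Forward the buffered text only if it contains something besides whitespace.
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            String text = pending.toString();
            pending.setLength(0);
            if (!text.trim().isEmpty()) {
                super.characters(text.toCharArray(), 0, text.length());
            }
        }
    }

    // Illustration helper: returns the text events that survive the filter.
    static List<String> textEvents(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        WhitespaceStrippingFilter filter = new WhitespaceStrippingFilter(reader);
        List<String> seen = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.add(new String(ch, start, length));
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textEvents("<a>\r\n  <b>hi</b>\r\n  <c> x y </c>\r\n</a>"));
    }
}
```

Leading and trailing spaces inside real text (such as " x y " above) are kept; only text runs that are entirely whitespace are removed.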
Third, the way I currently read the XML file strips the CR/LF characters. If I keep the CR/LF when I read in the file, I get an OutOfMemoryError: Java heap space. This error does NOT occur when the CR/LF are removed. The XML file is 2 MB and has almost 73,000 lines. Is there a way to correct this without increasing the heap size?
Fourth, would specifying the schema aid in the compression of the XML document? If so, what is the best way to do this?
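For the heap problem, one thing I could try is to stop reading the file into memory myself and instead hand the raw InputStream to the SAX parser, so that the line breaks stay in the stream and the document is never held as one large String. The following is only a sketch under that assumption, using JDK classes. StreamingXmlRead and ElementCounter are hypothetical names, the counting handler stands in for the SAXDocumentSerializer, and I have not confirmed that this is actually the cause of the OutOfMemoryError.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: pushes SAX events from a stream straight into a handler.
class StreamingXmlRead {

    // Parse directly from the stream; the parser reads it in chunks, so the
    // whole document is never materialized as a String or a DOM tree.
    static void stream(InputStream in, ContentHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(in));
    }

    // Stand-in handler for this sketch; it only counts start tags.
    static class ElementCounter extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    public static void main(String[] args) throws Exception {
        // Build a small indented document with CR/LF line endings as test input.
        StringBuilder xml = new StringBuilder("<rows>\r\n");
        for (int i = 0; i < 1000; i++) {
            xml.append("  <row>").append(i).append("</row>\r\n");
        }
        xml.append("</rows>\r\n");

        ElementCounter counter = new ElementCounter();
        byte[] bytes = xml.toString().getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            stream(in, counter);
        }
        System.out.println("elements seen: " + counter.elements);
    }
}
```

With a real file, the ByteArrayInputStream would be replaced by a buffered file stream, and the handler by whatever consumes the events.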