Reading, Storing and Displaying DBCS

No replies
withnail73
Joined: 2012-02-06

Hi there. I'm having trouble with Korean, Chinese and Japanese text in a CSV file. I need to store the text in XML, then extract it from the XML and display it correctly in a Lotus Notes document.

The file (let's use the Korean one as the example) was generated from an Excel template and is encoded in Unicode: when I open it in Notepad++, the encoding is reported as UCS-2 Little Endian.
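For reference, here is a minimal sketch of how the raw bytes of such a file could be decoded in Java. It assumes the file starts with a byte-order mark (FF FE for little endian), which I have not verified for every file; the class name, method name and command-line path are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of the real project.
class CsvDecodeSketch {
    // Java's "UTF-16" decoder uses a leading byte-order mark to pick the
    // byte order and drops the mark from the decoded text.
    // ASSUMPTION: the CSV file begins with such a mark.
    private static final Charset UTF_16 = Charset.forName("UTF-16");

    // Decode the raw file bytes into a Java String.
    static String decodeCsvBytes(byte[] raw) {
        return new String(raw, UTF_16);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java CsvDecodeSketch <path-to-csv>");
            return;
        }
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        String text = decodeCsvBytes(raw);
        System.out.println(text.length() + " characters decoded");
    }
}
```

If the file turned out to have no byte-order mark, the explicit "UTF-16LE" charset name would be the one to try instead.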

For a separate project where I needed to pull Korean from an XLS file, I had to use this to read the text from the file and place it in the XML:

return new String(dbStr.getBytes("EUC-KR"), "ISO-8859-1");

And then this to pull from the XML and dump in Notes:

return new String(dbStr.getBytes("ISO-8859-1"), "EUC-KR");
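To make the two steps concrete, here is a minimal, self-contained sketch of that round trip. The class and method names are hypothetical, and dbStr is assumed to already be a correctly decoded Java String. As far as I can tell it works because EUC-KR encodes each Hangul syllable as two bytes in the range 0xA1 to 0xFE, so the intermediate string contains only printable Latin-1 characters, one per byte.

```java
import java.nio.charset.Charset;

// Hypothetical names; a sketch of the XLS workaround, not the real code.
class EucKrRoundTrip {
    private static final Charset EUC_KR = Charset.forName("EUC-KR");
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Step 1: encode as EUC-KR, then view each byte as one Latin-1 char.
    static String pack(String dbStr) {
        return new String(dbStr.getBytes(EUC_KR), LATIN1);
    }

    // Step 2: turn the Latin-1 chars back into bytes, decode as EUC-KR.
    static String unpack(String stored) {
        return new String(stored.getBytes(LATIN1), EUC_KR);
    }

    public static void main(String[] args) {
        String korean = "\uD55C\uAD6D\uC5B4"; // three Hangul syllables
        String stored = pack(korean);
        // Two Latin-1 chars per syllable are expected in the packed form.
        System.out.println("packed length: " + stored.length());
        System.out.println("round trip ok: " + unpack(stored).equals(korean));
    }
}
```

Note that any character EUC-KR cannot represent is replaced during getBytes, so this trick is only lossless for text that fits in that charset.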

Despite being told that I had to use UTF-8 instead of ISO-8859-1, this worked, and the text displayed correctly both here in the UK and in Singapore. Alas, the same code does not work for the CSV file.

The closest I can get is by using:

return new String(dbStr.getBytes("UTF-16"), "ISO-8859-1");

But when I convert back, only the last 4 characters are correct. Any advice would be much appreciated.
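In case it helps anyone reproduce this, here is a small sketch for inspecting what the UTF-16 variant produces. The names are hypothetical. In memory the round trip appears to be lossless, but getBytes("UTF-16") emits a byte-order mark plus two bytes per character, and some of those bytes (0x00, for example) become characters that the XML 1.0 Char production does not allow. That might be where the text is damaged once it goes through the XML, though I have not confirmed this is the actual cause.

```java
import java.nio.charset.Charset;

// Hypothetical names; a diagnostic sketch, not a fix.
class Utf16PackCheck {
    private static final Charset UTF_16 = Charset.forName("UTF-16");
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    static String pack(String dbStr) {
        return new String(dbStr.getBytes(UTF_16), LATIN1);
    }

    static String unpack(String stored) {
        return new String(stored.getBytes(LATIN1), UTF_16);
    }

    // Char production from the XML 1.0 specification (BMP part only).
    static boolean isXml10Char(char c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD);
    }

    // Count the chars in s that an XML 1.0 document cannot contain.
    static int countXmlIllegal(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (!isXml10Char(s.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "A\uD55C"; // one ASCII letter, one Hangul syllable
        String stored = pack(sample);
        System.out.println("in-memory round trip ok: " + unpack(stored).equals(sample));
        System.out.println("chars not allowed in XML 1.0: " + countXmlIllegal(stored));
    }
}
```

Running countXmlIllegal over the packed form of the real CSV text should show whether any characters are at risk before the string is written to the XML.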

Thanks.