Posted by simongbrown
on March 3, 2004 at 2:18 PM PST
I've been having lots of "fun" over the past days trying to figure out how to get JSP pages to properly display international characters. I tried HTML meta tags and JSP page encodings and seemed to be getting nowhere. If I have understood all the reading that I've done, there are a couple of things that you should do to tell the web browser that you wish to display international (e.g. Japanese) characters:
- Specify the content type and character set from within your JSP.
<%@ page contentType="text/html; charset=UTF-8" %>
- Use an HTML meta tag as a hint to the browser (I don't think this is essential, but it all helps).
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
After trying this and seeing that it worked for most people, I was kind of confused to see that my pages still displayed international characters as junk. I checked and double-checked all my headers, flushed my browser caches and even tried it on different browsers (IE and Safari on Mac). Still no joy. In fact, looking at the character encoding of the page under IE revealed that the encoding was still Latin1.
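To see why a Latin-1 response produces junk, here is a minimal stand-alone sketch. It is not what the JSP container actually runs. It only assumes that text written in ISO-8859-1 loses every character that the charset cannot represent. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here comes from the servlet or JSP APIs.
final class Latin1ResponseDemo {

    // Encode the text with the given charset (as a response writer would),
    // then decode the bytes with the same charset (as a browser would).
    static String roundTrip(String text, Charset charset) {
        byte[] wire = text.getBytes(charset);
        return new String(wire, charset);
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e"; // three Japanese characters

        // ISO-8859-1 cannot represent these characters, so each one is
        // replaced with '?' when the text is encoded.
        System.out.println(roundTrip(japanese, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(japanese, StandardCharsets.UTF_8).equals(japanese));
    }
}
```

The exact junk a browser shows depends on where the mismatch happens, so treat this as one plausible failure mode rather than a diagnosis.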
After scanning around for anything else that looked remotely locale oriented, I realised that I was using the JSTL <fmt:setLocale> tag to set a default locale to be used within the <fmt:formatDate> tags. Changing the value of the locale passed through to this would change the actual character encoding of the web page. However, the characters still showed as junk, albeit different junk!
A quick scan through the JSTL specification for the tag revealed the answer (or at least what seems to be the answer).
"As a result of using this action, browser-based locale setting capabilities are disabled."
I downloaded the source code for the JSTL tags and confirmed that using this tag does in fact set the locale of the response, which appears to take precedence over the charset settings above. Commenting this tag out fixed all the problems. Except one ... now my dates were all formatted according to the default locale of the JVM, and the JSTL <fmt:formatDate> tag doesn't allow you to specify a locale purely for formatting purposes. Thankfully, you can set a default locale to be used in the formatting actions with the following code that uses the JSTL Config class:
Config.set(request, Config.FMT_LOCALE, someLocale);
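The point of that call is that the locale is only needed for formatting. As a rough stand-alone illustration (plain java.text.DateFormat, not the JSTL implementation, with hypothetical class and method names), the same date can be formatted for different locales without touching any response encoding:

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; it only shows locale-specific date formatting.
final class LocaleFormattingDemo {

    // A fixed time zone keeps the output independent of the machine running it.
    static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Format a date in the LONG style for the given locale.
    static String formatLong(Date date, Locale locale) {
        DateFormat format = DateFormat.getDateInstance(DateFormat.LONG, locale);
        format.setTimeZone(UTC);
        return format.format(date);
    }

    // 3 March 2004, midday UTC.
    static Date sampleDate() {
        Calendar calendar = new GregorianCalendar(UTC);
        calendar.clear();
        calendar.set(2004, Calendar.MARCH, 3, 12, 0, 0);
        return calendar.getTime();
    }

    public static void main(String[] args) {
        Date date = sampleDate();
        // The month name and field order follow the locale, not the JVM default.
        System.out.println(formatLong(date, Locale.US));
        System.out.println(formatLong(date, Locale.FRANCE));
    }
}
```

In a web application, someLocale in the Config.set call above would play the role of the locale argument here.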
Now there was just one last thing: submitting information via an HTML form. Most browsers don't appear to send back a charset in the request that corresponds to the encoding that was used to format the page. In this case, the request character encoding defaults to ISO-8859-1, meaning that there's potentially a mismatch between the form data being sent (in UTF-8) and the information retrieved from the request (decoded as ISO-8859-1) using the getParameter() method on the HttpServletRequest class. To fix this, all you need to do is explicitly set the character encoding of the request, by calling setCharacterEncoding("UTF-8") on it, before accessing any data.
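The mismatch can be reproduced outside a container. The sketch below is an approximation using only java.net.URLEncoder and URLDecoder (the Charset overloads need Java 10 or later). It assumes the browser percent-encodes the form value as UTF-8 and that the container decodes it with whatever the request character encoding is. The class and method names are hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; a real container does this work inside getParameter().
final class FormEncodingDemo {

    // Roughly what the browser does for a form on a UTF-8 page.
    static String browserEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Roughly what the container does with the request character encoding.
    static String containerDecode(String raw, Charset requestCharset) {
        return URLDecoder.decode(raw, requestCharset);
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c\u8a9e"; // three Japanese characters
        String raw = browserEncode(typed);

        // Default behaviour: each UTF-8 byte becomes its own Latin-1 character.
        String wrong = containerDecode(raw, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(typed)); // false

        // Equivalent of request.setCharacterEncoding("UTF-8") before getParameter().
        String right = containerDecode(raw, StandardCharsets.UTF_8);
        System.out.println(right.equals(typed)); // true
    }
}
```

In a servlet or filter, the corresponding fix is a single request.setCharacterEncoding("UTF-8") call made before the first getParameter() call, since the encoding cannot be changed once parameters have been read.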
Is this the total solution to displaying international characters in JSP? I hope so but I need to test this on other platforms and JSP containers. Hopefully I will read this blog entry next week and everything will still be correct.