SiteMesh defaults to the ISO-8859-1 character encoding and assumes that the underlying servlet container is configured for ISO-8859-1 as well.
ISO-8859-1 is intended for Western European languages. Wikipedia has a list of the languages it covers completely.
To support an international audience, switch to UTF-8, which can encode every Unicode character.
Using UTF-8 with SiteMesh requires adjustments to the following layers:
- Source files and workspace
- HTML
- JSP
- SiteMesh
It may also require changes to the following:
- Servlet Container
- Server Operating System
- Server Database if one is being used
As you follow this tutorial, it will be evident whether UTF-8 is working, so we stop after each layer to check the result of the change. The remaining layers may already default to UTF-8, in which case no further changes are needed.
Create a UTF-8 Test Page
The first step is to create a page that is saved as UTF-8 and contains characters outside ISO-8859-1, then check whether it displays correctly.
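One way to produce such a page without depending on editor settings is to write it from a small program. The following is a minimal sketch, not part of SiteMesh: the class name Utf8TestPage, the file name utf8-test.html, and the sample text are all assumptions chosen for illustration. The page deliberately has no charset declaration yet, so it shows what the browser does by default.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: builds a small test page and saves it as UTF-8.
final class Utf8TestPage {

    // Sample text written with Unicode escapes so that this source file
    // compiles the same way whatever encoding it is saved in.
    // It contains Greek, Cyrillic, and Japanese characters, none of which
    // exist in ISO-8859-1.
    static final String SAMPLE =
            "Greek: \u03b1\u03b2\u03b3, Cyrillic: \u0416\u0434\u0443, "
            + "Japanese: \u65e5\u672c\u8a9e";

    // Builds the page markup. No meta tag is included on purpose:
    // the later steps of the tutorial add the charset declarations.
    static String buildPage() {
        return "<html>\n"
             + "<head>\n"
             + "<title>UTF-8 Test</title>\n"
             + "</head>\n"
             + "<body>\n"
             + "<p>" + SAMPLE + "</p>\n"
             + "</body>\n"
             + "</html>\n";
    }

    public static void main(String[] args) throws IOException {
        // Encode explicitly as UTF-8 instead of relying on the platform default.
        byte[] bytes = buildPage().getBytes(StandardCharsets.UTF_8);
        Files.write(Paths.get("utf8-test.html"), bytes); // assumed file name
    }
}
```

Copy the generated file into the web application and request it through SiteMesh to see how the characters are rendered.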
Verify the Encoding
Load the page in Firefox 2.x or later and choose View, Character Encoding. The dot in the menu marks the encoding that Firefox is using.
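It also helps to know what a failure looks like. When UTF-8 bytes are interpreted as ISO-8859-1, each multi-byte character turns into two or more unrelated characters. The sketch below reproduces that effect; the class and method names are hypothetical and used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of the garbling seen when UTF-8 is read as ISO-8859-1.
final class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // mimicking a browser or container that assumes the wrong charset.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 is "e" with an acute accent; in UTF-8 it is the bytes C3 A9.
        String garbled = misread("\u00e9");
        // Print code points rather than glyphs so the result does not
        // depend on the console encoding.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

A single accented character becoming two characters, the first of them an accented capital A (U+00C3), is the typical sign that a UTF-8 page is being displayed as ISO-8859-1.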
Adjust for SiteMesh
HTML Code
Tell the browser which character set the page content uses. This is good general practice in any case.
Do so with a meta tag in the HEAD element of the HTML page, for example <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">.
Test to see whether this resolved the issue. If not, continue to the next step.
JSP Header
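Set the JSP response header to UTF-8, for example with a page directive at the top of the JSP: <%@ page contentType="text/html; charset=UTF-8" pageEncoding="UTF-8" %>.
Test again. This often resolves the issue.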
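Besides looking at the page in a browser, the response header itself can be inspected. The following is a rough sketch that reads the Content-Type header of a page and reports its charset parameter. The class name, the helper charsetOf, and the address http://localhost:8080/test.jsp are assumptions; pass the real address of your test page as the first argument.

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical checker for the charset announced in a Content-Type header.
final class ContentTypeCheck {

    // Returns the charset parameter of a Content-Type value,
    // or null if the header is missing or has no charset parameter.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Assumed default address; replace with the page under test.
        String address = args.length > 0 ? args[0] : "http://localhost:8080/test.jsp";
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        try {
            System.out.println("charset: " + charsetOf(conn.getContentType()));
        } finally {
            conn.disconnect();
        }
    }
}
```

If the reported charset is still ISO-8859-1, or no charset is reported at all, the response header has not been changed and a later layer still needs adjusting.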
Use a Custom Servlet Filter
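If the previous steps are not enough, for example for static HTML files that never pass through a JSP, a servlet filter can force the encoding on every request. Such a filter would typically call request.setCharacterEncoding("UTF-8") before any parameters are read and set the response content type before passing the request down the chain. The sketch below covers only the content-type decision such a filter might make, so that it runs without the servlet library. The class Utf8ContentType and the method withUtf8Charset are hypothetical names, and limiting the change to text/* types is an assumption, not SiteMesh behaviour.

```java
import java.util.Locale;

// Hypothetical helper a custom servlet filter could use to decide
// which Content-Type value to send.
final class Utf8ContentType {

    // For text/* media types, replaces any existing charset parameter
    // with UTF-8 (or appends one). Other types are returned unchanged,
    // because a charset is meaningless for images and other binary data.
    static String withUtf8Charset(String contentType) {
        if (contentType == null) {
            return null;
        }
        // Limit -1 keeps trailing empty parts, so parts[0] always exists.
        String[] parts = contentType.split(";", -1);
        String mediaType = parts[0].trim();
        if (!mediaType.toLowerCase(Locale.ROOT).startsWith("text/")) {
            return contentType;
        }
        StringBuilder result = new StringBuilder(mediaType);
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            // Drop empty parts and any previous charset declaration.
            if (p.isEmpty() || p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                continue;
            }
            result.append("; ").append(p);
        }
        return result.append("; charset=UTF-8").toString();
    }

    public static void main(String[] args) {
        System.out.println(withUtf8Charset("text/html; charset=ISO-8859-1"));
        System.out.println(withUtf8Charset("image/png"));
    }
}
```

In a real filter the returned value would be passed to response.setContentType, and the filter would need to be mapped ahead of the SiteMesh filter in web.xml so that it runs first.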
References
Servlet Filter patch to SiteMesh for html and non-server-side files - http://blog.sidu.in/2007/05/tomcat-and-utf-8-encoded-uri-parameters.html