The default character encoding for SiteMesh is ISO-8859-1. SiteMesh also assumes that the underlying servlet container is configured for ISO-8859-1.

ISO-8859-1 is intended mainly for Western European languages. Wikipedia maintains a list of the languages it covers completely.

To support an international market, you must switch to UTF-8, the newer standard.
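The coverage gap can be checked directly from Java. The following is a minimal sketch using only the standard library; the class name `CharsetCoverage` and the sample strings are assumptions made for illustration and are not part of SiteMesh.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCoverage {

    /** Returns true if every character of the text can be encoded in the charset. */
    public static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String french = "caf\u00e9";             // Western European sample text
        String japanese = "\u65e5\u672c\u8a9e";  // sample text outside ISO-8859-1

        System.out.println("ISO-8859-1, French:   " + canEncode(StandardCharsets.ISO_8859_1, french));
        System.out.println("ISO-8859-1, Japanese: " + canEncode(StandardCharsets.ISO_8859_1, japanese));
        System.out.println("UTF-8, French:        " + canEncode(StandardCharsets.UTF_8, french));
        System.out.println("UTF-8, Japanese:      " + canEncode(StandardCharsets.UTF_8, japanese));
    }
}
```

Only the ISO-8859-1 check on the Japanese sample reports `false`; UTF-8 can encode both strings.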

Note

UTF-8 adoption has been an ongoing transition. Case in point: many operating systems, databases and, of course, servlet containers still default to ISO-8859-1 even though they fully support UTF-8.

Using UTF-8 with SiteMesh requires adjustments to the following layers:

  1. Files and Workspace
  2. HTML
  3. JSP
  4. SiteMesh

It may also require changes to the following:

  1. Servlet Container
  2. Server Operating System
  3. Server Database if one is being used

As you follow this tutorial, it will be evident whether UTF-8 is working, so we stop at each layer to check the results of our changes. The other layers may already default to UTF-8.

Create a UTF-8 Test Page

The first step is to create a UTF-8 page and check whether there are any issues.
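One way to produce such a page without depending on an editor's encoding settings is to write it from Java. This is a minimal sketch, not the tutorial's own test page: the class name `Utf8TestPage`, the file name `utf8-test.html` and the sample text are all assumptions. The page deliberately omits any charset declaration so that encoding problems in the other layers stay visible.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Utf8TestPage {

    // Sample text with 2-byte and 3-byte UTF-8 characters (accented letter, euro sign, CJK).
    static final String SAMPLE = "caf\u00e9 \u20ac \u65e5\u672c\u8a9e";

    /** Builds a minimal HTML page containing the sample text. */
    public static String html() {
        return "<html>\n<head><title>UTF-8 Test</title></head>\n"
             + "<body><p>" + SAMPLE + "</p></body>\n</html>\n";
    }

    /** Returns the page encoded explicitly as UTF-8, independent of the platform default. */
    public static byte[] pageBytes() {
        return html().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file name; place the file wherever your web application serves static pages.
        Path page = Files.write(Paths.get("utf8-test.html"), pageBytes());
        System.out.println("Wrote " + page.toAbsolutePath());
    }
}
```

If the page is served and the sample text appears garbled in the browser, at least one layer is still not using UTF-8.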

Warning

This page is not yet complete and is still being written.

Verify the Encoding

Load the page in Firefox 2.x or higher and click View, Character Encoding. The dot in the menu indicates which encoding Firefox is using.
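It helps to know what a wrong encoding looks like on screen. The sketch below reproduces the typical symptom, UTF-8 bytes decoded as ISO-8859-1, using only the standard library; the class name `MojibakeDemo` and the sample word are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1. */
    public static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is the two bytes C3 A9 in UTF-8.
        // Read as ISO-8859-1, each byte becomes its own character: U+00C3 and U+00A9.
        String garbled = misread("caf\u00e9");
        System.out.println(garbled.equals("caf\u00c3\u00a9")); // prints "true"
    }
}
```

If accented characters on the test page appear as two-character sequences like this, the page bytes are UTF-8 but are being interpreted as ISO-8859-1.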

Adjust for SiteMesh

HTML Code

Tell the browser which character set the page contents use. You should do this anyway as a general practice.

This is done by specifying a meta tag in the HEAD element of the HTML page:

Code Block
language: html/xml
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>

Test to see whether this resolved the issue. If not, continue to the next step.

JSP Header

Set the JSP response header to UTF-8:

Code Block
language: java
<%@ page language="java" contentType="text/html; charset=UTF-8"
	pageEncoding="UTF-8"%>

Test again. This often resolves the issue.

Use a Custom Servlet Filter

References

Servlet Filter patch to SiteMesh for html and non-server-side files - http://blog.sidu.in/2007/05/tomcat-and-utf-8-encoded-uri-parameters.html