Localization

Localization is a pervasive concern in web application development. Supporting users from different countries, with different languages, can be a tricky proposition: more than just text must be localized, including subtler aspects of the application such as date and currency formats. In some cases, a localized application will also want to change images or even color schemes.

Localization support in Tapestry is likewise pervasive.

Component Message Catalogs

The most fundamental aspect of localization in Tapestry is the component message catalog (remember that pages are components too). A message catalog is a mapping from a logical key (that may appear in Java code or in OGNL expressions) to a literal string. Tapestry message catalogs are similar to Java's ResourceBundle class, except that there is more flexibility in the character set of the files and in the location of the files.

Each component may have a message catalog, consisting of a set of localized message properties files.

These files are stored with the page or component specification file. They are named the same as the specification file, but with a different extension (".properties" instead of ".jwc" or ".page").

Because a catalog is a set of files, a locale string may be inserted just before the extension. For example, WEB-INF/Home_fr.properties would contain the French language localization of the keys.

As with Java's ResourceBundle, resolution of a key to a message starts with the most specific properties file. Any key not found there will be searched for in less specific files. For example, the search path could be Home_fr_BE.properties, Home_fr.properties, Home.properties.
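The search order described above can be sketched in plain Java. This is only an illustration: the class and method names (CatalogSearchPath, candidates) are hypothetical and are not part of Tapestry, and the real implementation may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT a Tapestry class: lists candidate catalog file
// names from the most specific to the least specific.
final class CatalogSearchPath {

    static List<String> candidates(String baseName, Locale locale) {
        List<String> names = new ArrayList<>();
        String language = locale.getLanguage();
        String country = locale.getCountry();

        // Most specific first: language plus country, e.g. Home_fr_BE.properties
        if (!language.isEmpty() && !country.isEmpty()) {
            names.add(baseName + "_" + language + "_" + country + ".properties");
        }
        // Then language only, e.g. Home_fr.properties
        if (!language.isEmpty()) {
            names.add(baseName + "_" + language + ".properties");
        }
        // Finally the base file, e.g. Home.properties
        names.add(baseName + ".properties");
        return names;
    }

    public static void main(String[] args) {
        // prints [Home_fr_BE.properties, Home_fr.properties, Home.properties]
        System.out.println(candidates("Home", Locale.forLanguageTag("fr-BE")));
    }
}
```

A key is looked up in each file in turn, and the first file that defines it wins; files that do not exist are simply skipped.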

It is perfectly fine for a properties file not to exist; the search simply continues.

When a key cannot be found even in the most general properties file, a search occurs in the namespace message catalog. In this way, very common strings can be stored and localized once, and used throughout a library or application.

We'll describe how to use the message catalog shortly, but first some notes on how the message catalogs are read.

Properties file encoding

For Java's ResourceBundle, the properties files must be in the ISO-8859-1 character set. This can be problematic, as for non-western languages it is necessary to use Java's native2ascii tool to convert files written in a native encoding into ASCII with Unicode escapes.

Tapestry can read properties files in alternate character sets, but must be told what character set the file is encoded in (internally, the contents must be converted into standard multi-byte Unicode).
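To illustrate why the character set must be known up front, here is a minimal sketch that uses only the JDK (Properties.load(Reader) and InputStreamReader). It does not show Tapestry's internal code; the class name EncodedProperties is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper, NOT a Tapestry class: reads a properties stream
// using an explicitly named character set.
final class EncodedProperties {

    static Properties load(InputStream in, Charset charset) {
        Properties props = new Properties();
        // The Reader decodes bytes to Unicode characters before parsing.
        try (Reader reader = new InputStreamReader(in, charset)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // A properties file containing non-ASCII text, encoded as UTF-8.
        byte[] raw = "greeting=Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8);
        Properties props = load(new ByteArrayInputStream(raw), StandardCharsets.UTF_8);
        System.out.println(props.getProperty("greeting"));
    }
}
```

Decoding the same bytes with the wrong character set would still succeed, but would silently produce garbled values, which is why the encoding has to be declared rather than guessed.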

This is accomplished by providing some metadata inside the component (or page) specification. Metadata is specified using the <meta> element.

The resolution of the character set is somewhat complicated; it is possible that each properties file will use a different character set. At the same time, repetition is bad ... therefore it is possible to specify some of this information in the namespace metadata (in the containing application or library specification) so that it can apply to all pages and components within the namespace.

The basic metadata property name searched for is org.apache.tapestry.messages-encoding. The value for this name is the name of the charset for the properties file.

However, the base name is modified to reflect the locale for the file being read; the locale string is appended to the key, thus org.apache.tapestry.messages-encoding_fr will define the character set for the file WEB-INF/Home_fr.properties.

For each localization of the base property name, a search of the following locations takes place.

  • The page or component specification.
  • The namespace (library or application) specification for the namespace containing the page or component.
  • The global property source.

Because localization of templates is similar to localization of message properties files, a second search occurs if the search for (variations of) org.apache.tapestry.messages-encoding fails; this time the search is for org.apache.tapestry.template-encoding (again, with variations for each locale). The ultimate default for encoding character set is ISO-8859-1; in other words, the same behavior as reading an ordinary Java ResourceBundle.
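The lookup rules above can be summarized in a short sketch. It models each location (specification, namespace, global property source) as a plain Map, which is an assumption made purely for illustration; the class EncodingLookup is hypothetical, and the order in which locale variations are tried (most specific first) is likewise an assumption rather than something stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the encoding search, NOT Tapestry's implementation.
final class EncodingLookup {
    static final String MESSAGES_KEY = "org.apache.tapestry.messages-encoding";
    static final String TEMPLATE_KEY = "org.apache.tapestry.template-encoding";
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // For suffix "fr_BE": key_fr_BE, key_fr, key (assumed order).
    static List<String> keyVariations(String baseKey, String localeSuffix) {
        List<String> keys = new ArrayList<>();
        String suffix = localeSuffix;
        while (!suffix.isEmpty()) {
            keys.add(baseKey + "_" + suffix);
            int cut = suffix.lastIndexOf('_');
            suffix = cut < 0 ? "" : suffix.substring(0, cut);
        }
        keys.add(baseKey);
        return keys;
    }

    // sources are ordered: specification, namespace, global property source.
    static String resolve(String localeSuffix, List<Map<String, String>> sources) {
        for (String baseKey : Arrays.asList(MESSAGES_KEY, TEMPLATE_KEY)) {
            for (String key : keyVariations(baseKey, localeSuffix)) {
                for (Map<String, String> source : sources) {
                    String value = source.get(key);
                    if (value != null) {
                        return value;
                    }
                }
            }
        }
        return DEFAULT_ENCODING; // nothing declared anywhere
    }

    public static void main(String[] args) {
        Map<String, String> specification = new HashMap<>();
        Map<String, String> namespace = new HashMap<>();
        Map<String, String> global = new HashMap<>();
        namespace.put(MESSAGES_KEY + "_fr", "UTF-8");

        List<Map<String, String>> sources = Arrays.asList(specification, namespace, global);
        System.out.println(resolve("fr", sources)); // prints UTF-8
        System.out.println(resolve("de", sources)); // prints ISO-8859-1
    }
}
```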

Missing keys

While developing, you may occasionally reference a key that does not exist. Rather than fail with an exception, Tapestry will fabricate a value for the missing key: the key itself, converted to upper case and surrounded with brackets, for example [A-MISSING-KEY]. This allows missing key values to stand out and demand to be fixed, without completely subverting your application.
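The fallback just described amounts to very little code. The following sketch shows the idea with a plain Map standing in for a message catalog; MissingKeys and lookup are hypothetical names, not Tapestry API.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the missing-key fallback, NOT Tapestry code.
final class MissingKeys {

    static String lookup(Map<String, String> catalog, String key) {
        String value = catalog.get(key);
        if (value != null) {
            return value;
        }
        // Fabricate a conspicuous placeholder instead of throwing.
        return "[" + key.toUpperCase(Locale.ROOT) + "]";
    }

    public static void main(String[] args) {
        Map<String, String> catalog = new HashMap<>();
        catalog.put("page-title", "Welcome");
        System.out.println(lookup(catalog, "page-title"));    // prints Welcome
        System.out.println(lookup(catalog, "a-missing-key")); // prints [A-MISSING-KEY]
    }
}
```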

Namespace message catalogs

It is very likely that you'll have a number of strings that are used, and re-used, throughout your application. Rather than duplicate the same message keys and localized values in all your page and component message catalogs, you can put these into your namespace catalog.

Each page and component is part of a namespace, identified by a library specification or application specification.

The specification may also have a message catalog; for instance, for WEB-INF/myapp.application, the files would be named WEB-INF/myapp.properties, etc. Again, the name of the file is based on the servlet name ("myapp").

Very simple applications may not have an application specification, but may still have properties, just as if the application specification existed.

Template text localization

As described in the discussion of Tapestry templates, static text in an HTML template can be enclosed in a specialized <span> tag.

Localized templates

In some cases, the entire layout of a page (or component) must change with the locale, for example because of differences between languages that read left to right and languages that read right to left.

In this case, it is possible to have multiple HTML templates. If a localized template (e.g., Home_ja.html for a Japanese locale) exists, it will be used as appropriate.

Page and component specifications are never localized, just templates.

It is a good idea to make use of declared components, rather than implicit components, when using localized templates ... it reduces duplication in the templates.

Template encoding

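Like message catalogs, each template may be written in a different character set.

For each localization of the base key (org.apache.tapestry.template-encoding), a search takes place in the same locations as for message catalog encodings: the page or component specification, the namespace specification, and the global property source.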

Using the message: binding prefix

When specifying a parameter binding, the message: prefix is used to reference a localized message key. For example:
<html jwcid="@Shell" title="message:page-title">
 . . .
</html>

Localization of Assets

Assets may also be localized. Classpath and context assets will automatically search for a locale-specific match (this is very similar to how localized templates work).

Formatting messages

Messages may contain arguments, strings of the form {0} (or some other number). The arguments are handled exactly the same as with Java's MessageFormat class (in fact, under the covers, MessageFormat does the work).

Components include a messages property for accessing localized messages. This property is of type Messages, and includes two methods:

  • getMessage() takes a string parameter and returns a localized message
  • format() takes a string parameter (the key) and then takes a number of additional parameters as arguments. The arguments are just objects. If you have more than three arguments, then specify them as an object array.
It is common to format messages using OGNL expressions, for example:
<span jwcid="@Insert" value="ognl:messages.format('billing-info', amountDue)"/>
The above example would get the amountDue property and pass it in as argument 0 to the message format retrieved from the message catalog as key 'billing-info'.
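Since MessageFormat does the underlying work, the substitution can be tried out with the JDK alone. In this sketch the pattern text is invented for illustration (it stands in for whatever the 'billing-info' key maps to in your catalog), and FormatDemo is a hypothetical class name.

```java
import java.text.MessageFormat;

// Stand-alone illustration of MessageFormat argument substitution.
final class FormatDemo {

    // Mirrors the shape of a key-plus-arguments call, but takes the
    // pattern directly instead of looking it up in a catalog.
    static String format(String pattern, Object... arguments) {
        return MessageFormat.format(pattern, arguments);
    }

    public static void main(String[] args) {
        // Hypothetical catalog entry: billing-info=Amount due for {0}: {1}
        String pattern = "Amount due for {0}: {1}";
        // prints Amount due for March: $42.50
        System.out.println(format(pattern, "March", "$42.50"));
    }
}
```

String arguments are inserted as-is. Numbers and dates passed as arguments are formatted by MessageFormat according to its locale, and a literal single quote in a pattern must be doubled, because MessageFormat treats the single quote as a quoting character.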

Changing the locale

In order to change the locale, you must obtain the IEngine and invoke setLocale() on it. This will change the value stored in the engine (which is used when loading new pages), and:

  • Update the hivemind.ThreadLocale service, allowing localized messages from services to be generated in the correct locale
  • Cause an HTTP Cookie to be added to the response so that future requests from the same client will be in the same locale
Changing the locale does not affect any pages loaded in the current request.

Engine locale vs. page locale

When pages are created, or obtained from the page pool, the engine's locale is taken into account. Pages are obtained when they are used by a service, or when accessed via IRequestCycle.getPage().

A page is loaded for a particular locale, and the page's locale never changes. This is because of the degree to which localization creeps into the properties of the page and the components within the page.

Additionally, once a page is loaded during a request cycle, it is kept for the duration of the cycle ... even if the engine locale changes.

If you have a listener method on a page that changes the engine's locale, it is necessary to activate a different page to render the response. This new page will be loaded in the new locale.

Note:

This may be addressed somewhat in Tapestry 4.0. Two options are possible: a service for changing the locale before rendering a page, and a way to force Tapestry to re-load a page, in a new locale.

Limiting accepted locales

By default, Tapestry accepts incoming locales (as specified in the request HTTP header) and uses them as-is. This has some implications, primarily in terms of resource usage.

Imagine an application that is being accessed by users in the US, the UK and in Canada. The incoming request locales will be "en_US", "en_GB" and "en_CA" (respectively). However, it is likely that you will only have created a single localization, for English in general (locale "en"). Despite this, there will be several different versions of each page in the page pool: one for each of the above locales, even though they will be functionally identical.

Ideally, what we want is to limit incoming requests so that all of the listed locales ("en_US", "en_GB" and "en_CA") will be 'filtered down' to just "en".

That functionality is controlled by the org.apache.tapestry.accepted-locales configuration property. By setting this property to a comma-separated list of locale names, incoming requests will be converted to the closest match. For example, the property could be configured to "en,fr,de" to support English, French and German.

Matching takes place by stripping off "terms" (the locale variant, then the locale country code) from the locale name. So "en_US" would be stripped to "en" (which would match). When no match can be found, the first locale in the list is treated as the default. In the prior example, Russian users would be matched to the "en" locale.
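The matching rule can be sketched as follows. This is a model of the behavior described above, written against plain strings; AcceptedLocales and filter are hypothetical names and the code is not taken from Tapestry.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of accepted-locale filtering, NOT Tapestry's code.
final class AcceptedLocales {

    // Parses a property value such as "en,fr,de" into a list of names.
    static List<String> parse(String propertyValue) {
        List<String> accepted = new ArrayList<>();
        for (String name : propertyValue.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                accepted.add(trimmed);
            }
        }
        return accepted;
    }

    // Strips trailing terms (variant, then country) until a match is found;
    // falls back to the first accepted locale when nothing matches.
    static String filter(String requested, List<String> accepted) {
        String candidate = requested;
        while (!candidate.isEmpty()) {
            if (accepted.contains(candidate)) {
                return candidate;
            }
            int cut = candidate.lastIndexOf('_');
            candidate = cut < 0 ? "" : candidate.substring(0, cut);
        }
        return accepted.get(0);
    }

    public static void main(String[] args) {
        List<String> accepted = parse("en,fr,de");
        System.out.println(filter("en_US", accepted)); // prints en
        System.out.println(filter("fr", accepted));    // prints fr
        System.out.println(filter("ru_RU", accepted)); // prints en
    }
}
```

With this filtering in place, requests for "en_US" and "en_CA" both resolve to "en", so only one copy of each page needs to live in the page pool.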