Mastering ICU Message Formats: Best Practices and Pitfalls to Avoid


In today’s globalized world, creating software applications localized for different regions and languages is essential. The benefits are immense; developers can cater to the unique needs of diverse user bases, and their applications can penetrate new markets and foster cultural sensitivity. Furthermore, localizing software ensures compliance with regional regulations, thus contributing to a successful, inclusive, and legally compliant product. 

So, what’s the best way to achieve this? 

Enter the ICU message format, a widely used format for Unicode strings that helps developers solve some difficult i18n challenges. With it, developers can support accurate translations and make sure that software content adapts to cultural nuances, which makes software localization practical.

This article delves into resource files and their role in customizing software applications while introducing the ICU format. We will also explore the importance of application variables and share best practices for using ICU messaging. Learn how to create accessible, culturally-sensitive applications that transcend language barriers and unite the digital world. 

Resource Files and Localization 

Resource files contain the source or translated versions of user interface text, labels, messages, and other language-specific resources used in software applications. Developers can use these files to customize the interface of an application and make it suitable for use in different regions or countries with different languages and cultures. The best way to achieve this is to use standard formats for resource files, such as properties files for Java, resx files for C#, or JSON files for JavaScript. 

Typically, each resource file contains a list of key-value pairs that associate a unique identifier (key) with the corresponding translation (value) for that identifier in the target language. The identifier could be a label on a button or a menu item, a message displayed to the user, or any other text element in the application’s user interface. 

Here’s an example of a simple key-value pair in a properties file in English:

LOC_INTRO=Localyzer automates the translation process

And the corresponding one in French: 

LOC_INTRO=Localyzer permet l’automatisation du processus de traduction 
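
As a minimal sketch of how an application might consume such files, the Python snippet below parses the two entries above and looks up a key, falling back to English when a locale has no translation. The names `RESOURCES`, `parse_properties`, and `lookup` are hypothetical, and the parser ignores escapes and line continuations that real properties files allow; a real project would normally rely on its framework's resource loader.

```python
# Hypothetical in-memory stand-ins for per-locale .properties files.
RESOURCES = {
    "en": "LOC_INTRO=Localyzer automates the translation process\n",
    "fr": "LOC_INTRO=Localyzer permet l’automatisation du processus de traduction\n",
}


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def lookup(key: str, locale: str, default_locale: str = "en") -> str:
    """Return the translation for key, falling back to the default locale."""
    for candidate in (locale, default_locale):
        entries = parse_properties(RESOURCES.get(candidate, ""))
        if key in entries:
            return entries[key]
    return key  # Last resort: show the key itself.


print(lookup("LOC_INTRO", "fr"))
print(lookup("LOC_INTRO", "de"))  # No German entry here, so English is used.
```

The key stays the same across locales; only the value changes, which is what lets the application code remain language-neutral.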

Variables and ICU Message Format 

Adding application variables into localized text is essential for software. By dynamically inserting values into the text elements, developers can get applications to effectively display context-specific information while adhering to language-specific formatting and translations. This approach ensures a seamless user experience across diverse regions and languages. 

Here is a simple example in English: 

LOCALYZER_READY_PROJECT_NAME=Your Localyzer Project ${LzProjectName} is ready

And the corresponding one in Turkish: 

LOCALYZER_READY_PROJECT_NAME=${LzProjectName} adlı Localyzer projeniz hazırdır

The variable ${LzProjectName} gets replaced with the appropriate value during runtime.
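
The substitution step can be sketched with Python's standard `string.Template`, which happens to use the same `${name}` placeholder syntax. This is only an illustration: the `MESSAGES` table, the `render` helper, and the project name "Apollo" are assumptions, not part of any particular product.

```python
from string import Template

# The two messages from the examples above, keyed by locale.
MESSAGES = {
    "en": "Your Localyzer Project ${LzProjectName} is ready",
    "tr": "${LzProjectName} adlı Localyzer projeniz hazırdır",
}


def render(locale: str, **variables: str) -> str:
    """Substitute ${name} placeholders in the message for the given locale."""
    return Template(MESSAGES[locale]).substitute(**variables)


print(render("en", LzProjectName="Apollo"))  # Your Localyzer Project Apollo is ready
print(render("tr", LzProjectName="Apollo"))  # Apollo adlı Localyzer projeniz hazırdır
```

Note that the variable sits near the end of the English sentence but at the start of the Turkish one. Named placeholders let each translation choose its own word order, which string concatenation in code cannot do.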

Next, let’s explore another aspect of localization – pluralization. 

The ICU message format plays a significant role in handling plurals in different languages, because pluralization rules vary from language to language. For instance, English has a singular form for ‘one’ item and a plural form for all ‘other’ quantities, including zero. In contrast, Arabic employs ‘zero,’ ‘one,’ ‘two,’ ‘few,’ ‘many,’ and ‘other’ rules, while Ukrainian uses ‘one,’ ‘few,’ ‘many,’ and ‘other.’
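
To make these categories concrete, here is a small sketch of the CLDR cardinal plural rules for the three languages mentioned, restricted to non-negative integers. The function name `plural_category` is hypothetical; production code would take these rules from an i18n library rather than hand-coding them. Note that CLDR also defines a ‘two’ category for Arabic.

```python
def plural_category(locale: str, n: int) -> str:
    """Return the CLDR cardinal plural category for a non-negative integer."""
    if locale == "en":
        return "one" if n == 1 else "other"
    if locale == "uk":
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"  # Integers only; fractions would map to "other".
    if locale == "ar":
        if n in (0, 1, 2):
            return ("zero", "one", "two")[n]
        if 3 <= n % 100 <= 10:
            return "few"
        if 11 <= n % 100 <= 99:
            return "many"
        return "other"
    raise ValueError(f"no rule for locale {locale!r}")


# Show how the same number lands in different categories per language.
for n in (0, 1, 2, 5, 11, 21, 100):
    print(n, [plural_category(loc, n) for loc in ("en", "uk", "ar")])
```

For example, 21 is ‘other’ in English, ‘one’ in Ukrainian, and ‘many’ in Arabic, which is why the choice of form cannot be hard-coded in the application.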

Here is an example of an ICU Message Format in English: 

results.count={count, plural, one {# result} other {# results}} 

Here, count is a variable in the application. If count is 1, the application displays “1 result.” If count is 17, it displays “17 results.”

And the corresponding one in Ukrainian: 

results.count={count, plural, one {# результат} few {# результати} many {# результатів} other {# результату}} 

This approach lets the application display the correct plural form while allowing translators to supply the plural categories their language needs, such as ‘few’ and ‘many.’ Correct pluralization supports both accurate communication and cultural sensitivity.
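
The selection step can be sketched as follows. This is a deliberately simplified illustration, not the ICU implementation: it handles only a message that consists of a single flat plural argument, uses integer-only rules for English and Ukrainian, and inserts the number without locale-aware formatting. The names `MESSAGES`, `category`, and `format_plural` are hypothetical; a real application would use an ICU library.

```python
import re

# The two messages from the examples above, keyed by locale.
MESSAGES = {
    "en": "{count, plural, one {# result} other {# results}}",
    "uk": "{count, plural, one {# результат} few {# результати} "
          "many {# результатів} other {# результату}}",
}


def category(locale: str, n: int) -> str:
    """Integer-only plural rules for the two locales in this sketch."""
    if locale == "en":
        return "one" if n == 1 else "other"
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "many"


def format_plural(locale: str, n: int) -> str:
    """Pick the branch matching n and replace '#' with the number."""
    message = MESSAGES[locale].strip()
    match = re.fullmatch(r"\{\s*\w+\s*,\s*plural\s*,(.*)\}", message, re.S)
    if match is None:
        raise ValueError("not a flat plural message")
    # Collect each "keyword {text}" pair into a dictionary.
    branches = dict(re.findall(r"(\w+)\s*\{([^{}]*)\}", match.group(1)))
    # 'other' is the fallback when the message omits a category.
    text = branches.get(category(locale, n), branches["other"])
    return text.replace("#", str(n))


print(format_plural("en", 1))   # 1 result
print(format_plural("en", 17))  # 17 results
print(format_plural("uk", 21))  # 21 результат
```

The application passes only the number; which branch is used depends entirely on the translated resource string and the language's rules.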

⚠ Caution 

The ICU message format offers several use cases, including: 

  • Pluralization 
  • Gender-specific messages 
  • Text replacement 
  • Complex language constructs 
  • Nesting 
  • Number, date, and time formatting

However, developers must focus primarily on using the ICU message format for pluralization. While other use cases mentioned are possible, handling these complexities directly in the code helps maintain a clear separation between the responsibilities of internationalization (i18n) and localization (l10n). 

By keeping i18n on the development side and avoiding burdening the localization side (translation) with these complexities, a more streamlined and efficient process can be achieved, resulting in better-quality translations and a smoother user experience. 

Recommendations and Best Practices 

Many developers like using the ICU format mechanism for other purposes, as the selection looks very much like a switch case. We strongly recommend against it. Why? Remember, the purpose of the ICU message format is to help with translation, not to serve any other goal, not even other i18n fixes. This means developers should program switch cases in the application, not in resource files.
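
One way to follow this recommendation is sketched below: each state gets its own plain resource string, and the application code decides which key to use. The order-status scenario, the keys, and the helper `status_message` are all hypothetical and serve only to illustrate the idea.

```python
# Hypothetical resource entries: one plain string per state, with no ICU
# select construct such as {status, select, pending {...} shipped {...}}.
MESSAGES = {
    "order.status.pending": "Your order is being prepared",
    "order.status.shipped": "Your order has shipped",
    "order.status.delivered": "Your order was delivered",
}

# The switch lives in application code: each state maps to a resource key.
STATUS_KEYS = {
    "pending": "order.status.pending",
    "shipped": "order.status.shipped",
    "delivered": "order.status.delivered",
}


def status_message(status: str) -> str:
    """Resolve the resource key in code, then return its translated text."""
    return MESSAGES[STATUS_KEYS[status]]


print(status_message("shipped"))  # Your order has shipped
```

Translators then see three ordinary sentences rather than one message with embedded branching logic.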

For instance, avoid using ICU to determine date formats based on the locale. Instead of using the locale as a switch variable to dictate formats like ‘MM/dd/yyyy’ for US English or ‘dd/MM/yyyy’ for French, rely on existing i18n libraries or packages designed to handle date formatting.

Addressing gender can be challenging in ICU messaging formats, as languages assign gender differently to entities. An entity may have different genders across languages: a cat is masculine in French (le chat) and feminine in German (die Katze). Grammatical gender also does not always follow natural gender: the German word for girl, “das Mädchen,” is grammatically neuter, not feminine. For these reasons, as with date formats, developers should avoid using gender in ICU formats.

Additionally, while multiple selectors are permitted in a selection tree-type format, avoiding this approach is best. Such complexity can lead to errors from translators and developers, and translation management systems (TMSs) may have difficulty parsing the message.
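
A team could enforce this guideline with a simple check on its resource strings. The sketch below is a rough heuristic under stated assumptions: it counts plural, select, and selectordinal arguments with a regular expression and does not account for ICU's apostrophe quoting, so it is not a full parser. The names `count_selectors` and `is_simple`, and the sample nested message, are hypothetical.

```python
import re

# Matches the start of an ICU argument that uses a selector keyword.
SELECTOR = re.compile(r"\{\s*\w+\s*,\s*(plural|select|selectordinal)\s*,")


def count_selectors(message: str) -> int:
    """Count selector arguments anywhere in the message, nested or not."""
    return len(SELECTOR.findall(message))


def is_simple(message: str) -> bool:
    """A message is considered simple if it has at most one selector."""
    return count_selectors(message) <= 1


flat = "{count, plural, one {# result} other {# results}}"
nested = (
    "{gender, select, "
    "female {{count, plural, one {She has # file} other {She has # files}}} "
    "other {{count, plural, one {They have # file} other {They have # files}}}}"
)

print(is_simple(flat))    # True
print(is_simple(nested))  # False
```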

Wrapping Up 

The ICU Message Format is an invaluable tool for developers seeking to craft localized messages compatible with various platforms and languages. To ensure effectiveness, it’s crucial to keep the format usage simple for both development teams and translation agencies—focusing primarily on pluralization. As such, developers need to consider any additional use cases carefully. 

Ready to enhance your software localization process? Check out Globalyzer and Localyzer

With Globalyzer, you can check for internationalization (i18n) issues as developers write their code. Get i18n input in development environments, during commits and pull requests, and over entire repositories. Writing well-internationalized code is the most effective way to avoid localization issues. 

With Localyzer, you can check the structure of the ICU format on the source and the translated strings. Even better, Localyzer automatically fetches strings from your software repositories for translation and deploys them upon completion. Simply set it up and watch it work seamlessly. 

Explore the benefits of Localyzer and unlock the potential of your applications for global audiences. Contact us today to schedule a demo. 

Author

Lingoport