Going global is a big step. Getting a company off the ground and proving that your idea can work has already put your company in a league beyond most.
Aside from incidental web traffic and interest from new countries and regions, for many companies, going global means setting up partnerships, offices and agents. For software, it also means internationalizing and localizing software so that it’s competitive and meets sales requirements. Internationalization (i18n) of your software is a business-case-driven undertaking, taken on in response to opportunities and strategy. It is not a feature that you might add in a sprint or two.
That said, not all global targets are equal in terms of technical requirements. This post gives you a brief overview of the stages of software i18n and localization (L10n) and what opportunities each may open for your company. It’s not intended as a technical resource, but more as a primer for product and localization managers.
- Internationalization, often abbreviated as i18n (i – then 18 letters – n), is the process of making a single code base locale-independent so the application can be easily localized to other locales with no source code changes.
- Localization, often abbreviated as L10n (L– then 10 letters– n), is the translation and application of locale-specific terms and style so that a product is locale-specific – that is, it looks and reads like a product native to the market in which it is being sold.
- Globalization, sometimes abbreviated as g11n (g– then 11 letters– n), includes both internationalization and localization together and often refers to the entire process of supporting other locales.
- A locale in computing is a set of parameters that defines the user’s language, region and any special variant preferences that the user wants to see in their user interface. Usually a locale identifier consists of at least a language identifier and a region identifier. Consider that in both the US and the UK the typical language is English, but other parameters such as date format, temperature units and even some spellings are different.
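To make the locale definition concrete, here is a minimal Python sketch that splits a locale identifier into language and region and shows the same date rendered for the US and the UK. The names `Locale`, `parse_locale`, `format_date` and the `DATE_PATTERNS` table are hypothetical, written only for this illustration. A real locale framework supplies far more complete data and handles script and variant subtags, which this sketch ignores.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hand-written date patterns for two locales (an assumption for this sketch,
# not a complete locale database).
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",  # month first
    "en-GB": "%d/%m/%Y",  # day first
}


@dataclass(frozen=True)
class Locale:
    language: str
    region: Optional[str] = None

    @property
    def tag(self) -> str:
        return f"{self.language}-{self.region}" if self.region else self.language


def parse_locale(identifier: str) -> Locale:
    """Split an identifier such as 'en_US' or 'en-gb' into language and region."""
    parts = identifier.replace("_", "-").split("-")
    language = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return Locale(language, region)


def format_date(value: date, locale: Locale) -> str:
    # Fall back to ISO 8601 when this sketch has no pattern for the locale.
    pattern = DATE_PATTERNS.get(locale.tag, "%Y-%m-%d")
    return value.strftime(pattern)


d = date(2024, 3, 4)
print(format_date(d, parse_locale("en_US")))  # 03/04/2024
print(format_date(d, parse_locale("en_GB")))  # 04/03/2024
```

Same language, same date, two different strings: that difference is what the region part of the locale carries.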
Why Wasn’t It Internationalized in the First Place?
In a perfect world, all products would be created with i18n as a fundamental requirement from the start. But often with new product development, teams are just trying to make a product work and see if there’s a response. The initial focus is on relevance and acceptance of the application. Follow-on efforts are feature focused. I18n isn’t really like a feature, as its requirements underpin an entire application.
As mentioned above, going back and internationalizing code requires a business case. It takes time and will distract the development team from new features. There are times when products are rewritten, which is an excellent window to spend some effort on internationalization. In many cases i18n is a good opportunity for outsourcing, bringing in i18n expertise that allows your team to focus elsewhere, while still learning from the experts to ensure that future development will be internationalized. Expert help enables you to implement i18n faster with greater quality and less project risk.
Although people often think i18n is just about string externalization and the resulting localization, there is much more involved. There are locale frameworks that will govern locale behavior, methods/functions/classes that may need to be changed, static files to alter, and even hard coded patterns (e.g. a hard coded font) that may need fixing. Issues like date/time, address, phone number, numerical and measurement formats will have to adapt to local preferences. You’ll need character support, sorting changes and more. String externalization is like the visible part of an iceberg. You see it, but there’s much more below the surface.
Consumer Facing Software
It’s easy to see how consumer facing software has a higher requirement for i18n and L10n if it’s to gain broad acceptance, even in markets where English is more common.
That said, we do hear the argument that English is commonly used and understood in many places. Remember, if you travel to major cities in Europe, you’ll find that you can get along with English pretty well. But in terms of product preferences, people typically prefer engaging in their own languages. As you get out of major cities, you’ll find less English proficiency. Even with English (which English?) you still have formatting issues as mentioned, like decimal and comma placement in numbers.
We see the “English everywhere” argument even more with technical software. To an extent, if your users are technical (e.g., system administrators), you can use English in more markets. But you’ll fall flat in Asia Pacific countries, such as Japan, Korea and China.
Below is a broad summary of i18n phases or levels, which can be applied depending upon the business case. Ideally, of course, your software is global-ready for every market.
If your targets are Western European languages (e.g., French, Italian, German, Spanish and Portuguese), you won’t strictly need Unicode support in your application. That said, if you’re working on i18n, I recommend taking on Unicode as early as possible. Almost every modern database and programming language offers Unicode support, but you have to enable it. In our experience, Unicode support can add about 25% more work to an i18n implementation (there are exceptions), but remember that if you’re already in the code making i18n changes, it’s more efficient to do it now than to restart the process later. This is a generalization, and there are plenty of application-specific, business-case-driven and market-driven exceptions. For example, somebody sold something and you have to deliver your product in Brazilian Portuguese in three months (true story!).
Unicode support is a prerequisite if you have plans for markets/languages with complex scripts such as Japanese, Simplified and Traditional Chinese, Korean, Vietnamese, Hebrew, Arabic and Cyrillic-based languages (e.g., Russian).
I18n Phase 1: Data
The very first internationalization priority should be the ability to input, process and transform customer data. In my opinion, this should be a benchmark requirement for any software that could have global customers or enterprise customers, whether or not localization is considered to be in the future. Note that the U/I is not changing in this phase. There is no U/I localization yet.
Even if you are actively selling only in your home country, it’s likely that you will run into customers in other countries, or your customers will have customers in other countries. At least let people enter data in multiple languages and formats. Store it, transform it and retrieve it without corrupting it.
Minimally, accents and diacritics shouldn’t cause character corruption (those square boxes and odd shapes you’ve probably seen). Better yet, add Unicode support in the database and source code. No component of your product should introduce character corruption as data passes through it.
Character corruption example:
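One common way this kind of corruption arises can be reproduced in a few lines of Python. The sketch below is illustrative only: the helper names `corrupt`, `repair` and `to_ascii_lossy` are made up for this post, and the scenario (one component writing UTF-8 while another reads it as Latin-1) is just one of several possible causes.

```python
def corrupt(text: str) -> str:
    """Simulate a component that reads UTF-8 bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo that specific mistake; works only if no bytes were lost on the way."""
    return garbled.encode("latin-1").decode("utf-8")


def to_ascii_lossy(text: str) -> str:
    """Simulate a component that only supports ASCII and replaces the rest."""
    return text.encode("ascii", errors="replace").decode("ascii")


name = "José"
print(corrupt(name))          # JosÃ©
print(repair(corrupt(name)))  # José
print(to_ascii_lossy(name))   # Jos?
```

The first kind of damage can sometimes be reversed. The second cannot, because the original character is gone once it has been replaced.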
It is better still if data can be stored and managed in variable formats for information such as date/time, numerical units, addresses, phone numbers and currencies.
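One way to approach this, sketched below in Python, is to accept input in the user’s format but store a single canonical form. The `SEPARATORS` table and the functions `parse_number` and `to_storage_timestamp` are hypothetical names for this illustration. The choice of `Decimal` and UTC ISO 8601 strings as storage formats is an assumption, not a requirement.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Assumed separators for two locales; a real locale framework supplies these.
SEPARATORS = {
    "en-US": {"group": ",", "decimal": "."},
    "de-DE": {"group": ".", "decimal": ","},
}


def parse_number(text: str, locale_tag: str) -> Decimal:
    """Turn locale-formatted input such as '1.234,56' into a canonical Decimal."""
    sep = SEPARATORS[locale_tag]
    normalized = text.replace(sep["group"], "").replace(sep["decimal"], ".")
    return Decimal(normalized)


def to_storage_timestamp(local_time: datetime) -> str:
    """Convert a timezone-aware datetime to a UTC ISO 8601 string for storage."""
    if local_time.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return local_time.astimezone(timezone.utc).isoformat()


print(parse_number("1.234,56", "de-DE"))  # 1234.56
print(parse_number("1,234.56", "en-US"))  # 1234.56

berlin_winter = timezone(timedelta(hours=1))  # fixed offset, for illustration only
print(to_storage_timestamp(datetime(2024, 3, 4, 9, 30, tzinfo=berlin_winter)))
# 2024-03-04T08:30:00+00:00
```

Because the stored values are locale-neutral, the same record can later be presented to a user in any locale.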
Automation is the best approach to achieving this in a seamless, efficient and scalable manner. To that end, Lingoport’s Globalyzer can be used to scan your source code and database scripts to find these issues and guide developers to fix them. Our services team can perform refactoring work, as well.
I18n Phase 1.5: Locale Frameworks
You’ll need a locale framework for each programming language within your source code.
This paves the way for string externalization and presentation which will be needed for localization. Presentation of formats for date/time, numerical units, addresses, phone numbers, collation, currencies and more are also controlled by locale frameworks.
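As a rough picture of what a locale framework does for presentation, the Python sketch below keeps formatting rules in one table and routes all number and currency output through it. `LocaleRules`, `RULES`, `format_number` and `format_currency` are hypothetical names, and the two hand-written rule sets are simplified assumptions. Established frameworks ship with maintained locale data and cover far more cases.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LocaleRules:
    group_sep: str
    decimal_sep: str
    currency_pattern: str  # uses {amount} and {symbol} placeholders


# Hypothetical, simplified rules for two locales.
RULES = {
    "en-US": LocaleRules(",", ".", "{symbol}{amount}"),
    "de-DE": LocaleRules(".", ",", "{amount} {symbol}"),
}


def format_number(value: Decimal, locale_tag: str) -> str:
    rules = RULES[locale_tag]
    # Format with "," and "." first, then swap separators via a placeholder
    # so that the two replacements do not interfere with each other.
    text = f"{value:,.2f}"
    return (
        text.replace(",", "\x00")
        .replace(".", rules.decimal_sep)
        .replace("\x00", rules.group_sep)
    )


def format_currency(value: Decimal, symbol: str, locale_tag: str) -> str:
    rules = RULES[locale_tag]
    amount = format_number(value, locale_tag)
    return rules.currency_pattern.format(amount=amount, symbol=symbol)


print(format_number(Decimal("1234567.5"), "en-US"))  # 1,234,567.50
print(format_number(Decimal("1234567.5"), "de-DE"))  # 1.234.567,50
print(format_currency(Decimal("9.99"), "€", "de-DE"))  # 9,99 €
```

The point of the design is that application code never hard-codes a separator or a symbol position. It only passes a value and a locale.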
Lingoport’s services teams can help you make the right choices and even implement them for you.
I18n Phase 2: Language and Localization
String externalization is often what most people think of as the critical i18n and L10n step. User-facing words or strings are removed from being embedded in the source code and replaced with a function call typically to a resource file where the strings will now reside. This way, if the user selects a locale preference (remember those locale frameworks), French in France for example, the code will retrieve the French strings in the resource file for presentation.
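In code, the change looks roughly like the Python sketch below. The resource “files” are shown inline as dictionaries to keep the example self-contained, and `RESOURCES`, `DEFAULT_LOCALE` and `get_string` are hypothetical names. In a real project the strings would live in separate per-locale files and the lookup would come from your locale framework.

```python
# Stand-ins for per-locale resource files (assumed content for this sketch).
RESOURCES = {
    "en-US": {"greeting": "Welcome, {name}!", "save": "Save"},
    "fr-FR": {"greeting": "Bienvenue, {name} !", "save": "Enregistrer"},
}
DEFAULT_LOCALE = "en-US"


def get_string(key: str, locale_tag: str, **params: str) -> str:
    """Look up a user-facing string by key, falling back to the default locale."""
    bundle = RESOURCES.get(locale_tag, {})
    template = bundle.get(key) or RESOURCES[DEFAULT_LOCALE][key]
    return template.format(**params)


# Before externalization: print("Welcome, " + name + "!")
# After externalization:
print(get_string("greeting", "fr-FR", name="Amélie"))  # Bienvenue, Amélie !
print(get_string("save", "ja-JP"))                     # Save (fallback)
```

Note that the whole sentence is one resource with a placeholder. Concatenating fragments in code assumes English word order, which translators cannot fix.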
String externalization can be tedious and time consuming. The issue is that lots of things may look like strings at the source code level that aren’t actually user-facing strings. Examples are named variables, debug statements and internal queries. Lingoport’s Globalyzer has default and extendable capabilities to aid these distinctions. Globalyzer Workbench enables an i18n engineer to assemble strings, walk through them and then externalize them in bulk.
You’ll also want to test your work. Lingoport’s Resource Manager will automatically generate a pseudo-locale that will help your team functionally test how the software will behave in another language, without the testers needing to understand that target language. Pad characters are added around the original English strings, with expansion automatically set based on typical U/I requirements in the target languages. Alternatively, the expansion can be configured manually. This way, testers can immediately see any missed strings or U/I elements that won’t properly expand for likely longer words and changes in fonts in other languages (e.g., German, Chinese).
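The general idea of padding-based pseudo-localization can be sketched in a few lines of Python. This is not how Resource Manager is implemented. The function name `pseudo_localize`, the bracket markers, the `~` pad character and the 30% default expansion are all assumptions chosen for illustration.

```python
def pseudo_localize(text: str, expansion_percent: int = 30, pad: str = "~") -> str:
    """Wrap text in visible markers and pad it to simulate longer translations."""
    # Ceiling division keeps the arithmetic in integers.
    extra = -(-len(text) * expansion_percent // 100)
    left = extra // 2
    right = extra - left
    return "[" + pad * left + text + pad * right + "]"


print(pseudo_localize("Save"))         # [~Save~]
print(pseudo_localize("Family Tree"))  # [~~Family Tree~~]
```

In a pseudo-localized build, any text on screen without the brackets was never externalized, and any text missing its closing bracket was truncated by a layout that cannot expand.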
Pseudo localized page for a family tree application:
I18n Phase 2.5: Workflow
With some software, workflow and processes are different depending on market requirements. This takes market research and coordination with in-country representation. For instance, tax management or medical administrative software is likely to have different requirements and steps in most markets.
I18n Phase 3: Bidi Support
If your product is being sold in places using bi-directional languages such as Hebrew or Arabic, you’ll need to enable and test your pages to support U/I mirroring and the bi-directional nature of text that goes right to left, but with left to right elements within. Unicode support is a prerequisite.
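As a small illustration of the text-direction side of this, the Python sketch below guesses the base direction of a string from its first strongly directional character, using the standard library’s `unicodedata.bidirectional`. The function name `base_direction` is hypothetical, and this first-strong heuristic is a simplification. Full bidi rendering and U/I mirroring are handled by the platform or browser, not by application code like this.

```python
import unicodedata


def base_direction(text: str) -> str:
    """Return 'rtl' or 'ltr' based on the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi_class == "L":  # left-to-right letters
            return "ltr"
    return "ltr"  # no strong characters, e.g. digits or punctuation only


print(base_direction("Hello"))      # ltr
print(base_direction("שלום"))       # rtl
print(base_direction("123 مرحبا"))  # rtl
```

A result like this could, for example, be used to set the `dir` attribute on an HTML element that displays user-entered text.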
Ongoing i18n and L10n
Now that you’ve internationalized and localized your software, your work isn’t over. Your teams will be steadily releasing new features and functionality. I18n surprises can arise down the line that cost time and iterations to fix. It’s not hard for a developer to make a mistake. Just as your teams may continuously measure for coding quality issues and security, i18n quality now becomes another metric.
Localization for every sprint, branch and repository makes for tedious and error prone work that slows agile progress. That process can be automated, taking your developers out of the resource file update nanny business.
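To give a sense of the kind of check that is worth automating, here is a minimal Python sketch that compares a source-language resource bundle against a translated one. It is not a description of how any Lingoport product works. The function `compare_bundles` and the sample data are hypothetical.

```python
from typing import Dict, List


def compare_bundles(source: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """Report keys that are missing or empty in the target, and keys no longer in the source."""
    missing = sorted(k for k in source if not target.get(k, "").strip())
    obsolete = sorted(k for k in target if k not in source)
    return {"missing": missing, "obsolete": obsolete}


en = {"save": "Save", "cancel": "Cancel", "greeting": "Hello"}
fr = {"save": "Enregistrer", "cancel": "", "old_key": "Ancien"}

print(compare_bundles(en, fr))
# {'missing': ['cancel', 'greeting'], 'obsolete': ['old_key']}
```

Run on every commit, a check like this turns missing translations into a visible metric rather than a surprise at release time.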
Lingoport Suite’s Globalyzer continuously supports i18n from the developer IDE to source repositories. Lingoport’s Resource Manager automates resource file updates from source to translation and back again, with quality checks in each direction. QA is supported as well.
Lingoport Dashboard lets teams see and manage i18n & L10n status and process, supporting i18n issue drill downs to associated source code, issue assignment and completion. Similarly, Localization resource file issues can be itemized and examined.
We’ve seen teams go from 5-week localization update cycles to under 3 days across hundreds of repositories. Our services teams have internationalized many well-known applications, ranging from small code bases to millions of lines of code, and you would be surprised to see the efficiency gains that are achievable in the development process.
We hope that you find this primer useful as you look to address i18n and L10n of your own software products. If you have any questions, do not hesitate to reach out and a Lingoport team member will be happy to talk through the issues with you.