⏺️ "Mastering Enterprise Localization: Lessons from Siemens". Watch on demand.

Internationalization Engineering Planning: Secret Sauce

by Adam Asnes for Multilingual Computing

Just recently I got a call out of the blue from a colleague who leads his own internal internationalization (i18n) team at a well known software company, with many leading commercial products. The discussion particularly related to best practices and turning information into actual plans. I suppose the art of planning is kind of a “secret sauce” for any type of engineering. And i18n has its own special ingredients which need to be blended with their own puree of painful lessons. Seriously, i18n is dangerous stuff to estimate.

Here are a few reasons:

  • Requirements are notoriously easy to under estimate. People start just considering string management and then realize that’s just a small part of the full scope (see my other articles).
  • Code bases are typically very large and often you have limited history or connection to the people who wrote it.
  • Different programming languages, web servers, databases and platforms involve optimizing all kinds of encoding issues.
  • Internationalization issues aren’t easy to uncover and they are hidden in the code.
  • There may be all kinds of programming logic that will need to be rewritten as it just won’t work for multiple locales.
  • Architectural elements that need to be added, like locale operations or database changes, touch large amounts of the code, and tend to break everything.
  • The development team isn’t going to sit on their hands while the internationalization effort goes on – so you have two parallel coding efforts, one of which breaks everything (see prior item).

Any one of these issues has enough excitement to warrant an article on its own (and I may just take that path in the future), but it’s probably good to start on a high level describing some of the process with a few example questions and answers. What locales are being targeted and when? You can lump some aspects of target markets together by encoding. For instance, ISO-Latin 1 for Western European languages, Unicode for Asian languages. From there, you need a good idea of what the product in question actually does. How will the user need to set locale? Are address formats, phone numbers, dates, times, currencies, numerical units managed in particular ways? What are the various application tiers? How is data flowing from one part of the code to another?

Regarding those application tiers, are there whole sections of code that are out of scope? Could there be inherent danger in making them out of scope? What programming languages are used? There are drastic differences in how internationalization is handled among programming languages. Java and C# tend to be among the easiest with regard to i18n. PHP has gotten a lot better, but used to have no i18n framework. JavaScript is just a pain, as the very nature of how it’s used typically inspires all kinds of concatenation. C and C++ are typically difficult as there is just so much more involved with character set support, memory management and hosts of nasties like pointer arithmetic. On top of that, ANSI C/C++ is different than Microsoft C/C++. Many Microsoft products in most cases have their own special constraints. For instance with regard to databases Oracle will support ISO-Latin, UTF-8 and UTF-16 encodings. Yet Microsoft SQL Server is ISO-Latin or UCS-2 only (which happens to be nearly the same as UTF-16). The list issues as they pertain to technologies goes on, and on, but you get the idea.

You can break down planning a project in terms of:

  1. What’s not in the code that needs to be added?
  2. What’s in the code that needs to be changed?
  3. When does it need to be completed?
  4. Should parts of the effort be phased?
  5. What’s the budget?

The first question has everything to do with marketing, usage and technology requirements. If you miss requirements you will be late, and build something less desirable than imagined. What’s not in the code is broadly an architectural issue including everything from locale selection and operations, to the method of resource files being used. This takes good smart leadership which has been through i18n planning and construction efforts multiple times.

Question two is all about detective work. How are you going to find all the strings, methods and classes, programmatic logic patterns and more that have to be changed – yet lie buried in those hundred thousand to millions of lines of code? You can look at the interface and start to make guesses (the old way), or really count the issues, while locating and verifying them all with powerful diagnostic software tools. You can relatively quickly list all internationalization issues, view them, confirm their location, even figure the costs of translating the embedded strings with the right software.

Question three is all about marketing plans and revenue expectations. Often there’s a lot riding on target dates, with advertising, sales and customers waiting. Plus you need to factor in ample time for testing. In a perfect world, you would internationalize to the fullest scope possible, but budget and timing reality may mean a phased approach with some aspects left out of scope, depending upon application needs, customer requirements and locale targets.

Question four is often not fully known until the plan comes together. There may be a loose number assigned, but the specifics are a result of planning activities. Nevertheless, money is like oxygen. You’ll need a consistent supply if the project is to get finished without interruption.

Next comes the artistic finesse part. You have to put it all together into a plan. It takes experience to convert your data regarding requirements, architecture and code refactoring into a plan that optimizes the tasks, engineering team, schedule and costs. You could try applying hard metrics for this, like X number of issues means Y time, but often this is only a place to start. You have to plan for “surprises” and variations. Experience shows you where those tend to occur.

I suppose the chief service value that people buy from a quality i18n firm is that experience and its effect on risk, efficiency, time and expenses. Clients only looking to buy an hourly rate are missing the point.

Related Posts

What are the best approaches to localize UI strings?

Create an effective multilingual software UI with our tips on localizing and internationalizing software applications. Learn the difference between localization and internationalization, and how text strings play a critical role in user experience. Get started and develop a user-friendly UI for a global audience!

Globalization Resources