Worldware Presentation – Bringing I18n to MT Development: Challenges, Solutions, Case Studies

The effect of machine translation (MT) on the globalization industry has been astounding due to MT’s ability to cut costs and shorten the time to market for products. With growing demand for MT, the question is how MT applications can overcome new linguistic and technical challenges (such as internationalization) and how companies using machine translation are addressing these problems.

Presented by:

  • Olga Beregovaya, CEO of PROMT Americas, the Enterprise division of PROMT

Lingoport Webinar: Supporting Internationalization Across Your Enterprise With Globalyzer 3.4

Recording Available Below

There is tremendous value in knowing if a product is global-ready as part of your development cycle. Large amounts of development, marketing and branding dollars are at stake. Yet often, the only way software gets verified for localization is during the localization process itself, or through a limited series of manual interface tests. That’s way too late in the development cycle to be efficient and a very incomplete way to address the issue.

There are all kinds of products to support issues like software security and efficiency, but how about checking on internationalization, which for many companies is a hefty and vital product requirement for a good share of company revenue?

In this webinar, we’ll be demonstrating how Globalyzer 3.4 (our new release) finds, categorizes, tracks and helps fix internationalization bugs in source code using static analysis.

Webinar: “Supporting Internationalization Across Your Enterprise With Globalyzer 3.4”
Date: Tuesday, November 30th, 2010
Time: 11am – Noon PST
Where: Your desktop
Watch at: http://vimeo.com/17364680
Cost: Complimentary
Presenters: Adam Asnes and Olivier Libouban of Lingoport

We’ll start with some source code and then:

  • Analyze it for internationalization issues
  • Customize “rule-sets” so that issues specific to that code can be addressed
  • Show how that information can be accessed and shared among development team members
  • Integrate automated Globalyzer static analysis via command line
  • Support testing initiatives

The Webinar targets technical managers, software engineers, test engineering managers, QA managers, internationalization and localization managers, and anyone facing ongoing software globalization and localization challenges.

Note: We’ll be diving straight into coding issues and will be skipping internationalization basics. If you’re looking for a presentation on internationalization and localization basics, please visit this archived presentation from Localization World: http://vimeo.com/16345751

About the presenters:
Adam Asnes founded Lingoport in 2001 after seeing firsthand that the niche for software globalization engineering products and services was underserved in the localization industry. As Lingoport’s President and CEO, he focuses on sales and marketing alliances while maintaining oversight of the company’s internationalization services engineering and Globalyzer product development.

Olivier Libouban, a native of France, has worked for 25 years in the software industry, for large corporations and start-ups, as a software engineer and as a project manager. Olivier has wide-ranging experience in the US, France, Switzerland, and Norway, in R&D departments as well as on client projects of all sizes with complex software environments.

The Need for Internationalization (i18n) in Administrative Solutions: A Case in Point with Region Centre

By Olivier Libouban, Software Project Manager at Lingoport.

A Region is an administrative layer in France, with elected officials, receiving tax Euros, and setting up programs and initiatives for its citizens. Part of the responsibility of any region is also to provide software solutions to the citizens, with significant budgets: the IT department of any Region manages bids and responses, and supervises the implementation of the solutions.

A case in point for “Region Centre”, situated just southwest of Paris, is the need for an e-learning platform, dealing amongst other things with budgets, financial institutions, training institutions and citizens able to register and follow classes, either on-site or on-line. The request for proposal for such programs is sent by the IT department and gives the context, the functional needs, and the requirements at large for this type of program, including strategic technologies, such as Portal by a specific vendor. The entire platform may be composed of a large number of software components, in this case ranging from the software infrastructure pieces, such as Web application server, LDAP, and databases, to specific functional components, such as an e-learning tool to be integrated in the overall software and hardware platform.

The IT department oversees the responses to the request, and solutions which do not play in a French locale cannot be accepted. All components must behave and interact with each other, be it in terms of encoding, of searches, of collation, of UI presentation to citizens, training institutions, financing institutions, administrators of the system. In other words, the budgets for an administrative program are targeted at i18n compliant software.
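To make the encoding, search and collation concerns above concrete, here is a minimal sketch in Python. It is not taken from the Region Centre project; the function names and sample words are hypothetical, and the accent-stripping key is only a rough stand-in for real French collation, which would normally come from a dedicated collation library.

```python
import unicodedata

def mojibake(text: str) -> str:
    """Show what happens when one component writes UTF-8
    and another reads the same bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")

def accent_insensitive_key(text: str) -> str:
    """Rough sort/search key: decompose, drop combining accents, fold case.
    This is an approximation, not a full French collation algorithm."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Hypothetical sample data.
words = ["zèbre", "école", "eau"]

# Plain code point order pushes the accented word to the end.
naive_order = sorted(words)

# Sorting on the accent-insensitive key gives an order a French user expects.
friendly_order = sorted(words, key=accent_insensitive_key)

# A search for "ecole" typed without the accent still finds "École".
found = accent_insensitive_key("ecole") in accent_insensitive_key("École de Tours")
```

If any single component in the platform skips this kind of handling, the defect shows up at the integration points, which is why the whole stack has to be evaluated together.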

Those administrative programs might be at a city level, a county level, a region level, a national level, even at a pan-national level, such as with the European Union, which serves citizens of Europe at large. The combined budgets of those IT departments are simply very large and can only be applied to i18n solutions.

Video Recording of LocalizationWorld Presentation: Intro to Internationalization and Localization

Internationalization and Localization experts Adam Asnes, of Lingoport, and Angelika Zerfaß, of zaac, recently presented at LocWorld in Seattle. Their session “Intro to Internationalization and Localization” was moderated by Daniel Goldschmidt, principal consultant and cofounder of RIGI Localization Solutions, and is now available for online viewing.

The one-hour recording of their presentation provides an overview over the different areas in internationalization and localization projects where best practices exist — starting from the concept of internationalization and how it is applied to project management dos and don’ts and the tools and technologies used in the field.

The business case why US companies need to internationalize their software in order to sell to the Canadian Government

In Adam Asnes’ article in the September 2010 issue of MultiLingual, he illustrated how business cases for US companies can drive their need to internationalize their software in order to sell to the Canadian Government, or to sell broadly in Quebec. I liked how he pointed out that companies may adapt their software for sales-driven reasons rather than as part of a broad global marketing initiative, and that such projects have “different needs-drivers reflected in deadlines, resources and scope” than regular, consistent localization projects.

Adam goes on to describe very well, for both the techie and sales person alike (me for example), what needs to be completed to get the software localization-ready and how Lingoport rocks at helping companies with that process. Here at Milengo, we commonly assist clients with their language support after Lingoport has finished their work. And we too notice that clients’ needs for Canadian language support are different when they are deal-based rather than part of a broader sales plan, so I too will focus my ideas on that part. I wanted to use this blog to illustrate some examples of projects we’ve worked on to give readers ideas on what processes and technology are available and what is do-able, to help stretch your budget when sales lands a big new deal in Canada.

Let’s make the assumption that your company is doing very well and the software you produce is awesome. Sales are booming in North America. The Sales Director got a big contract with the Canadian Government. Big deal and big money. It’s signed. After the champagne has been popped, you’re told that you have 3 months to deliver a Canadian French version, with documentation, since it’s required by law in Canada. And if it’s late, the company will have to pay a fine for every day it’s late, eating into profits and goodwill. So after a big gulp of bubbly, the process begins.

Luckily, you know Lingoport already from Adam’s excellent articles in Multilingual. His company helps your developers in completing the i18n of the software so that it can be localized. He did it on-budget and before he promised, just because that’s how Lingoport rolls. Milestone 1 completed. Then you see you have about 10,000 strings for translation as well as help and user manuals, which require about 200,000 words for translation. Oy vey. The volume is too much for your staff in Canada to do it internally within this timeframe. What options can you consider?

Option 1: Have an LSP do the translation for you. Luckily, your sales team collaborated with you closely and the deal was priced to allow for high-quality human translations in Canada. You can create a glossary from the software translation, which forms a bed-rock for future updates. Consistency in your software, documentation and customer communication is recorded and used across all documents, lowering costs, increasing quality and enhancing the brand experience (a big topic that we’ll go into another time). Sounds good, right? With all those happy French-speaking Canadian customers, it may get you thinking that a more developed localization strategy might not be a bad idea after all?

Option 2: Your sales team did not collaborate with you, and the overall price of the package sold was too low. Your manager is balking at the double-digit figures for the cost of the documentation localization since the budget is not available and you have limited financial resources. Alternatively, perhaps it’s not a priority to have this done with high-quality human translations since this is a one-off deal. Options to consider include:

  • One of Milengo’s customers had some 1.5 million words of help-desk and customer support information that needed to be translated in a month and a half in order to outsource call-center operations. Do-able? Yes! Did they have a budget of ~ $500,000? No. To get around this we worked with our partner AsiaOnline to develop a customized, enterprise-level statistical machine translation engine that uses sophisticated algorithms to provide machine translation results. To make the translations publish-ready, human linguists reviewed the machine translation output to correct errors, fix stylistic problems, etc. so that it looked and felt correct. The overall saving was over 50%.
  • You want to leverage your in-house team of people in Canada, but need to make them more efficient. How about taking the glossary from your UI and using it as a basis within the Google Translator Toolkit? The Google engine will produce a translation for you using your glossary as a reference point, and afterwards, your in-house team can correct and fix the errors and improve style. Or you can have an LSP like Milengo do it for you. Depending on the nature of the content or corporate culture, it may not be appropriate, but it is an option that you can consider. Google is doing more and more of their own translations this way, and we’ve helped them with correcting the output of their translations using their own toolkit.

Option 3: You can do a mishmash of the approaches above. The UI is translated by your in-house staff (i.e. “the humans”) since they are the experts. The documentation is translated by AsiaOnline’s customized statistical machine translation with human post-editing, and Google Translator Toolkit is used for internal communication in Canadian French <> English.

Option 4: While this scenario is unlikely since you are internationalizing your software for the first time, if you did already have a French translation, we could leverage that considerably. An adaptation from Continental French to Canadian French can be done. While both are French, there are of course differences, and copy-editors can go through and change terminology and style to make the text feel local, saving considerable time and budget.

There you have it. Of course each option, scenario and client requirement is more complicated and detailed than portrayed here, but hopefully it gets the juices flowing in terms of what can be done.

Post written by Adam Blau, Rebellion Leader at Milengo, a global language services provider.

The Business Why and How of Simship

This article was originally featured in the July/August 2010 issue of MultiLingual Computing Magazine, in Adam Asnes’ Business Side column. 

The subject of managing releases over worldwide markets can be a contentious one, with pros and cons on either side of business and development cases. The concept of simship is that if you are releasing your product to worldwide markets, you do it all at once rather than first releasing to your home market and then following with localized versions later. I can’t say that any one approach is right for all organizations, business situations and products, but I can share with you some of the organizational, procedural and business issues that contribute to successful simship global releases.

When a company commits to product releases that serve a worldwide customer base, there’s a long shadow cast on revenue, marketing, sales teams and of course development practices and testing. It’s a challenging logistical undertaking to release software products in multiple markets, requiring well-integrated planning and practices. It’s no wonder simship is viewed as anything from difficult and impractical to the best thing a company can do. Let’s consider a few of the issues within any organization, starting with the business case.

Internationalization and localization are always in pursuit of a business case, and one exists both for and against simship. That said, the business cases tend to vary based on the global perspective and maturity of the company. The case for simship is strongest among experienced global companies. Their revenues are already global, so delaying releases for localized versions only serves to delay resulting new release revenues. There may be good reason for adding secondary tiers for some local release schedules, but products really should be internationalized, with a clear path for localization and testing within the development path. In practice this isn’t the reality, but there’s quite a bit of agreement and successful data on the business case existing for simship with this class of company.

When companies are relatively new to global markets, they generally tend to put less of an emphasis on simship with new releases, and more of an emphasis on market or business agreements as drivers for their efforts. Perhaps they have a new customer or distributor that must have a localized version. In that case, synchronizing new version development with localization is usually—but not always—an afterthought. This is because the company sees its prime revenues being driven by current product customers. New releases boost sales, renewals and competition, so that connection is strongest where the current customers are. We’d still argue that even under these circumstances, simship should not be pushed aside, as there are gains to be made both for revenues and operations.

Time and Revenue Projections

Attached to initial time to release and revenue opportunities are quarterly and annual growth numbers. If a product is expected to grow sales by percentages outlined and expected in a marketing plan over months, quarters and years, significant delays in turn make those projections difficult, if not impossible, to meet. Delays add up to real dollars.

Now let’s leave the business case behind and look at software development organizations. It is extremely common among both development and localization teams to view localization as a tail-end process. But this is a critically limiting perception if your company is committing itself to serve global customers. Practically, a company shouldn’t build a product with a requirement as major as supporting multiple locales as a tail-end process. Even in cases where legacy code is now being first internationalized for global customers, once that adaptation is complete, from then on localization should be included as an expected part of the development process. That means including requirements for planning, architecture, development implementation, testing and release.

I asked my internationalization colleague Tex Texin to add some words about this. He seconded that as with many other aspects of globalizing applications, development organizations tend to see just the work and delay to releasing their product and not the benefits. And although we work to plan to minimize the pain, there is cost to achieving simship. However, exercising the localized versions often uncovers critical problems in the product core that can require urgent updates, recalls or even the creation of specialized tools to repair customer data in the field. In that context, simship is not only a requirement to be in the international markets and significantly enhance revenue, but is an important part of product testing preventing problems that are costly to repair and damaging to both reputation and future domestic sales.

Tactics

Simship nearly always seems to be the outcome of an internationalization implementation. So, we have some experience working with legacy code that we are internationalizing and then merging with concurrent new development, building localization proactively into the process.

We find and work with the localizable content embedded in the code first. We gain a clear estimate of localization costs by examining those strings, even while they are still embedded in the code using static source analysis. That’s important because it allows the budget and financing mechanisms of an organization more time to accurately fund the localization. Then we systematically provide externalized strings for localization as we go along in the project, rather than waiting until the end. We also perform static analysis on concurrent new feature development so that when we merge legacy and new code, we minimize the risk of expensive surprises. We build functional internationalization and localization test cases and execute both. The internationalization functional testing can be performed by testers regardless of linguistic proficiency. However, because we have been localizing all along, we are also quickly ready for linguistic testing. The combined processes are extremely effective in finding both functional and linguistic defects that may have passed through if performed as an afterthought.
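As a rough illustration of estimating localization volume from strings that are still embedded in code, here is a minimal Python sketch. It is not how Globalyzer works; the regular expression, the filtering rule, the sample source and the per-word rate are all assumptions made for the example, and a real static analysis tool applies far more sophisticated rules.

```python
import re

# Rough pattern for double-quoted string literals, allowing escaped characters.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

def estimate_localization(source: str, rate_per_word: float) -> dict:
    """Count likely user-facing string literals and their words,
    then derive a rough translation cost (hypothetical rate)."""
    literals = [m.group(1) for m in STRING_LITERAL.finditer(source)]
    # Assumption: a literal with at least two consecutive letters is
    # probably text rather than an empty string or a separator.
    candidates = [s for s in literals if re.search(r"[A-Za-z]{2,}", s)]
    words = sum(len(s.split()) for s in candidates)
    return {
        "strings": len(candidates),
        "words": words,
        "cost": round(words * rate_per_word, 2),
    }

# Hypothetical source fragment with strings still embedded in code.
sample = '''
String title = "Save your changes?";
String key = "";
log("File not found");
int n = 3;
'''

result = estimate_localization(sample, rate_per_word=0.20)
```

Even a crude count like this, run early, gives finance a number to plan around long before the strings are externalized.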

Agile Development: It’s one thing to talk about including localization into your internationalization and development process on large-scale efforts, but what about smaller scale and rapid agile releases? Turns out it’s really no different. I talked to Mike McKenna, globalization manager at Yahoo!, to get some perspective. An extreme example is the release cycles for Flickr, Yahoo!’s photo sharing social network. Flickr sometimes rolls out four to six releases per day, holding the expectation that developers can get immediate access to translations they may need, likely to be small UI changes. Then they pride themselves with directly connecting their developers to users, without intermediaries, to fix issues that may arise from localization or functional changes.

Yahoo! has other software, such as its Open Strategy Platform or Yahoo! Application Platform, which typically have six-week release cycles. In this case, there is a UI freeze before the release sprint so that localization can be integrated into the final release sprint. Developers work with their localization managers and ensure any last-minute tweaks that may become necessary to the UI during the release sprint are well coordinated.

Security: Let’s take our time tunnel back to the 1990s: Windows 95 was first released in August 1995, its first service pack was released in February 1996 and the second pack in 1998. The localized versions were always lagging behind: Microsoft first released the “Enabled” version, which was not localized but could run software in your language. A few months later, Microsoft released the localized version. Today, Microsoft and other companies release security patches on a monthly basis if not on a weekly basis. Can you imagine releasing the patch in North America first and only a few months later in the rest of the world? Simship enables the release of security patches and other critical patches on a timely basis to all markets and prevents security glitches.

Internationalization as Enabler

The success of localization and the ability to coordinate simship processes are directly dependent upon the quality of a product’s internationalization as well as the development team’s ongoing internationalization practices. Internationalization is the software development enabler, and without it or without a consistent internationalization benchmark, localization and particularly simship get broken. As the saying goes, garbage in, garbage out. Simship takes a little more planning, time, tools and coordination, but it’s hardly an onerous process. Like a lot of things, your organization has to be aware of the benefits and just do it. Then the actual doing is clearly achievable.

About the Author

Adam Asnes is President and CEO at Lingoport and enjoys investigating how globalization technology affects businesses expanding their worldwide reach. Adam is a sought-after speaker at industry events and a columnist on globalization technology. He often writes articles for localization, internationalization and globalization industry publications and enjoys cycling and Colorado’s Rocky Mountains.

Lingoport’s Internationalization (I18n) and Localization (L10n) Tools and Consulting Solutions

Founded in 2001, Lingoport provides extensive software localization and internationalization consulting services. Lingoport’s Globalyzer software, a market leading software internationalization tool, helps entire enterprises and development teams to effectively internationalize existing and newly developed source code and to prepare their applications for localization.

For more information on how Lingoport can assist you with all of your internationalization and localization needs, please contact us at info@lingoport.com, call 303.444.8020, or complete the quote request form.

What If Internationalization Expectations Exceed Your Budget? – Significantly

Note: This article is featured in the June 2010 issue of MultiLingual Computing Magazine, in Adam Asnes’ Business Side column.

If you’re considering internationalizing a large and complex software product, there’s one thing you should be prepared for: it’s expensive. There’s just no way around it if you want an application that properly presents, inputs, transforms and reports complex data. I’m talking about applications measured in the hundreds of thousands to millions of lines of code. Seriously, you’re just not going to internationalize a sizeable application that you’ve taken years to develop with money just lying around, unless you have a lot of money lying around, which is pretty rare these days. But before we consider what to do about it, let’s consider the main reasons why you may need to internationalize:

Survival –

Your customers are increasingly global, and perhaps they use your product to reach their customers. If you’re not internationalized, you’re limiting their business. The competition and your customers will know this and will eventually eat your company alive. You’d better start finding some money.

A Sale –

There is nothing like an important customer to get an initiative moving. If this sale funds the internationalization effort, it makes things easier, though there will be commitment that will extend beyond any one customer. I’ve written before how changing your encoding will change your company. But if this sale doesn’t pay for the effort, corporate initiative will be needed.

Your company is global –

Perhaps your company is a global brand and you’ve quickly developed or acquired a product that isn’t internationalized. In this case, the decision to internationalize is usually simple. You do it because you already have a global reputation, sales and distribution. If you have to justify ROI, somebody is missing the point, there’s a temporary issue or the product isn’t showing promise.

Strategic Initiative –

This article isn’t going to be about all the strategic benefits of growing global revenues with products that leverage themselves worldwide, because you know all about that, right? But acting on strategy takes foresight, money, expertise and perseverance.

If you have any of the above situations except budget, this article is especially for you.

I’ll repeat a situation I’ve seen many times. My firm, Lingoport, will be called upon for initial consulting as a company is considering internationalization in reaction to a declared strategic objective to gain business outside a home market. They usually have one or two customers asking for just that, but perhaps there isn’t enough initial interest to finance the necessary development and localization. We go back and perform static analysis on the code using our Globalyzer software, counting the embedded strings, locale-limiting methods/functions/classes and programming patterns that will need attention and refactoring, combined with architectural changes to support locale and changes in processing.

Even with automating tasks for batch efforts like string externalization (after analysis), you still have design, engineering and testing cycles that add up to significant expense. At this point we find out just how strong the corporate global resolve is. And in some cases that resolve is just not quite ready. It’s not a lost cause by any means. In fact, almost always, it’s just a matter of time and resources, and most come around in future quarters or fiscal years. But there lies the gap for development managers.

Rarely do developers internationalize software just because it would be cool. You do see that kind of initiative for new features, where a developer might get an idea, work on it during odd or even personal time, and voila, present it to his or her company peers. I have yet to see that happen regarding internationalization (write me if you see otherwise). Still, developers and management often know the need to internationalize is there, ready to become a firm requirement any quarter now. They can go on developing new features and updating current code without going near internationalization, while actually increasing the scope of the eventual internationalization effort as they grow the code base. Or they can take some simple steps to get ready. To use an expression, “When you find yourself in a hole, stop digging.” Here’s a brief list of what you can do:

  • Gather requirements – new locale requirements will go much further than what languages will need to be supported. An architect can be tasked with learning about issues like character encoding and locale frameworks. A product marketing person can learn a bit about use cases and business logic that may alter how the product behaves in new countries. It is all too easy to underestimate the requirements phase. Locale behavior will involve quite a bit more than just string externalization. Start tallying and recording what is found in a centrally available resource, like the company wiki for all to build upon and learn about.
  • Prototype a string retrieval method. Learn about resource files and string IDs and how to make them work. Again, list your results in the company wiki.
  • Do a little reading about Unicode and its various encodings, along with appropriate technologies for their use. It’s not enough to commit to using Unicode. You have to gain some understanding of just what that means.
  • Consider your database schema and how that might change for locale support along with likely changes to character encoding.
  • Consider any third party components or open source you use within your application. Start inquiring about their internationalization support.
  • Consider internationalizing a pilot effort or component of your software if your product architecture will permit it. There’s nothing like learning by doing. And if you decide to take a somewhat different approach later, it probably won’t be too difficult to alter what you’ve already done.
  • Refine your planning – as you learn more, your planning efforts are likely to get clearer. As plans get clearer, they seem less risky and large. You’ll be in a better position to defend expected costs, resources and schedules.
  • Consider application logic. Does your software manage a process that is performed differently around the world?
  • Talk with experts – It’s not prudent to try and reinvent the internationalization process. An experienced expert, who’s really been through multiple implementations rather than just advising, can get you prepared faster and cheaper than the time it will take using your internal developers. I’ve seen companies create their own proprietary approaches that ultimately get in the way of a successful implementation. Initial consultation shouldn’t be a budget buster. Even so, there are free internationalization webinars (we give them and others do too) and excellent conferences available (e.g. Worldware and the Unicode Conferences).
  • Start measuring toward your expected outcome – If you establish internationalization development practices and measure benchmarks, you are likely to see improvements to new development without significant cost in time and money. Static analysis tools like Globalyzer create a systematic approach, but if there’s no budget, then a simple and clear inclusion of practices and expectations can go a long way.
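As one way to approach the string retrieval prototype suggested in the list above, here is a minimal Python sketch. The bundle contents, string IDs and function names are hypothetical, and the resources are kept in memory for brevity; in practice they would live in per-locale resource files in whatever format your platform favors.

```python
# Hypothetical resource bundles keyed by locale, then by string ID.
RESOURCES = {
    "en": {"greeting": "Welcome, {name}!", "save": "Save"},
    "fr": {"greeting": "Bienvenue, {name} !"},
    "fr_CA": {"save": "Enregistrer"},
}

def fallback_chain(locale: str, default: str = "en") -> list:
    """Build the lookup order, e.g. fr_CA -> fr -> en."""
    chain = [locale]
    if "_" in locale:
        chain.append(locale.split("_")[0])
    if default not in chain:
        chain.append(default)
    return chain

def get_string(string_id: str, locale: str, **params) -> str:
    """Return the first translation found along the fallback chain."""
    for candidate in fallback_chain(locale):
        bundle = RESOURCES.get(candidate, {})
        if string_id in bundle:
            return bundle[string_id].format(**params)
    # Returning the ID itself makes missing strings easy to spot in testing.
    return string_id

greeting = get_string("greeting", "fr_CA", name="Ana")  # falls back to "fr"
save_label = get_string("save", "fr")                   # falls back to "en"
```

The fallback chain is the design choice worth recording in the wiki: it lets a partially translated locale ship without blank labels, at the cost of occasionally mixing languages on one screen.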

If you do at least some of this before a highly likely internationalization requirement is actually funded, you’ll be a tremendous asset to your firm’s globalization efforts. And globalization might just be one of the more significant and company-making undertakings that your firm can embark upon.

Internationalization ROI

Note: This article is scheduled to be featured in the August/September 2009 issue of MultiLingual Computing Magazine, in Adam Asnes’ Business Side column.

It’s easy to get agreement that revenues beyond a company’s home country market are important. If you look at some of the great global US brands, you’ll find that global revenues are 50% or even greater than 65% of their gross. While much has been made of measuring the return on investment for localizing software, what about measuring the very process of making software which is internationalized so that it can be localized and supported worldwide?

There are lots of issues to measure, and they vary in emphasis from the company that is making its first efforts outside its home market to companies that have highly evolved processes for global releases.

First we must consider opportunity costs, backing up marketing and sales efforts, competitive pressures and right down to cost of engineering. Now typically ROI calculations get down to hours saved at a particular rate, which is certainly valuable information and usually those numbers are paramount to analyzing any kind of process changes. But if a company is making new efforts or experiencing painful delays in global releases, opportunity costs and major market factors are deal makers and the stuff that executive level directives are made of.

Internationalize or Die

This heading may sound dramatic, but it’s quite the case for some of our clients. For instance, we have a client whose software platform is used by third parties in e-commerce efforts. Many of their accounts are well-recognized names in retail and merchandising, who are beginning to look at markets outside the US as important to their brands. While our client is not interested in purchasing localization themselves, if they can’t make their product support data management and presentation in multiple languages and locale-sensitive formats, they will lose their customers to competitors. I asked their senior management what was at stake, and they replied nothing less than their company’s future growth and survival. Given that this is a billion dollar company, I’d say that’s a pretty big opportunity cost ROI on an internationalization effort.

Opportunity Costs

Internationalization happens because it’s first and foremost a business driver. I have yet to meet the development team that decides to internationalize just because it would be an interesting task. So I think it’s appropriate to first consider business drivers outside of the development process itself.

Perhaps global sales efforts have been taking place with a US English product. Outside of development, there are costs of sales, marketing personnel, supporting distributors, legal and administrative costs to name a few. These all have expensive price tags, which are independent of having an internationalized and localized product. And an internationalized and localized product has been shown to make those representative costs far more effective at producing revenue.

Cost of Delays

In an earlier article and subsequent whitepaper on our site, I outlined the cost of being late. The quick summary is that the marketing team will typically have projected revenues for each market, dependent upon release criteria. If a product is a single quarter late, which is not bad for a large project for some software development teams, they have just lost a quarter of their year for the sales teams to meet those projections. What’s the value of one quarter of sales effort? If those sales efforts are expected to produce increased results over time, how does that roll out and affect market penetration in future years? While these are broadly variable scenarios, I always like to consider the “top end” revenue implications before beginning to count development hour savings. The top end always has far broader consequences, and those opportunity costs get very real with numbers followed by many zeros in a competitive world.

Cutting Development Costs

Catch Bugs Early

My company, Lingoport, has just released Globalyzer 3.0, which is aimed squarely at supporting entire development organizations. It’s actually the only commercial system of its nature, purpose-built to support a very broad list of programming languages, measuring, filtering, reporting, tracking and even fixing internationalization issues over the development process via its client, server and database components. Companies have products to measure coding quality, security issues, memory management and more. Now we are adding static analysis of internationalization to the source code development process. Remember, if so much revenue is riding on global markets, doesn’t it make sense to actively measure and aid software globalization issues, just as much as software security issues? Why not check source for embedded strings, locale-limiting methods, functions and classes, Unicode compliance, font issues, i18n-limiting programming patterns and the like at regular automated intervals rather than waiting until QA or localization? Remember the management principle that if you want to improve anything, measure, track and report it as close to its creation as possible. What gets measured gets done.
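To make the idea of checking source for embedded strings concrete, here is a deliberately naive sketch in Python. It is not how Globalyzer works; the rule (a double-quoted literal containing a letter and a space is probably user-facing text), the function name and the sample code are all assumptions made up for illustration.

```python
import re

# Hypothetical, deliberately naive rule: a double-quoted literal that contains
# at least one letter and one space is treated as likely user-facing text.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def find_embedded_strings(source: str):
    """Return (line_number, literal) pairs for suspicious string literals."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in STRING_LITERAL.finditer(line):
            literal = match.group(1)
            if " " in literal and any(ch.isalpha() for ch in literal):
                findings.append((line_number, literal))
    return findings


# Made-up sample source to scan.
SAMPLE = '''label = "Save your changes"
key = "btn.save"
print(get_text(key))
error = "File not found: " + path
'''

for line_number, literal in find_embedded_strings(SAMPLE):
    print(f"line {line_number}: {literal!r}")
```

A rule this crude both over-reports and under-reports, which is exactly why filtering and tuning of results matter once a check like this runs at regular intervals.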

Cost per i18n Bug – Case Study with Mature Localization Practices

In working with a new client, which is already quite mature in their localization and internationalization efforts, we had the opportunity to get actual ROI data, based on real internationalization bug fixing costs they had measured over 60 localized products. After cleansing that information of confidential data, they gave me permission to share it though limiting the data to results from 17 products.

Traditionally, they have been finding internationalization bugs during internal and external localization QA testing efforts, including both pseudo-localization (creating fake translations for testing purposes) and actual localization testing performed by both their organization and vendors. They counted five organizations touched by internationalization errors: Localization Vendor QA, Localization Project Management, Internal Localization QA, Product Development QA and Core Engineering. The process goes something like this:

  1. Internationalization bug is discovered and reported during Localization
  2. Project Manager tracks the bug, may enter or flag it in a bug tracking system
  3. Core Engineering, which likely has moved on to other efforts by now, must assign and fix the bug
  4. Product Development QA must verify the fix and any other issues the fix may have affected
  5. Additional Localization efforts may need to be made for the same issue

This iterative process gets pretty expensive. Remember that a maxim for software development is that the earlier you find and fix bugs, the less expensive they are. Fix a bug before a QA cycle, and you save multiple people having to process that bug in some way and retest the solution. Need to fix a bug after release? Costs get much worse. This principle is a major contributor to the popularity of moving to agile development cycles, so that you are enhancing and verifying software in smaller, successful, less expensive cycles.

Our client figured on an average of 25 internationalization bugs per release, an average of 10 hours spent cumulatively by the five groups per bug, and an average of 60 releases per year over these 17 products. Some products had zero i18n bugs reported, others had over 100. The business case for finding internationalization issues in source code as part of regular automated processes integrated into their build cycle gets very clear at this level. They estimated savings of $420,000 per year, just on reducing localization QA costs. By finding the issues early, total product development savings were calculated to be over $760,000 per year.
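The averages quoted above can be multiplied out to see the scale involved. The short Python sketch below does that arithmetic. The hourly rate is a hypothetical figure chosen only for illustration; the client’s actual rates and the breakdown behind their savings estimates are not given here.

```python
def annual_bug_hours(bugs_per_release: int, hours_per_bug: int, releases_per_year: int) -> int:
    """Total hours per year spent processing i18n bugs across all groups."""
    return bugs_per_release * hours_per_bug * releases_per_year


# Averages quoted in the article.
hours = annual_bug_hours(bugs_per_release=25, hours_per_bug=10, releases_per_year=60)

# Assumption for illustration only: a blended cost of $50 per hour.
HYPOTHETICAL_HOURLY_RATE = 50
cost = hours * HYPOTHETICAL_HOURLY_RATE

print(f"{hours:,} hours per year")          # 15,000 hours per year
print(f"${cost:,} per year at ${HYPOTHETICAL_HOURLY_RATE}/hour")  # $750,000 per year at $50/hour
```

Even under modest rate assumptions, 15,000 hours a year spent on late-found bugs lands in the same order of magnitude as the savings the client estimated.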

Internationalization ROI Chart

Remember that even maturely localized products still have regular new release cycles, which in turn create the potential for new internationalization issues. Product development never really stops, and teams tend to be more broadly geographically distributed than ever before. That makes measurement tools all the more valuable for localization-savvy companies.

Cost per i18n Bug – Case Study, Product has Never Been Localized

When you consider companies engaging in early globalization efforts, the payback simply multiplies per product, as you can expect the i18n bug count to go way up. Without a tools-based approach to finding and fixing issues, internationalization will be a heavily iterative exercise in trial and error. One can write a few scripts, which will take considerable time, research and effort, and still likely produce unreliable results. Then you can pseudo-localize display strings after you’ve found and externalized as many as you can, or populate the database with target encoding data. You would then test, test and test again while you hunt down the issues one by one in the source. This only multiplies the cost per i18n bug. By finding issues first at the source level, you can actually begin to orchestrate their correction, tying directly to each issue’s precise location within hundreds of thousands, or even millions, of lines of code. And that’s an intelligent way to find and remove a needle in a haystack.

The table below illustrates the costs of i18n bug iterations for a single product of about 500,000 lines of code during the first internationalization effort. This table doesn’t include additional costs of researching and implementing various scripts and homemade utilities to help the work get done. It also doesn’t take into account that a tool like ours actually isolates i18n issues, pinpointing them in source, while also facilitating batch externalization of strings – both very tedious and time-consuming activities. Consider that even a simple error message that gets missed using traditional scripts and trial and error may not show up until, at best, late QA efforts that force the error to appear, or worse, after product release. We commonly hear that it takes three or four localization releases to weed out those sorts of issues that get missed so easily. That is why this table lists a higher i18n bug rate for 2 subsequent releases than the table used for the localization-mature company earlier in this article.
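For readers who have not seen pseudo-localization in practice, here is a minimal Python sketch of the idea. The character mapping, the bracket markers and the 30% padding are assumptions chosen for illustration, not a description of any particular tool, and this version does not protect format placeholders inside a string.

```python
# Hypothetical mapping: swap plain vowels for accented ones so that any string
# still shown in plain English at runtime was clearly never externalized.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöüÀÉÎÖÜ")


def pseudo_localize(text: str, expansion: float = 0.3) -> str:
    """Return a fake translation: accented, padded, and wrapped in brackets.

    The padding simulates text growth in translation; the brackets make
    truncated or concatenated strings easy to spot in the interface.
    """
    padding = "~" * max(1, round(len(text) * expansion))
    return f"[{text.translate(ACCENTED)}{padding}]"


print(pseudo_localize("Save"))            # [Sàvé~]
print(pseudo_localize("File not found"))
```

Running the product with strings like these exposes hard-coded text, layouts that cannot absorb longer strings, and places where the accented characters come back corrupted.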

Internationalization ROI Chart

Pitfalls and Adjustments

I think it’s fair to say that no tool offers a panacea. The strike against coding quality checkers in general has been complaints about over-reporting errors, often referred to as false positives. It’s true that if you overload a developer with data that is only partially relevant, that data risks being ignored. That is why any enterprise-scalable solution must include dynamic ways to filter results, share those filter controls and track them over time. You also must have flexible detection, so that you can add unique parameters that invariably crop up and can be quite particular to a specific code base.

New processes may not be greeted with enthusiasm by development teams which are typically already over tasked and under-resourced, so it’s important to help them understand the meaningfulness of getting global releases out faster and with higher quality. Automating code checking and reporting during a regular process like a periodic build is an excellent way to track and highlight progress.

Enterprise Internationalization and Automation

There are some technology companies where thinking globally has been fundamental to their operations for years and years. I’m referring to companies like IBM, HP, Yahoo, Google and the like. These companies all made significant investments in their global infrastructure, sales teams, products, development and strategic planning. It didn’t happen by accident. And as these companies develop new products or acquire companies, they look to leverage them across that global infrastructure quickly and profitably. Global companies are good prospects for my company in our internationalization products and services business, because they tend to be more experienced in their understanding of engineering challenges, knowing that it takes people, tools, time and money to globalize software so that they can gain the best return on their product distribution and sales infrastructure.

One very potent way to make software globalization fundamental to a company’s mindset is to make internationalization a fully integrated and automated part of software development practices. There are all kinds of tools, checkers and environments to help developers create interfaces, access and transform all kinds of information buried in databases, support coding constructs, manage memory and perform application modeling. With that in mind, we’ve been hard at work with a major new Globalyzer release, clearly aimed at supporting entire development departments and enterprises, automatically using batch processes on servers to monitor internationalization progress as well as on the desktop where issues can be individually examined and fixed. While that has always been our aim, we’re now getting there in more robust ways that track internationalization status over time over multiple programming languages and even over multiple products.

For those non-developers reading this, let me explain what I mean about automation in this context. When engineers create code, they generally all submit their work to a code repository. This repository provides version control so that when multiple engineers are all working together, they can check code in and out and merge together all their changes. Then the code has to be put together and built. This build process usually occurs on some interval, such as nightly or even on a continual basis. During this automated process, you can also automatically check for many other issues like performance, load balancing, and I’m proposing that this is a great time to check on internationalization/localization readiness by running tools on the code automatically as a batch process, which then tracks issues via reports. Now counting issues is one thing, but you can go even further by showing exactly where a problem exists in the code, along with the context of the errant issue. That information can then be brought forward for quick review and fixing.

Two companies that come to mind as doing this very thing are Intel and Yahoo. Michael Kuperstein of Intel, presenting at the WorldWare Conference in March, reported how his team developed their own internationalization toolkit a few years ago and have integrated it into many of their automated build processes. That automation has made internationalization an important and measured component of their ongoing development efforts. By Mike’s own admission, he would have used Globalyzer had he known about it years ago.

Mike McKenna of Yahoo also reported at WorldWare that his globalization team is using automation, in this case Globalyzer, to measure internationalization benchmarks on development teams.
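As a rough illustration of what an automated internationalization check during a build might do with its results, the Python sketch below compares issue counts from the current build against a stored baseline and flags any category that got worse. The category names and numbers are invented, and this is only an assumption about how such a gate could be wired up, not a description of Globalyzer, Intel’s toolkit or Yahoo’s process.

```python
def find_regressions(baseline: dict, current: dict) -> dict:
    """Return {category: (before, after)} for categories whose count grew."""
    return {
        category: (baseline.get(category, 0), count)
        for category, count in current.items()
        if count > baseline.get(category, 0)
    }


# Hypothetical issue counts from the previous and the current nightly build.
baseline = {"embedded_strings": 120, "locale_sensitive_calls": 40}
current = {"embedded_strings": 112, "locale_sensitive_calls": 43, "encoding": 2}

regressions = find_regressions(baseline, current)
for category, (before, after) in sorted(regressions.items()):
    print(f"{category}: {before} -> {after}")

# A build script could fail, or simply report, when anything regressed.
build_ok = not regressions
```

Tracking the same counts build after build is what turns a one-off scan into a benchmark that teams can be measured against.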

On the localization product side, there are multiple tools for different aspects of managing words. But when it comes to products which support an enterprise in their software internationalization efforts, there is a pretty empty playing field. Aside from some very simple string externalization utilities in a few development environments and frameworks, our Globalyzer is simply the only commercial software I know of that can automatically monitor development over time over a wide range of programming languages, while also stepping entire teams through internationalization fixes in large amounts of code.

I’ve said a few times in my columns that I’ve found that it’s quite powerful to embrace the management principle that whatever gets measured gets done and improved over time. So it follows that one of the most important aspects of any software development undertaking is that you measure the desired outcome at regular intervals. If you just hope that it will all come together in the end, you always end up late and over budget. That is ultimately behind the agile and extreme programming development movements, in that you set more frequent intervals of measurement and goals. But it’s not so easy to track something like internationalization, either as a project where you are refactoring software for new globalization requirements or even for ongoing development. Consider that developers are typically over-tasked, and often distributed across time zones and continents. Then factor in that internationalization can be quite subjective to a particular development task. Plus internationalization is a fuzzy thing, in that it is tailored to requirements, technologies and special cases. So development teams grapple with how to handle it, and make their way through the task by brute force – or simply postpone or avoid internationalization whenever possible. Issues get missed, and if you’re lucky, you have an iterative process during localization to fix internationalization bugs, which is a very expensive and time-consuming path. Or worse, development ignores the issues and calls it a localization problem.

I spoke with a company in just that situation last week. They were upset with their localization provider for poor quality, but when we examined some of the issues, there were also extensive internationalization mistakes that were sure to break localization context and execution. These included missed strings and extensive string concatenations. Had they been monitoring these efforts all along, and been clearer on internationalization requirements, they would have had better results and a clean release. The biggest costs to them were poor market entry, customer dissatisfaction and complaints from their distributors and sales teams which had to overcome a poorly localized release. Now I also feel that as vendors we have some responsibility in taking care of clients and not selling them a solution that risks poor quality and a weak market entry, so some blame also goes to the localization provider. But I hardly know what really happened, I was just there to offer help in picking up the pieces. Clearly that’s an expensive route in many ways.
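String concatenation is worth a short illustration, since it is one of the most common ways code breaks localization. The Python sketch below contrasts a concatenated message with a single template that uses named placeholders. The message catalog, function names and sample values are hypothetical; the point is only that a translator can reorder placeholders inside one complete sentence, but cannot reorder fragments that the code glues together.

```python
# Hypothetical message catalog: one complete sentence per locale, with named
# placeholders that each translation is free to reorder.
TEMPLATES = {
    "en": "{user} deleted {filename}",
    "de": "{filename} wurde von {user} gelöscht",
}


def concatenated_message(user: str, filename: str) -> str:
    """Problematic: word order is fixed in code, and only a fragment is translatable."""
    return user + " deleted " + filename


def localized_message(locale: str, user: str, filename: str) -> str:
    """Better: the whole sentence comes from the catalog for the given locale."""
    return TEMPLATES[locale].format(user=user, filename=filename)


print(concatenated_message("Ana", "report.pdf"))     # Ana deleted report.pdf
print(localized_message("en", "Ana", "report.pdf"))  # Ana deleted report.pdf
print(localized_message("de", "Ana", "report.pdf"))  # report.pdf wurde von Ana gelöscht
```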

Remember, internationalization is often run by a different crew than localization. Software developers are upstream from localization, and they are sometimes all too disconnected from final localized product releases. Localization is often someone else’s problem and engineers are focused on getting a release out with all its new features. They don’t know what they don’t know, which is only human. That leaves localization teams waving their arms around trying to get the developers to build software right the first time. And those teams likely have no way to measure whether the product they are tasked with localizing actually passes internationalization muster until they go through localization testing. Again, that’s very late and expensive in a software development process, and more often than not, localization testing tends to be underfunded and vendor-dependent. You’re going to have trouble finding everything. So for localization teams, what I’m suggesting is to consider a kind of automated litmus test. When code comes to the localization group, scan the code for internationalization issues, and consider what’s found. The technology is now there to do this in detail and examine each potential issue, quickly and easily. So in the worst case, you can at least have engineering fixing internationalization bugs during the localization process rather than when it’s far more expensive.

Again, anything that measures and sheds light on the situation will also result in improvements. So if you want well-globalized software, better start measuring how that code is developed, not just what it’s costing to localize it.

P.S. I’m thinking of writing a column on funny ways people fell into the localization business. If you have a good story you’d like to share, please contact me!

Corruption! Creating an ìèíèñòð Opportunity

by Adam Asnes, President, Lingoport
As appeared in Multilingual Magazine

Chances are you’ve seen corrupted data, but perhaps didn’t think too much about it unless you’re a localization engineer. Most people see it first in their spam, coming with promises of Euro-Lottery millions or other nefarious offers. The corruption evidence is in the square boxes or random nonsensical characters that fill the subject heading or email body, if you haven’t deleted it already. What’s happening is that somewhere along the way, or in your mail client, the character encoding the message is written in is not being supported. Obviously you wouldn’t feel very confident using a product, site or system that suffers this same issue, so it’s a clear defect. Sometimes you even see it when everything is still all English, most notoriously when somewhere along the way the software system you are using can’t process a simple apostrophe.

Remember that all data on computers ultimately breaks down to zeros and ones. These values are then interpreted to form characters and then strung together as words or symbols. Corruption occurs when the interpretation of the encoded zeros and ones does not form the intended character. For example, the application thinks the encoding of a character is ISO-Latin 1 rather than UTF-8 and so displays the wrong character. We have run into several internationalization services customers over the years that have inadvertently corrupted character data buried within large databases. Here’s an example of how bad this can get:

Imagine your company is a world leader for building heavy machinery and construction equipment. You have a massive parts catalog. Over time, an unknown amount of data has experienced character corruption. The characters are no longer humanly readable. They look like gobbledygook. Or, you have a complex online customer management system with a large database of users and corresponding account information with broken character encodings sprinkled throughout.

In each case there are too many occurrences peppered throughout the data to review and manually decipher what the original intent of the content was. You can imagine the panicked conversations when the broken characters are discovered. “Oh σηιτ, look at this! How the φυχκ are we going to fix this!”

Often the instances are too scattered and it’s too difficult to roll back to previous versions of the data, as everything new would be lost, and it may not be known just when the character corruption might have started happening.

The corruption occurs in the first place when some step in the application, the process or the review of data breaks the encoding. For example, developers may have implemented a web page form that isn’t properly set up to return data in the correct encoding. Another possibility is that someone manually imported new data into the database, but used an editor that is not set up to handle, say, UTF-8 encoding. The culprit might be as innocent as using Notepad incorrectly.
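The mismatch described earlier, where bytes written as UTF-8 are read back as ISO-Latin 1, is easy to reproduce. The short Python sketch below shows it with a made-up sample word. The second case uses Windows code page 1252, a close relative of ISO-Latin 1, which is my assumption for how the familiar broken apostrophe usually arises.

```python
# Text is stored as UTF-8 bytes...
original = "café"
stored_bytes = original.encode("utf-8")       # b'caf\xc3\xa9'

# ...but some step reads those bytes as ISO-Latin 1 instead.
misread = stored_bytes.decode("latin-1")
print(misread)                                # cafÃ©

# The same mistake with a typographic apostrophe, read as Windows-1252
# (assumption: the reading side uses code page 1252).
apostrophe_misread = "it’s".encode("utf-8").decode("cp1252")
print(apostrophe_misread)                     # itâ€™s
```

Nothing is lost at this point; the bytes are intact and only their interpretation is wrong. The damage becomes permanent-looking once the misread text is saved again.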

At this point, this conversation has happened with clients several times a year, and in every case, these clients already happened to be working with us in some capacity, whether on service projects or licensing our Globalyzer software. I suspect the problem isn’t actually all that uncommon. So we finally decided to take some of the advice I’ve been trumpeting in this column and productize some of our solutions. At the time of this writing, we haven’t decided on a product name yet, so we affectionately call this solution The Decombobulator. We’ll probably officially release it as something boring like db Ambassador, but we’ll always call it the Decombobulator internally because it sounds funnier. Check our website to find out if humor or practicality wins out (remember that we are probably the only company using an icon of a toilet plunger as part of an interface and utility names like PseudoJudo). In fact, I encourage you to contact me if you’d like to vote on it or suggest a better name.

So here’s how we solve this problem. The Decombobulator runs on your data or database, reviewing characters at the byte level and reporting the results. It then helps you compare character encoding to the intended encoding and then reports, suggests and helps automate the correction back to what the character was intended to be.

Here’s an example using corrupted names from a database which initially had problems with some cases of extended characters:

Internationalization tools can help prevent character encoding corruption.
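To give a feel for the kind of correction involved, here is a minimal Python sketch that handles just one corruption pattern: UTF-8 text that was misread as ISO-Latin 1 and then saved. It is not the Decombobulator’s actual logic, which is not described here; the function name and the sample names are invented for illustration.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 read as ISO-Latin 1' corruption, if present.

    If the text cannot be reinterpreted that way, it is assumed to be fine
    (or damaged in some other way) and is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Hypothetical rows from a customer table: some corrupted, some fine.
names = ["JosÃ©", "MÃ¼ller", "José", "Smith"]

for name in names:
    fixed = repair_mojibake(name)
    status = "repaired" if fixed != name else "unchanged"
    print(f"{name!r} -> {fixed!r} ({status})")
```

Real data is messier: several different wrong encodings may have been applied, sometimes more than once, which is why reviewing characters at the byte level and reporting before correcting matters.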

I’ll add that we’ve seen strings that clients have submitted to their localization vendor which also have the same types of instances of corruption. Often this happens when someone opens a file, just to check that the data is there in the first place, but then saves it again without the proper character encoding settings. The localization firm then has a number of isolated strings, perhaps including past translations, which are now broken.

I’m not illustrating all this as a sales pitch. I somehow doubt we’ll sell very much of the Decombobulator, but for the people that need it, it will be a lifesaver. In fact, much of the development and productization of the Decombobulator happened without my knowledge and even in part against my intentions. One of our team just took it upon himself to take extra time while getting his other work done, to enhance what we had and put it together. I bring this all up because in your business, you likely encounter some problems just like this which are just begging for a repeatable and scalable approach that will make you a savior to your client or coworkers. And if you can repackage it for the benefit of your organization or clientele, you’ve just created a significant differentiating value. That’s what people love to buy, whether it’s you selling your continued employment or cementing a client relationship. This doesn’t mean you learn software development on the side if you’re not a developer. Every process presents its own opportunities.

The economy is rough out there. I won’t bother parroting what you’re no doubt reading. It may be that one of the few bright spots is still the language services and technology industry. I talk to quite a few CEOs of localization companies and they all seem to be reporting that business is holding up, but they are crossing all their fingers and toes that it stays that way. If I were in the automobile or furniture business in the US, I’d be beyond scared. But the fact is that the entire language computing industry directly connects to helping technology firms make more money. Notice I didn’t say save money. While that’s important too, making money always wins. So the way that we differentiate our industry, for our clients and co-workers, is by innovating in ways that get work done faster, better and cheaper, so that someone can sell something more effectively anywhere in the world. And that’s just great business.