Product Tip: Finding and Externalizing Strings in Large Amounts of Code

The Plunger Botton: Sucking Strings from Software

It’s a point of cavalier pride that we figure Globalyzer, a leading software internationalization tool, is the only commercial software that features a toilet plunger in its interface. Obviously that flies in the face of internationalization (i18n) convention regarding use of culturally sensitive images in software. But if you’ve ever had to find and externalize strings without Globalyzer, you understand the metaphor pretty quickly.

Finding and externalizing strings in large amounts of code without Globalyzer is repetitive, tedious, error-prone and really not very fun at all. It can cause a serious distraction from other product critical feature development and bog teams down.

The Plunger dates back to us sitting around and trying to think what image makes sense for string externalization. At first, the Plunger was a funny joke. Next thing we knew, we were paying a graphic artist to draw it up, along with the rest of our buttons. We all still get a chuckle out of it. But enough about us, here’s why that button is so important and how to make it work best for you:

One of the important productive contributions that Globalyzer can make to internationalizing existing code is accurately finding and externalizing interface messages, otherwise known as strings. For any readers that might not be familiar with what strings are and why they are a pain, here’s a simple explanation: strings are messages, words (and I’ll lump in images) that are part of the interface of a product. If these words, messages and images are left in source code, they present a technical challenge for a translator to implement a translation without breaking the code.

Plus, even if you do successfully translate without first extracting the strings, and you happen to be really lucky or talented and not break the code, then you end up with a whole new version of your code to support.

Years ago, it was more common to see companies make this mistake. Now we still see it as a legacy of companies having distributors or agents manage adapting their products for various locales. We do still see companies not realizing that as multi-locale data comes in and must be processed by their applications, things break regarding data storage and manipulation, in addition to just display issues, but that’s another story for another day.

Finding strings buried in tens, to hundreds of thousands of lines to millions of lines of code is challenging. Significant efforts were undertaken, and we undergo continual optimization within Globalyzer to solve that problem. It’s important to distinguish actual interfacing messages from programmatic issues such as database queries or debug statements. So Globalyzer lets you build and create special rules around string detection, in addition to providing many default detection and filtering capabilities.

Once you’ve found the strings, you need to put it in a separate file (e.g. properties, resources, .resx), and in its place, put a function in the code that says exactly where that string is, and tells the application to go get it. That’s where the Plunger comes in. Globalyzer’s GIDE interface let’s you visually inspect all the strings detected. You can move from string to string, while also linking a source code view. When you are ready, you simply select the string and hit the plunger button. The string is sucked out of the code, the command to get the string is put in its place, and Globalyzer generates and tracks numeric key values managing that string. All the string “bookkeeping” is done for you. Plus you can optionally insert a comment including the original string so you can see it in the context of your code.

Extracting Multiple Strings

Once you really get going on string externalization, you can use the multiple extraction Plunger button, shown above. You still need to visually inspect strings using Globalyzer’s GIDE to make sure that they aren’t concatenated or something you don’t want to externalize. However, this little button lets you externalize and automatically manage hundreds to thousands of strings at a time. Using Globalyzer, we’ve had customer development teams tell us that they could now find and extract in an afternoon, what had previously taken 6 weeks or more (plus costing release delays and the loss of hair), when they were doing it on their own, even when using simple utilities in their preferred IDE.

Even if you think you’ve already found and extracted all the strings in your source code, chances are good some have slipped through. In fact Lingoport is often hired to find and fix string issues in code that has been globalized previously. It’s just hard to find it all without a system like Globalyzer, and so strings sneak through, resulting in users seeing things like error messages in a language they don’t have command over. The result is a damaged perception of the product, plus a possible call to support.

Plunger Caveat

It’s important to remember that you still have to fix any string concatenation before extracting strings into resource, properties or resx files. Globalyzer provides help for that too.

String Extraction Supported Programming Languages

Globalyzer 2.3 supports string extraction for java, jsp, html, c#, aspx, asp, c/c++, php and Delphi programming languages. If you’re using something else, we can provide custom string externalization extensions to Globalyzer and do so in a timely and cost efficient manner.

Related Posts

Globalization Resources