Apple’s iOS Character Bug and Internationalization (i18n)
It’s rare that an i18n issue makes the news. In fact, in a meeting a few years ago with an i18n and localization team, one member lamented just that: “Nobody gets fired for a character corruption issue. That just doesn’t make the news.” The context was that security issues get the attention and, with it, gobs of budget. An i18n or localization (L10n) shortcoming brings nothing like the fear of a breach, lawsuits and public humiliation. However, the problems can still be insidious.
Let’s look at Apple’s iOS bug. It actually did make news, but could have easily been missed in this week’s tumultuous news cycle.
As reported, when a user inserts a particular character from the Telugu language (India) on iOS or macOS, the system crashes hard. This can even happen from within applications running on those systems. The character looks like this:
i18n and L10n Matter
Going back to our security bug comparison, let’s consider carrots and sticks. Security is a stick. Don’t handle it right and you get beaten. But if you perform i18n and L10n well, and you have a good product, you’re going to see a different kind of reward. This is no different from the benefits of paying attention to usability in your user interface. Software that works and behaves elegantly has a competitive advantage over applications that do not. So it is for products that work well in any language and locale preference. Yes, there are some markets that are more tolerant of US English, but they will still need all kinds of other locale formatting for a multitude of data like dates, numbers and addresses.
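To make the point about locale formatting concrete, here is a minimal sketch in Python. The `CONVENTIONS` table and both helper functions are hypothetical and hand-written for illustration; they cover only two locales and common conventions, and a real product would more likely rely on a library backed by locale data than on a table like this.

```python
from datetime import date

# Hypothetical, hand-written conventions for illustration only.
CONVENTIONS = {
    "en_US": {"group": ",", "decimal": ".", "date": "{m:02d}/{d:02d}/{y}"},
    "de_DE": {"group": ".", "decimal": ",", "date": "{d:02d}.{m:02d}.{y}"},
}

def format_number(value, locale_id):
    """Format a number with the grouping and decimal separators of locale_id."""
    conv = CONVENTIONS[locale_id]
    text = f"{value:,.2f}"  # US-style baseline, e.g. "1,234,567.89"
    # Swap separators through a placeholder so the two replacements do not collide.
    return (
        text.replace(",", "\0")
        .replace(".", conv["decimal"])
        .replace("\0", conv["group"])
    )

def format_date(day, locale_id):
    """Format a date using the day/month order of locale_id."""
    return CONVENTIONS[locale_id]["date"].format(d=day.day, m=day.month, y=day.year)

release = date(2018, 2, 3)
for locale_id in CONVENTIONS:
    print(locale_id, format_number(1234567.891, locale_id), format_date(release, locale_id))
```

The same number and the same date come out looking quite different, and 02/03/2018 versus 03.02.2018 is exactly the kind of ambiguity that confuses users when the locale is ignored.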
Want to grow? Go where the people are. Look at Facebook, with 87% of its users outside North America (counted here as the US, Mexico and Canada). Netflix has been growing consistently by expanding its presence worldwide. Even consider technical products, whose managers perhaps had the excuse that system administrators have to learn English anyway. Have a listen to our webinar recording with Anna Schlegel, who leads Globalization efforts at NetApp. They place great strategic importance on their i18n and L10n efforts as a critical product strategy, and not just a checkmark.
If It’s So Important, Why Is It Hard?
There are many reasons why both i18n and L10n can be challenging. Developer teams are tasked with steadily implementing new features and fixes. I18n requirements are often not fully understood. You have organizational turnover, far-flung teams, and uneven understanding of i18n, and perfectly functional development may deliver the feature but not the locale requirement. Testing may or may not address i18n and L10n within the same sprint. Then figure that you have multiple sprints, and often several source control branches, being fired off concurrently. Even more experienced companies with their own internal technology investments tend to have a jury-rigged series of scripts that still depend on human action and are subject to process error and delay.
This is where our Lingoport Suite software for managing i18n and L10n can make for significant gains in quality, time to market and development savings. If you find i18n issues during the day’s development tasks, you can fix them easily and quickly, without any backlog or impact on your development velocity. The same goes for localization changes. Automate those, and you can stay right on target and never have to search for changes and updates to files. If one little string changes, it’s no big deal. The update is automatically sent out for translation and brought back into your code. Plus it’s all visible in a dashboard and controllable even through collaboration tools like Slack. Even the QA team has continuously updated test cases, so that US English (or whatever your home language may be) is just another locale.
Back to the Apple Bug
I did a little reading and reached out to a few experts beyond our team to pin down what might have happened at Apple. Their situation is probably not a simple case of using some deprecated function or locale-unsafe class. Apple does support Unicode, after all. It’s a bit surprising that one Unicode character out of some 55,000 in the first Unicode plane would cause such problems. As a (useless) guess, something is going wrong at the OS level when the character is processed and displayed. Perhaps the character processing algorithm, in this case, leads to a buffer overflow. Even if you don’t expect to converse in Telugu, nefarious types are using the character in text bombs to disable Apple devices. A fix is forthcoming.
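One reason processing at this level is intricate is that what a reader perceives as a single character can be stored as several Unicode code points, which the text engine must combine before display. The sketch below uses Python’s standard `unicodedata` module to list the code points in an illustrative Telugu cluster. The cluster is an assumption chosen for demonstration and is written with escapes; it is not claimed to be the exact text from the bug report.

```python
import unicodedata

# Assumption: an illustrative Telugu cluster, NOT claimed to be the exact
# text from the bug report. It typically renders as one conjunct shape.
cluster = "\u0C1C\u0C4D\u0C1E\u0C3E"

def describe(text):
    """Return (code point, Unicode name) pairs for every code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(cluster):
    print(code_point, name)

# Number of code points, which is not the number of shapes a reader sees.
print(len(cluster))
```

Code that assumes one code point equals one displayed character (for truncation, cursor movement, or buffer sizing) is making a guess that does not hold for scripts like Telugu.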
Why This Matters
It’s unlikely that you’ll run into a bug this complex within typical application development. But this is an excellent illustration that it’s far less painful to get i18n and L10n right before release. Have good systems for finding and fixing issues like embedded strings, concatenations, functions/classes that aren’t locale safe, character encoding bottlenecks, and programming patterns that mess up your intended results. Then take out the file-nanny busywork around localization updates so every sprint is easily localized. Win over your worldwide customers with software that’s up to date with their own preferred locale behavior and language.
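As an illustration of two of the issues listed above, embedded strings and concatenations, here is a minimal Python sketch. The `MESSAGES` table, the `files_deleted` key and the `translate` helper are all hypothetical names made up for this example; real projects usually keep such strings in resource files and handle plural forms as well, which this sketch ignores.

```python
# Problem pattern: an embedded English string built by concatenation.
# The word order is baked into the code, so a translator cannot rearrange it.
def deleted_message_concat(user, count):
    return user + " deleted " + str(count) + " files."

# Hypothetical externalized messages: each locale owns a complete sentence
# with named placeholders, so word order can differ per language.
MESSAGES = {
    "en": {"files_deleted": "{user} deleted {count} files."},
    "de": {"files_deleted": "{user} hat {count} Dateien gelöscht."},
}

def translate(locale_id, key, **params):
    """Look up the message template for locale_id and fill in its placeholders."""
    return MESSAGES[locale_id][key].format(**params)

print(deleted_message_concat("Anna", 3))
print(translate("en", "files_deleted", user="Anna", count=3))
print(translate("de", "files_deleted", user="Anna", count=3))
```

In the German sentence the verb moves to the end, which the concatenated version has no way to express without changing code.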
Bug report: https://www.openradar.me/37458268
Interesting analysis: https://manishearth.github.io/blog/2018/02/15/picking-apart-the-crashing-ios-string/