i18n Terminology Guide

Understanding i18n terminology is an important part of the internationalization process. By learning what key i18n terms mean, you’ll be better equipped to communicate effectively with developers, translators, and designers to ensure that your software is adapted efficiently for different global markets.

Agile is a project management and software development methodology that breaks projects down into sprints and emphasises adaptive planning and continuous improvement.

An Agile approach is beneficial for the i18n process as it enables software design and development that is adaptable to different languages, cultures, and regions.

ASCII is an acronym for the American Standard Code for Information Interchange, a character encoding standard developed in the 1960s by the American Standards Association, the predecessor of the American National Standards Institute (ANSI).

As one of the most widely used character encoding schemes for representing text in computers and communication systems, ASCII is especially relevant to effective i18n.
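
As a rough illustration of why ASCII's limits matter for i18n, the Python sketch below checks whether a string fits within ASCII's 128 code points. The helper name is_ascii_only and the sample strings are assumptions made for this example.

```python
def is_ascii_only(text: str) -> bool:
    """Return True if every character falls in the 7-bit ASCII range (0-127)."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(is_ascii_only("Hello"))  # True
print(is_ascii_only("Grüße"))  # False: ü and ß are outside ASCII
```

Text that fails this check needs a wider encoding, such as one of the Unicode encodings, before it can be stored or transmitted safely.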

BiDi is short for bidirectional text and refers to the handling and rendering of languages that use both left-to-right (LTR) and right-to-left (RTL) scripts within the same context.

BiDi support is a crucial aspect of i18n, particularly for the display and readability of languages such as Arabic, Hebrew, Persian, and Urdu, which predominantly use RTL scripts.
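
One way to see how software tells LTR and RTL characters apart is to look at the bidirectional class that Unicode assigns to each character. The sketch below uses Python's standard unicodedata module; the helper name bidi_classes and the sample text are assumptions for illustration.

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin "a", Hebrew alef (U+05D0), Arabic alef (U+0627)
sample = "a\u05d0\u0627"
print(bidi_classes(sample))  # ['L', 'R', 'AL']
```

Here L marks a left-to-right character, while R and AL mark right-to-left characters (AL is used for Arabic letters). A rendering engine uses these classes to decide the visual order of mixed text.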

Bitbucket is a web-based hosting service for version control repositories. It now supports Git only, having ended its Mercurial support in 2020. Bitbucket provides a platform for software development teams to collaborate, manage source code, track changes, and facilitate code reviews.

While Bitbucket itself is not directly related to the internationalization (i18n) process, it can play a role in supporting the i18n workflow and collaboration within development teams.

A character set is a defined collection of characters and symbols used by a particular writing system or language, encompassing alphabets, numerals, punctuation marks, symbols, and other graphical elements.

Character sets, sometimes called character repertoires, play a vital role in the i18n process by supporting and representing the characters needed for different languages and scripts.

CJK

CJK is an initialism for Chinese, Japanese, and Korean, a group of languages sharing similarities in their writing systems, character sets, and typographic conventions.

CJK languages present unique challenges in the internationalization (i18n) process due to their complex and diverse character sets.

Code branching is the creation of separate versions of a software codebase or code repository to test new features or work on bug fixes.

In the context of i18n, code branching can create separate versions of the codebase for each language that the software is translated into.

Concatenation combines or joins two or more strings, text fragments, or variables together into a single string. It is a common operation in programming and text processing, allowing the creation of longer strings by merging shorter ones.

Concatenation can cause i18n issues with string variables, pluralization, gender, and subtle changes in formatting, which are often difficult for developers to identify without the appropriate tools, resulting in costly delays.
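
The word-order problem can be sketched in a few lines of Python. The message catalog, its keys, and both function names below are hypothetical and exist only for this example.

```python
# Hypothetical message catalog; the translations are illustrative only.
MESSAGES = {
    "en": "{user} uploaded {count} files",
    "de": "{count} Dateien wurden von {user} hochgeladen",
}


def concatenated(user: str, count: int) -> str:
    # Word order is frozen in the code, so a translator cannot rearrange it.
    return user + " uploaded " + str(count) + " files"


def templated(lang: str, user: str, count: int) -> str:
    # Named placeholders let each translation put values where its grammar needs them.
    return MESSAGES[lang].format(user=user, count=count)


print(concatenated("Ana", 3))     # Ana uploaded 3 files
print(templated("de", "Ana", 3))  # 3 Dateien wurden von Ana hochgeladen
```

The templated version still does not handle pluralization or gender; it only shows why whole sentences with placeholders are easier to translate than fragments joined in code.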

Crowdsourcing is the practice of obtaining ideas, services, or content from a large and diverse group of people, typically through an open call or invitation. It involves distributing tasks or problems to a crowd or community, often facilitated through online...

Culturalization is the practice of adapting a product, service, or content to a specific culture or target audience, considering cultural preferences, norms, values, and expectations.

Culturalization ensures a product or content is culturally appropriate and resonates with the target audience. Culturalization is a subset of localization, which is the broader process of adapting a product or content to a specific locale or market.

An embedded string, also known as an interpolated string or template string, is a feature in programming languages that embeds expressions or variables within a string literal. Embedded strings are a convenient and concise way to include dynamic or computed values without the need for complex string concatenation.

Embedded strings are relevant to software localization, which often involves translating user-visible strings into different languages.

Character encoding represents characters from diverse writing systems (such as alphabets, ideographs, and symbols) as binary data that computers can understand and process. Character encoding assigns numerical codes to each character, enabling their storage, transmission, and display in digital systems.

Character encoding is essential to internationalization (i18n), ensuring that text can be accurately represented, stored, transmitted, and rendered across different platforms, devices and software.
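
A short Python sketch can show both halves of this: the same character maps to different bytes under different encodings, and decoding with the wrong encoding garbles the text. The variable names are assumptions for this example.

```python
text = "é"  # U+00E9

utf8_bytes = text.encode("utf-8")      # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # one byte: 0xE9

# Decoding UTF-8 bytes as Latin-1 produces the familiar garbled output.
garbled = utf8_bytes.decode("latin-1")

print(utf8_bytes.hex(), latin1_bytes.hex())  # c3a9 e9
print(garbled)                               # Ã©
```

This is why the sender and receiver of any text need to agree on the encoding in use.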

g11n

g11n is a numeronym for the term globalization, which, in the localization industry, is the practice of adapting products or services to suit multiple target markets and cultures on a global scale.

g11n encompasses strategic and operational activities aimed at making products or services culturally and linguistically suitable for different regions and countries.

Git

Git is a distributed version control system (VCS) allowing multiple developers to collaborate on a project, track changes, and manage source code efficiently. The name is not a true initialism, although it is sometimes expanded as “Global Information Tracker.” Git was created by Linus Torvalds in 2005 to support the development of the Linux kernel and is the basis of hosting platforms such as GitHub, GitLab, and Bitbucket.

Git repositories are crucial for i18n workflows, allowing teams to easily share and collaborate on code.

GitHub is a popular web-based hosting platform for Git repositories. Git itself is a distributed version control system (VCS) allowing multiple developers to collaborate on a project, track changes, and manage source code.

GitHub is popular with i18n developers, given the platform’s additional collaborative features, such as team collaboration, code review, project management, and community engagement.

GitLab is a web-based DevOps platform hosting Git repositories and supporting the entire software development lifecycle, including source code management, continuous integration and deployment (CI/CD), issue tracking, project management, and collaboration.

GitLab is similar to GitHub and, given the value of the platform’s features, is also popular with i18n developers.

Within the localization industry, globalization (often abbreviated to g11n) is the practice of adapting products or services to suit multiple target markets and cultures on a global scale.

Globalization encompasses strategic and operational activities aimed at making products or services culturally and linguistically suitable for different regions and countries.

Globalyzer is a software internationalization (i18n) and localization (L10n) management tool developed by Lingoport to help software development teams identify and address internationalization issues in their codebase, making it easier to prepare software applications for localization and global markets.

Hard coding embeds data directly into the source code of a program or other executable object, as opposed to obtaining the data from external sources or generating it at runtime.

Hard coded strings, especially those containing user-visible text, present challenges during the localization process as application text is difficult to extract and translate.
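
A common alternative to hard coding is to look text up by key in an external resource. The sketch below keeps the resources in a Python dictionary for brevity; in practice they would usually live in separate resource files. The keys, translations, and the get_string helper are hypothetical.

```python
# Hypothetical resources; a real project would load these from resource files.
RESOURCES = {
    "en": {"save_button": "Save", "cancel_button": "Cancel"},
    "fr": {"save_button": "Enregistrer"},
}
DEFAULT_LANG = "en"


def get_string(key: str, lang: str) -> str:
    """Look up a UI string by key, falling back to the default language."""
    return RESOURCES.get(lang, {}).get(key, RESOURCES[DEFAULT_LANG][key])


print(get_string("save_button", "fr"))    # Enregistrer
print(get_string("cancel_button", "fr"))  # Cancel (not yet translated)
```

Because the code refers only to keys, translators can work on the resources without touching the source code.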

i18n

i18n is a numeronym for the word internationalization, the process of designing and developing software applications or products in a way that allows them to be easily adapted and localized for different languages, regions, and cultures.

i18n makes software capable of handling diverse linguistic and cultural requirements.

ICU (International Components for Unicode) Formatting is a powerful library and set of APIs providing comprehensive support for formatting and manipulating internationalized text, dates, times, numbers, and other data in software applications.

ICU formatting plays a crucial role in the internationalization (i18n) process by ensuring consistent and accurate data representation across different languages, regions, and cultural conventions.

An ideogram or ideograph is a symbol that represents an idea or concept, independent of any particular language and of specific words or phrases. Some ideograms are comprehensible only by familiarity with prior conventions; others convey their meaning through pictorial resemblance...

In-context reviewing looks at translated content within the context of the original content, side-by-side with the translated content.

In-context review is an important part of the i18n process as it allows the reviewer to identify any potential problems, such as inaccurate or unnatural-sounding language.

An Input Method Editor, often abbreviated to IME, is a software component allowing users to input text in languages that standard QWERTY keyboards don’t directly support, such as East Asian languages with a large number of characters.

IMEs use key combinations to output characters.

Internationalization, often represented by the numeronym i18n, is the process of designing and developing software applications or products in a way that allows them to be easily adapted and localized for different languages, regions, and cultures.

Internationalization makes software capable of handling diverse linguistic and cultural requirements.

Often represented by the numeronym L10n, localization refers to adapting a software application, website, or product to meet the language, cultural, and functional requirements of a specific target market or locale.

Localization makes necessary modifications to the software, content, and design to ensure that it resonates with the target audience in their native language and correct cultural context.

Legacy code is existing software code that is outdated, or in use for so long that it may no longer meet modern coding standards or practices. Legacy code is often written in older programming languages, uses outdated frameworks or libraries, or lacks proper documentation.

As legacy code is typically difficult to maintain, modify, or extend, it can make the internationalization (i18n) process harder.

LLM

Large Language Models, often abbreviated to LLMs, are a type of artificial intelligence (AI) increasingly used in localization for automated translation and localization tasks, natural language processing, and content generation.

A locale refers to a set of parameters that define the cultural, linguistic, and formatting conventions specific to a particular region or language. Common locale parameters include language, date and time formats, number formats, currency symbols, and other regional preferences.

Locale is crucial to i18n as the parameters are referenced by software applications and operating systems to adapt their behavior and presentation to the specific needs and expectations of users in different regions.
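
To make the idea of locale parameters concrete, here is a minimal Python sketch with a hand-written table of conventions for two locales. The table, its field names, and both helper functions are assumptions for illustration; real applications normally rely on a locale database rather than a hand-maintained table.

```python
from datetime import date

# Hypothetical, hand-written conventions for two locales.
LOCALE_CONVENTIONS = {
    "en-US": {"date": "%m/%d/%Y", "decimal": ".", "group": ","},
    "de-DE": {"date": "%d.%m.%Y", "decimal": ",", "group": "."},
}


def format_date(d: date, locale_id: str) -> str:
    return d.strftime(LOCALE_CONVENTIONS[locale_id]["date"])


def format_number(value: float, locale_id: str) -> str:
    conv = LOCALE_CONVENTIONS[locale_id]
    us_style = f"{value:,.2f}"  # for example 1,234,567.89
    # Swap the separators through a temporary placeholder character.
    return (
        us_style.replace(",", "\x00")
        .replace(".", conv["decimal"])
        .replace("\x00", conv["group"])
    )


print(format_date(date(2024, 3, 5), "en-US"))  # 03/05/2024
print(format_date(date(2024, 3, 5), "de-DE"))  # 05.03.2024
print(format_number(1234567.891, "de-DE"))     # 1.234.567,89
```

The same date and number are presented differently depending only on the locale identifier passed in.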

Localization of software, often referred to as L10n (where “10” represents the number of omitted letters between “L” and “n”), is the process of adapting software applications or products to make them linguistically and culturally suitable for a specific target...

Localyzer is a Lingoport software product that bridges the gap between software development and the localization processes.

Localyzer is geared toward organizations launching their software in new international markets, automating laborious localization tasks such as the exchange of resource files and completed translations with vendors or translation teams.

Machine Translation, often abbreviated to MT, is the automated process of translating text or speech from one language to another using computer software or algorithms without human intervention.

MT uses computational methods and linguistic models to analyze the source language and generate an equivalent translation in the target language.

Merging is the process of combining changes from one branch of source code with another. Merging integrates the code changes made in one branch, often referred to as the “source branch,” into another branch, typically known as the “target branch.”

The merging process allows developers to consolidate changes and resolve conflicts, playing a significant role in the i18n process by incorporating localized content into the software.

A numeronym is a number-based word often used as an abbreviation to simplify long and often clunky technical terms. L10n is a numeronym; the 10 refers to the number of characters between the letters l and n in the expanded word – localization.

i18n (internationalization) and g11n (globalization) are other examples of numeronyms widely used within the localization industry.
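
The counting rule behind these numeronyms is simple enough to sketch in Python. The function name numeronym is an assumption for this example.

```python
def numeronym(word: str) -> str:
    """Keep the first and last letters and replace the rest with their count."""
    if len(word) < 3:
        return word  # nothing to abbreviate
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n
print(numeronym("globalization"))         # g11n
```

By convention, the localization numeronym is usually written with a capital L (L10n) so that the lowercase letter is not mistaken for the digit 1.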

Pseudo-localization is a technique used in internationalization (i18n) to simulate the effect of translating software or content into another language and assess the quality of the outcome.

Pseudo-localization replaces the source language strings with placeholder text that mimics the characteristics of the target language, such as length, character sets, and formatting.
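
A very small pseudo-localizer might look like the Python sketch below. It swaps vowels for accented forms, pads the string to simulate text expansion, and wraps it in brackets so that truncation is easy to spot. The function name, the 30 percent expansion default, and the bracket markers are assumptions, not a standard.

```python
# Map plain vowels to accented ones so the text stays readable but looks foreign.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_localize(text: str, expansion: float = 0.3) -> str:
    accented = text.translate(ACCENTED)
    # Pad to simulate translations that run longer than the source text.
    padding = "~" * max(1, round(len(text) * expansion))
    # Brackets make clipped or truncated strings visible in the UI.
    return f"[{accented}{padding}]"


print(pseudo_localize("Save file"))  # [Sávé fílé~~~]
```

This sketch does not protect placeholders such as {name}; a practical tool would need to leave those untouched.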

A pull request is used in version control systems, such as Git, to propose changes made in one branch of a repository to be merged into another branch. Pull requests allow for collaboration and code review among team members.

Quality Assurance, commonly called QA, ensures that a product or service meets specified quality standards.

In the context of i18n, QA plays a crucial role in verifying the accuracy, functionality, and usability of localized software or content across different languages and cultural contexts.

Scan rules, also known as scanning rules, are a set of predefined or custom-defined rules used in internationalization (i18n) and localization processes. Scan rules identify and extract translatable strings or other elements for localization.
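
As a loose illustration, scan rules can be pictured as named patterns applied to each line of source code. The two regular expressions and the scan helper below are hypothetical and far simpler than the rules a dedicated i18n tool would apply.

```python
import re

# Hypothetical rules: a name plus a pattern to look for in each source line.
SCAN_RULES = [
    ("hard-coded string", re.compile(r'"[^"]*[A-Za-z]{2,}[^"]*"')),
    ("string concatenation", re.compile(r'"\s*\+|\+\s*"')),
]


def scan(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line number, rule name) for every rule that matches a line."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for name, pattern in SCAN_RULES:
            if pattern.search(line):
                findings.append((number, name))
    return findings


code = ['label = "Save"', 'msg = "Hello, " + name', "count = 3"]
print(scan(code))
# [(1, 'hard-coded string'), (2, 'hard-coded string'), (2, 'string concatenation')]
```

Pattern-based rules like these are quick to write but can produce false positives, which is one reason rule sets are usually customizable.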

Simship, short for simultaneous shipment, is the common practice of releasing a product simultaneously in multiple regions or markets in the software and gaming industries. Simship includes coordination of the release in different languages or for different locales simultaneously.

Social games are a category of online video games featuring social interaction, cooperation or competitive play, often in a multiplayer format.

As social games are designed to be played by a global audience, internationalization is essential to ensure a seamless gaming experience. This includes handling language, culturalization, and date/time formats, adhering to regional restrictions, and optimizing server bandwidth.

Software localization, often represented by the numeronym L10n, is the practice of adapting a software application, website, or product to meet the language, cultural, and functional requirements of a specific target market or locale.

Software localization makes necessary modifications to the software, content, and design to ensure that it resonates with the target audience in their native language and correct cultural context.

A sprint is a time-boxed set of planned tasks or backlog items within Agile software project management, development and testing.

The integration of sprints with localization workflow can have a significant impact on the success of i18n efforts.

Static analysis is a software development technique that analyzes the source code of a program without actually executing it.

Static analysis examines the code structure, syntax, and i18n elements to identify potential issues, such as coding errors, security vulnerabilities, performance bottlenecks, and compliance violations.
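
For a sense of how static analysis can surface i18n issues, the Python sketch below parses source code into a syntax tree, without running it, and reports string literals that are not wrapped in a translation function. The sample source and the helper name are assumptions; the wrapper name _ follows the common gettext convention.

```python
import ast

# Sample source to inspect; it is parsed, never executed.
SOURCE = '''
title = "Welcome"
print("Hello, " + user)
label = _("Save")
'''


def find_unwrapped_strings(source: str, wrapper: str = "_") -> list[tuple[int, str]]:
    tree = ast.parse(source)
    # First pass: remember string literals passed directly to the wrapper.
    wrapped = set()
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == wrapper):
            wrapped.update(id(arg) for arg in node.args
                           if isinstance(arg, ast.Constant))
    # Second pass: report every other string literal with its line number.
    findings = [
        (node.lineno, node.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant)
        and isinstance(node.value, str)
        and id(node) not in wrapped
    ]
    return sorted(findings)


print(find_unwrapped_strings(SOURCE))  # [(2, 'Welcome'), (3, 'Hello, ')]
```

A production analyzer would also need to ignore docstrings, log messages, and other strings that users never see.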

In computer programming, strings, also known as character strings, are sequences of letters, numbers, symbols, and whitespace characters providing user instruction or meaning within a software application.

Strings are crucial to the i18n process as they must be extracted, translated, and localized to adapt software or content for different languages and regions.

Subversion (SVN) is a software versioning and revision control system used to maintain current and historical versions of files such as source code, web pages, and documentation.

Though Subversion doesn’t inherently handle i18n tasks, it can play a crucial part in the i18n workflow.

UI strings, also known as user interface strings, are the text elements behind a software application’s user interface, including labels, menus, buttons, tooltips, error messages and dialogue boxes.

UI strings convey the meaning of elements of the software interface, guide user actions, and, in the context of the i18n process, must be carefully adapted to ensure the software is effectively localized.

Unicode is a character encoding standard providing a universal representation for every character used in human languages, symbol systems and punctuation marks. Unicode enables the consistent and unambiguous encoding of text across different platforms, applications, and languages.

Unicode plays a crucial role in internationalization and localization by ensuring the accurate and consistent representation of text in different languages and scripts.

UTF-16 (Unicode Transformation Format 16-bit) is a scheme within the Unicode standard for encoding and decoding text representing characters in 16-bit code units.

UTF-16 uses one or two 16-bit code units to represent a code point, which is the numerical value assigned to a character.

UTF-16 plays a significant role in internationalization (i18n) through multilingual support, string handling, input and output, and file and data exchange.
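
The one-or-two code unit behavior can be observed directly in Python. The helper name utf16_code_units is an assumption for this example.

```python
def utf16_code_units(ch: str) -> list[int]:
    """Return the 16-bit code units that UTF-16 uses for the given text."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


print([hex(u) for u in utf16_code_units("A")])           # ['0x41']
print([hex(u) for u in utf16_code_units("\U0001F600")])  # ['0xd83d', '0xde00']
```

The letter A fits in a single code unit, while the emoji at U+1F600 needs two, known as a surrogate pair. Code that assumes one code unit per character can therefore split such characters in half.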

UTF-8 (Unicode Transformation Format 8-bit) is a scheme within the Unicode standard that encodes each code point as a variable-length sequence of one to four 8-bit code units.

UTF-8 is important to the internationalization (i18n) process in relation to encoding compatibility and flexibility, efficient storage, multilingual support, web compatibility and interoperability.
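
The variable-length nature of UTF-8 is easy to check in Python. The helper name utf8_length and the sample characters are assumptions for this example.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to encode the given text."""
    return len(ch.encode("utf-8"))


for ch in ["A", "é", "€", "\U0001F600"]:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
# U+0041 -> 1 byte(s)
# U+00E9 -> 2 byte(s)
# U+20AC -> 3 byte(s)
# U+1F600 -> 4 byte(s)
```

ASCII characters keep their single-byte form, which is a large part of UTF-8's compatibility and storage efficiency for mostly Latin text.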

The Waterfall Model, also known as the Waterfall Methodology, is a linear and sequential approach to the software development process, like a waterfall flowing down step-by-step.

The Waterfall Model can pose some challenges for the i18n process as it can limit flexibility and collaboration, as well as delay the localization process.

XLIFF is an initialism for XML Localization Interchange File Format, a standard format used in the localization industry for data exchange.

XLIFF provides a standardized way to package and exchange translatable content, including text strings, formatting, metadata, and context information, making life easier for translators and localization teams.
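
To show roughly what this looks like in practice, the Python sketch below reads source and target strings from a small XLIFF 1.2 document using the standard library. The sample content, file name, unit IDs, and the read_translations helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A minimal, hand-written XLIFF 1.2 sample.
XLIFF_SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="messages.properties" source-language="en"
        target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="save_button">
        <source>Save</source>
        <target>Enregistrer</target>
      </trans-unit>
      <trans-unit id="cancel_button">
        <source>Cancel</source>
        <target>Annuler</target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def read_translations(xliff_text: str) -> dict[str, tuple[str, str]]:
    """Map each trans-unit id to its (source, target) pair."""
    root = ET.fromstring(xliff_text)
    pairs = {}
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.findtext("x:source", default="", namespaces=NS)
        target = unit.findtext("x:target", default="", namespaces=NS)
        pairs[unit.get("id")] = (source, target)
    return pairs


print(read_translations(XLIFF_SAMPLE))
```

Each trans-unit pairs a source string with its translation, and the id lets the translated text be matched back to the right place in the software.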