Internationalization and localization

In computing, internationalization and localization (also spelled internationalisation and localisation, see spelling differences) are means of adapting computer software to different languages and regional differences. Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without engineering changes. Localization is the process of adapting software for a specific region or language by adding locale-specific components and translating text.
Due to their length, the terms are frequently abbreviated to i18n (where 18 stands for the number of letters between the i and the n in internationalization, a usage coined at DEC in the 1970s or 80s[1]) and L10n respectively. The capital L on L10n helps to distinguish it from the lowercase i in i18n.
Some companies, such as Microsoft and IBM, use the term globalization for the combination of internationalization and localization.[1][2] Globalization is sometimes abbreviated to g11n[citation needed].
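The abbreviation rule behind i18n, L10n and g11n can be written out in a few lines. The following Python sketch is illustrative only; the function name numeronym is an assumption introduced here, not part of any standard library.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + number of inner letters + last letter."""
    if len(word) < 3:
        return word  # no letters between the first and the last one
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("internationalization"))  # i18n
print(numeronym("localization"))          # l10n (conventionally written L10n)
print(numeronym("globalization"))         # g11n
```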
Scope
Focal points of internationalization and localization efforts include:
- Language
  - Computer-encoded text
    - Alphabets/scripts; most recent systems use the Unicode standard to solve many of the character encoding problems.
    - Different systems of numerals
    - Writing direction, e.g. left to right in German, right to left in Persian, Hebrew and Arabic
    - Spelling variants for different countries where the same language is spoken, e.g. localization (en-US, en-CA, en-GB-oed) vs. localisation (en-GB, en-AU)
    - Text processing differences, such as the concept of capitalization, which exists in some scripts and not in others, different text sorting rules, etc.
  - Graphical representations of text (printed materials, online images containing text)
  - Spoken (audio)
  - Subtitling of film and video
- Culture
  - Images and colors: issues of comprehensibility and cultural appropriateness
  - Names and titles
  - Government-assigned numbers (such as the Social Security number in the US, National Insurance number in the UK, Isikukood in Estonia) and passports
  - Telephone numbers, addresses and international postal codes
  - Currency (symbols, positions of currency markers)
  - Weights and measures
  - Paper sizes
- Writing conventions
  - Date/time format, including use of different calendars
  - Time zones (UTC in internationalized environments)
  - Formatting of numbers (decimal points, positioning of separators, character used as separator)
- Any other aspect of the product or service that is subject to regulatory compliance
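As an illustration of the number formatting item above, the following Python sketch renders the same value with different grouping and decimal separators. The table NUMBER_CONVENTIONS and the function format_number are hypothetical names introduced for this example, and the separator choices are assumptions about common usage; a production system would normally take them from a locale database rather than hard-code them.

```python
# Hypothetical per-locale conventions: (grouping separator, decimal separator).
NUMBER_CONVENTIONS = {
    "en-US": (",", "."),
    "de-DE": (".", ","),
    "fr-FR": ("\u202f", ","),  # narrow no-break space as grouping separator
}


def format_number(value, locale_tag, decimals=2):
    """Format value with the separators assumed for locale_tag."""
    group, decimal = NUMBER_CONVENTIONS[locale_tag]
    text = f"{value:,.{decimals}f}"  # e.g. "1,234,567.89"
    # Swap separators through a placeholder so the two replacements do not collide.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", group)


print(format_number(1234567.891, "en-US"))  # 1,234,567.89
print(format_number(1234567.891, "de-DE"))  # 1.234.567,89
```

The sketch always groups digits in threes, so it does not cover locales that group differently.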
The distinction between internationalization and localization is subtle but important. Internationalization is the adaptation of products for potential use virtually everywhere, while localization is the addition of special features for use in a specific locale. Internationalization is done once per product, while localization is done once for each combination of product and locale. The processes are complementary and must be combined to produce a system that works globally. Subjects unique to localization include:
- Language translation
- National varieties of languages (see language localization)
- Special support for certain languages such as East Asian languages
- Local customs
- Local content
- Symbols
- Order of sorting
- Aesthetics
- Cultural values and social context
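Order of sorting is a good example of a locale-specific rule: the same words may be ordered differently depending on the language. The Python sketch below contrasts two simplified collation keys. The function names german_key and swedish_key are assumptions introduced here, and both are rough approximations (umlauts sorted like their base letters for German; å, ä, ö sorted after z for Swedish), not complete implementations of either language's collation rules.

```python
import unicodedata


def german_key(word):
    """Approximate German dictionary order: strip diacritics before comparing."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Simplified Swedish alphabet: å, ä and ö are separate letters after z.
SWEDISH_ORDER = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyzåäö")}


def swedish_key(word):
    """Approximate Swedish order; unknown characters sort last."""
    normalized = unicodedata.normalize("NFC", word.casefold())
    return [SWEDISH_ORDER.get(ch, len(SWEDISH_ORDER)) for ch in normalized]


words = ["zon", "äng", "ar", "öl", "ost"]
print(sorted(words, key=german_key))   # ['äng', 'ar', 'öl', 'ost', 'zon']
print(sorted(words, key=swedish_key))  # ['ar', 'ost', 'zon', 'äng', 'öl']
```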
Practice
The current prevailing practice is for applications to place text in resource strings which are loaded during program execution as needed. These strings, stored in resource files, are relatively easy to translate. Programs are often built to reference resource libraries depending on the selected locale data. One software library that aids this is gettext.
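A minimal sketch of this pattern using Python's standard gettext module is shown below. The domain name "myapp", the directory "locale" and the helper get_translator are assumptions made for the example; the code expects compiled catalogs at paths such as locale/de/LC_MESSAGES/myapp.mo and falls back to the untranslated strings when none is found.

```python
import gettext


def get_translator(language, domain="myapp", localedir="locale"):
    """Return a translations object for the requested language.

    With fallback=True, gettext returns a NullTranslations object, which
    echoes the original strings, when no compiled catalog is found.
    """
    return gettext.translation(
        domain, localedir=localedir, languages=[language], fallback=True
    )


translator = get_translator("de")
_ = translator.gettext  # conventional alias for the lookup function

# Source strings double as lookup keys in the catalog.
print(_("File not found"))

# Plural forms are selected by the catalog according to the count.
count = 3
print(translator.ngettext("%d file", "%d files", count) % count)
```

Keeping the user-visible strings in the source as lookup keys means the program still runs, in its original language, before any translation exists.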
Thus, to support multiple languages, an application is designed to select the relevant language resource file at runtime, and the resource files are translated into the required languages. This method tends to be application-specific and, at best, vendor-specific. The code that handles date entry verification and many other locale-sensitive data types must also support differing locale requirements. Modern development systems and operating systems include sophisticated libraries for international support of these types.
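Date entry verification illustrates why locale-sensitive data needs more than translated strings: the same input can denote different dates in different locales. The following Python sketch validates a typed date against a per-locale pattern. The table DATE_PATTERNS and the function parse_date_entry are hypothetical, and the patterns are assumptions about common conventions rather than authoritative locale data.

```python
from datetime import date, datetime

# Hypothetical input patterns per locale (strptime syntax).
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",
    "en-GB": "%d/%m/%Y",
    "de-DE": "%d.%m.%Y",
}


def parse_date_entry(text, locale_tag):
    """Return a date if text is valid for the locale's pattern, else None."""
    pattern = DATE_PATTERNS[locale_tag]
    try:
        return datetime.strptime(text.strip(), pattern).date()
    except ValueError:
        return None  # wrong separators, impossible day or month, etc.


print(parse_date_entry("03/04/2024", "en-US"))  # 2024-03-04
print(parse_date_entry("03/04/2024", "en-GB"))  # 2024-04-03
print(parse_date_entry("13/01/2024", "en-US"))  # None
```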
New methods continue to be developed to handle these complex issues. One such method, known as Natural Language Support Objects (NLSO), uses databases to store resource strings; NLSO is available in open source and commercial software. Another approach is to eliminate all references to culture, politics, history, etc.; to avoid images (especially those with embedded text); and to use a controlled language.
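The general idea of keeping resource strings in a database can be sketched with Python's built-in sqlite3 module. This is not a description of NLSO itself: the table layout, column names and the functions open_catalog and lookup are assumptions made purely for illustration.

```python
import sqlite3


def open_catalog(rows):
    """Create an in-memory string table from (msg_key, locale, msg_text) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_strings ("
        " msg_key TEXT NOT NULL,"
        " locale TEXT NOT NULL,"
        " msg_text TEXT NOT NULL,"
        " PRIMARY KEY (msg_key, locale))"
    )
    conn.executemany("INSERT INTO resource_strings VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def lookup(conn, msg_key, locale, default_locale="en"):
    """Return the string for the locale, then the default locale, then the key."""
    for candidate in (locale, default_locale):
        row = conn.execute(
            "SELECT msg_text FROM resource_strings"
            " WHERE msg_key = ? AND locale = ?",
            (msg_key, candidate),
        ).fetchone()
        if row is not None:
            return row[0]
    return msg_key


catalog = open_catalog([
    ("greeting", "en", "Welcome"),
    ("greeting", "et", "Tere tulemast"),
    ("save", "en", "Save"),
])
print(lookup(catalog, "greeting", "et"))  # Tere tulemast
print(lookup(catalog, "save", "et"))      # Save (falls back to English)
```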
Difficulties
While translating existing text to other languages may seem easy, it is more difficult to maintain the parallel versions of texts throughout the life of the product. For instance, if a message displayed to the user is modified, all of the translated versions must be changed. This in turn results in a somewhat longer development cycle.
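One way to keep parallel versions in step is to record, next to each translation, the source text it was made from, and to flag entries whose source has since changed. The Python sketch below assumes this hypothetical data layout; the function name find_translation_gaps is likewise introduced only for the example.

```python
def find_translation_gaps(source, translated):
    """Compare current source strings with one translated catalog.

    source:     {key: current source text}
    translated: {key: (source text when translated, translated text)}
    """
    missing = sorted(key for key in source if key not in translated)
    outdated = sorted(
        key for key, text in source.items()
        if key in translated and translated[key][0] != text
    )
    orphaned = sorted(key for key in translated if key not in source)
    return {"missing": missing, "outdated": outdated, "orphaned": orphaned}


source = {
    "save": "Save changes",        # reworded since it was translated
    "open": "Open",
    "export": "Export",            # added after the last translation pass
}
german = {
    "save": ("Save", "Speichern"),
    "open": ("Open", "Öffnen"),
    "print": ("Print", "Drucken"),  # no longer present in the source
}
print(find_translation_gaps(source, german))
# {'missing': ['export'], 'outdated': ['save'], 'orphaned': ['print']}
```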
Many localization issues (e.g. writing direction, text sorting) require more profound changes in the software than text translation. OpenOffice.org achieves this with compilation switches.
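Writing direction shows why such issues reach beyond translation: the software has to inspect the text itself to decide how to lay it out. The Python sketch below guesses the base direction of a string from its first strongly directional character, using the standard unicodedata module. It is a simplified heuristic, not a full implementation of the Unicode bidirectional algorithm, and the function name base_direction is an assumption.

```python
import unicodedata


def base_direction(text, default="ltr"):
    """Guess paragraph direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":           # strong left-to-right (Latin, Cyrillic, ...)
            return "ltr"
        if bidi_class in ("R", "AL"):   # strong right-to-left (Hebrew, Arabic, ...)
            return "rtl"
    return default  # only digits, spaces or punctuation were found


print(base_direction("Hello"))      # ltr
print(base_direction("שלום"))       # rtl
print(base_direction("123 مرحبا"))  # rtl (digits and spaces are not strong)
```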
To some degree (e.g. for quality assurance), the development team needs someone who understands foreign languages and cultures and has a technical background. In large societies with one dominant language or culture, it may be difficult to find such a person.
Globalized web site tests
The following tests are used by Ivan Gan[who?] to determine whether a web site is truly globalized:
1. Site uses Unicode?
2. Auto browser language detection?
3. User language selection?
4. Right to left language support?
5. Mixed script and writing direction support?
6. Is the language independent of geo-location?
7. Is the geo list multilingual? (e.g. is Israel shown in English and Hebrew, Beijing in Chinese and English?)
Test 5 is essential for social networks, blogs and other sites which may contain mixed content, though the browser is also partly responsible for this support.
Tests 6 and 7 are required for classified advertising sites, for example, where address lists cross international boundaries and may refer to cities normally written in a non-Latin character set.
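Automatic browser language detection is usually based on the Accept-Language request header, in which the browser lists language tags with optional quality weights. The Python sketch below picks the best supported language from such a header. The function names and the simple fallback from a regional tag to its base language are assumptions made for the example, and a site would still offer an explicit user language selection alongside it.

```python
def parse_accept_language(header):
    """Return language tags from the header, most preferred first."""
    preferences = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            preferences.append((quality, position, tag))
    # Highest weight first; ties keep the order used in the header.
    preferences.sort(key=lambda pref: (-pref[0], pref[1]))
    return [tag for _, _, tag in preferences]


def choose_language(header, supported, default="en"):
    """Pick the first preferred tag, or its base language, that the site supports."""
    by_lowercase = {tag.lower(): tag for tag in supported}
    for tag in parse_accept_language(header):
        if tag in by_lowercase:
            return by_lowercase[tag]
        base = tag.split("-")[0]
        if base in by_lowercase:
            return by_lowercase[base]
    return default


print(choose_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", ["en", "de"]))  # en
print(choose_language("he-IL, he;q=0.9", ["en", "he"]))                      # he
```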
Cost vs benefit tradeoff
In a commercial setting, the benefit of localization is access to more markets. Some argue that the commercial case for localizing products into multiple languages is obvious, and that all that is needed is a budgetary commitment from the producer to finance the considerable costs. It costs more to produce products for international markets, but in an increasingly global economy, supporting only one language or market is scarcely an option. Still, most proprietary software is available only in languages considered to be economically viable[citation needed].
Since open source software can generally be freely modified and redistributed, it lends itself more readily to internationalization. The KDE project, for example, has been translated into over 100 languages.[3]
See also
- Bidirectional script support
- CJK
- Globalization Management System
- Glocalization
- International Components for Unicode
- Input method editor
- Separation of concerns
- Region code
- Language localization
- Game localization
- Computer russification, localization into Russian language
- Language code
- Pseudolocalization, a software testing method for testing a software product's readiness for localization.
- Punycode, translating Unicode into the character sets for network host names
Notes
1. IBM Globalization web site
2. Microsoft Globalization Step-by-Step Guide
3. For the current list see KDE.org