John Carter

Reputation: 6985

Pitfalls when performing internationalization / localization with numbers?

When developing an application that will need to work with a variety of localizations, particularly with "right to left" text, is there a possibility of a case where numbers would need to be converted to "right to left" as well?

I'm no language scholar, but I know the RTL languages I am familiar with present their numbers in LTR.

For instance (using Google Translate):

I have 345 apples.

In Arabic:

لدي 345 التفاح.

So, I have two questions:

  1. Is it possible to run into a language that uses RTL numbers?
  2. How should internationalizing be handled in such cases?

or,

Is the "accepted norm" to just do numbers using Western Arabic characters, read from left to right?

Upvotes: 0

Views: 310

Answers (2)

Amir E. Aharoni

Reputation: 1318

In the major right-to-left scripts (Arabic, Hebrew and Thaana), numbers always run left to right. (By "Arabic" I mean all the languages written in the Arabic script: Arabic, Farsi, Urdu, Pashto and many others.)

Hebrew and Thaana always use European digits, the same 0-9 set as English. There's nothing much to do there, because Unicode automatically takes care of ordering the numbers correctly. But see the comments about isolation below.

It's possible to use European digits in Arabic, too; for example, the Arabic Wikipedia uses them. However, Arabic texts very frequently use a different set of digits, the Eastern Arabic numerals (https://en.wikipedia.org/wiki/Eastern_Arabic_numerals). Which set to use depends on your users' preferences. Note also that the digits used in Persian are slightly different. From the point of view of right-to-left layout, these digits behave pretty much the same way as European digits, although there are slight differences in the behavior of mathematical signs; for example, the minus sign can go on the other side. There are some subtleties here, but they are mostly edge cases.
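As a rough illustration of the two digit sets, here is a minimal sketch that maps European digits to their Eastern Arabic or Persian counterparts by code point. The names `convertDigits`, `ARABIC_INDIC_ZERO` and `PERSIAN_ZERO` are made up for this example; the code point ranges (U+0660 to U+0669 and U+06F0 to U+06F9) come from the Unicode standard. It only swaps digit characters and does nothing about separators or signs.

```javascript
// Code points of the digit zero in the two digit sets mentioned above.
const ARABIC_INDIC_ZERO = 0x0660; // Eastern Arabic digits, common in Arabic text
const PERSIAN_ZERO = 0x06f0;      // Extended Arabic-Indic digits, used in Persian

// Replace every European digit 0-9 in `text` with the matching digit
// from the set that starts at `zeroCodePoint`. Other characters are kept.
function convertDigits(text, zeroCodePoint) {
  return String(text).replace(/[0-9]/g, (d) =>
    String.fromCodePoint(zeroCodePoint + Number(d))
  );
}

console.log(convertDigits(345, ARABIC_INDIC_ZERO)); // ٣٤٥
console.log(convertDigits(345, PERSIAN_ZERO));      // ۳۴۵
```

The converted string is still stored most significant digit first, so the bidi algorithm lays it out left to right just like "345".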

In both Hebrew and Arabic you may run into a problem with bidi isolation. For example, if a Hebrew paragraph contains an English word followed by numbers, the numbers will appear to the right of the word, although you may have wanted them to appear on the left. That's how the Unicode bidi algorithm works by default. To resolve such cases you can use the Unicode control characters RLM (right-to-left mark, U+200F) and LRM (left-to-right mark, U+200E). If you are using HTML5, you can also use the <bdi> tag for this, as well as the CSS rule "unicode-bidi: isolate". These CSS and HTML5 solutions are quite powerful and elegant, but aren't supported in all browsers yet.
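For plain strings built in code, the two approaches can be sketched as follows. The helper name `isolate` and the sample message are hypothetical; the control characters themselves are defined by Unicode, and wrapping a fragment in FSI and PDI is roughly the plain-text counterpart of <bdi>. How the result is displayed still depends on the rendering environment.

```javascript
// Unicode bidi control characters.
const LRM = "\u200E"; // LEFT-TO-RIGHT MARK (for the mirrored case: RTL fragment in LTR text)
const RLM = "\u200F"; // RIGHT-TO-LEFT MARK
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Wrap an embedded fragment so that it is laid out independently
// of the text around it.
function isolate(fragment) {
  return FSI + fragment + PDI;
}

// Hypothetical message: Hebrew text, then an English word, then a number.
const hebrew = "יש לי";
const word = "apples";
const count = "345";

// Option 1: isolate the English word, so the number no longer attaches to it.
const withIsolate = hebrew + " " + isolate(word) + " " + count;

// Option 2: put an RLM after the English word, so the number follows
// the right-to-left flow of the paragraph.
const withMark = hebrew + " " + word + RLM + " " + count;
```

Isolation is usually the more robust choice when the embedded fragment comes from user data, because it works whatever direction the fragment turns out to have.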

I am aware of one script in which the digits run right to left: N'Ko, which is used for some languages of Africa. I have actually seen websites written in it, but it is far less common than Hebrew and Arabic.

Finally, if you're using JavaScript, you can use the free jquery.i18n library for automatic number conversion. See https://github.com/wikimedia/jquery.i18n . (Disclaimer: I am one of this library's developers.)
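Separately from jquery.i18n (this is not that library's API), the standard `Intl.NumberFormat` object can also pick a digit set through the `nu` locale extension. The sketch below assumes a runtime with full locale data; the locale tags are only examples.

```javascript
// "nu" is the Unicode locale extension key for the numbering system.
const arabicIndic = new Intl.NumberFormat("ar-u-nu-arab");  // Eastern Arabic digits
const persian = new Intl.NumberFormat("fa-u-nu-arabext");   // Persian digits
const european = new Intl.NumberFormat("ar-u-nu-latn");     // European digits, Arabic locale

console.log(arabicIndic.format(345));
console.log(persian.format(345));
console.log(european.format(345));

// Reports which numbering system the runtime actually selected.
console.log(arabicIndic.resolvedOptions().numberingSystem);
```

Checking `resolvedOptions()` is useful because a runtime without the requested locale data may silently fall back to another numbering system.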

Upvotes: 1

Davhed

Reputation: 76

Numbers will generally translate as you have them. Even in languages that read in a different direction, users typically recognize Western Arabic numerals.

Upvotes: 1
