Reply to post: Re: But, but, but ...

The Year Of Linux On The Desktop – at last! Windows Subsystem for Linux 2 brings the Linux kernel into Windows

Kristian Walsh Silver badge

Re: But, but, but ...

No. Encoding support isn't script support. Linux applications still have problems with contextual (Arabic), bi-directional (Arabic, Hebrew) or combining scripts (Thai, or the Brahmic scripts used to write Indian languages), despite the existence of very good libraries for supporting these things. (As with everything else UI-related on Linux, the lack of an agreed standard toolkit means there's a patchwork: some applications offer very good support and some are very, very bad.)
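To make the encoding-versus-script distinction concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `bidi_classes` is made up for illustration. It shows that a string can be perfectly well encoded and still carry per-character direction properties that the application (or its toolkit) has to act on at layout time.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each code point in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# One Latin letter, one Hebrew letter (SHIN), one Arabic letter (SEEN).
sample = "a\u05E9\u0633"

# The string round-trips through UTF-8 without trouble: that's encoding support.
assert sample.encode("utf-8").decode("utf-8") == sample

# Script support is the part that has to act on these classes when drawing.
for ch, cls in zip(sample, bidi_classes(sample)):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {cls}")
```

"L" is left-to-right, "R" is right-to-left and "AL" is right-to-left Arabic. Getting from those classes to an on-screen order is the job of the Unicode bidirectional algorithm, which this sketch doesn't attempt.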

Even with library support, developers can find themselves facing an uphill struggle, because these scripts break a lot of Latin-centric "rules" that they may have taken for granted (particularly "one code-point is one glyph" and "glyphs are laid out in the same direction"). Brahmic scripts in particular break all of these rules. Here's an example for the Devanagari script, used to write Hindi among other languages: https://en.wikipedia.org/wiki/Devanagari#Compounds ... note how the following "vowel" moves around the preceding consonant.*
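A rough illustration of the "one code-point is one glyph" problem, again in Python with the standard `unicodedata` module (the `describe` helper is just a name picked for this sketch). It lists the stored code points for two Devanagari sequences; how they get drawn is up to the font and shaping engine, which the script doesn't touch.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, general category) for each code point."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
            for ch in text]

# "ki": consonant KA, then (in storage order) VOWEL SIGN I.
# On screen the vowel sign is normally drawn to the LEFT of the consonant.
ki = "\u0915\u093F"

# "ksha": KA + VIRAMA + SSA, three code points that a Devanagari font
# normally draws as a single conjunct shape.
ksha = "\u0915\u094D\u0937"

for label, text in (("ki", ki), ("ksha", ksha)):
    print(label, "has", len(text), "code points")
    for row in describe(text):
        print("   ", *row)
```

Anything that counts `len()` to work out cursor positions or column widths will get both of these wrong, and that's before direction comes into it.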

(Despite their "foreign-ness" to Westerners, Chinese/Japanese characters are actually the easiest of the non-European writing systems to support, as they behave pretty much the same as Latin letters.)

__

* In practice, this re-arrangement is somewhat simplified by the font: the font contains a definition of a finite-state automaton which selects a pre-composed "cluster" glyph for a given sequence of input codepoints. The Unicode standard defines the rules for these combinations, but they're usually implemented in the font to allow greater stylistic variation (similarly to how some fonts create ligatures for "c t" when used with English text).
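A toy version of that substitution idea, in Python. To be clear, this is not how any real font or shaping engine is implemented: it swaps the state machine for a greedy longest-match lookup in a dictionary, and the table `SUBSTITUTIONS`, the glyph names and the function `shape` are all invented for the sketch.

```python
# Hypothetical table: sequence of input code points -> name of a pre-composed glyph.
SUBSTITUTIONS = {
    ("\u0915", "\u094D", "\u0937"): "ka_ssa_conjunct",  # KA + VIRAMA + SSA
    ("c", "t"): "c_t_ligature",
}

def shape(text):
    """Map code points to glyph names, preferring the longest matching sequence."""
    longest = max(len(key) for key in SUBSTITUTIONS)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate sequence first, down to length 2.
        for n in range(min(longest, len(text) - i), 1, -1):
            key = tuple(text[i:i + n])
            if key in SUBSTITUTIONS:
                glyphs.append(SUBSTITUTIONS[key])
                i += n
                break
        else:
            # No sequence matched here: fall back to one glyph per code point.
            glyphs.append(text[i])
            i += 1
    return glyphs

print(shape("act"))                 # 'c' and 't' collapse into one glyph
print(shape("\u0915\u094D\u0937"))  # three code points, one cluster glyph
```

The point is only that the glyph count comes out of the font's table and not out of the code-point count, so two fonts can legitimately give different answers for the same text.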
