Lack of font support for Kashmiri characters makes text deviate from true semantics #249

Open
r12a opened this issue Mar 18, 2022 · 3 comments
Labels
doc:arab_ks; gap (The first comment in this issue is read by the gap-analysis document.); i:encoding (Characters & encoding); i:fonts (Fonts & font styles); l:ks (Kashmiri); p:basic (The gap-analysis priority is Basic.); s:aran (Arabic nastaliq script style); x:webkit

Comments

@r12a
Contributor

r12a commented Mar 18, 2022

This issue applies to Kashmiri written with the Perso-Arabic script.

Kashmiri is written using the nastaliq style of Arabic writing. Although the Kashmiri orthography has some resemblance to that used for Urdu, to represent Kashmiri sounds it uses a number of unique characters or combinations.

The GAP

There are almost no fonts that properly support Kashmiri written in that orthography. (Noto Nastaliq Urdu was only updated in Feb 2022 to support Kashmiri.)

As a result, people resort to inappropriate characters to make the content look more like what they expect, and even then gaps remain. For example, to make the sukun look like an inverted v rather than a circle, users often use U+065B ARABIC VOWEL SIGN INVERTED SMALL V ABOVE, which is intended as an African vowel diacritic. There are several such problems in Kashmiri. Lists can be found here and here.
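As a rough illustration of how such usage could be surfaced in existing text, here is a minimal Python sketch that reports where U+065B occurs in a string. The function names `find_inverted_v` and `describe` are hypothetical helpers made up for this example, and a hit only marks a candidate for review: the script cannot tell whether the character was typed as a stand-in for the sukun.

```python
import unicodedata
from typing import List

SUKUN = "\u0652"             # ARABIC SUKUN
INVERTED_SMALL_V = "\u065B"  # ARABIC VOWEL SIGN INVERTED SMALL V ABOVE


def find_inverted_v(text: str) -> List[int]:
    """Return the string offsets at which U+065B occurs (hypothetical helper)."""
    return [i for i, ch in enumerate(text) if ch == INVERTED_SMALL_V]


def describe(ch: str) -> str:
    """Format a single character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    sample = "\u067E\u065B\u0628"  # PEH + U+065B + BEH, an artificial test string
    for offset in find_inverted_v(sample):
        print(offset, describe(sample[offset]))
```

Each reported offset still needs a human decision about whether the sukun was intended.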

Keyboards and input methods also need to be configured to insert the correct characters, but this doesn't help while there are so few fonts available that can display the characters.

This issue is not likely to be fixed by specifications or browser fixes, but does cause a significant constraint for Kashmiris wishing to use the Web.

There is an additional issue, however, related to pre-installed fonts on macOS (see below).

Priority

Clarifying and standardising the correct usage of characters to represent Kashmiri is a fundamental requirement for interoperable and understandable text, so this issue is given a priority of Basic.

Tests & results

Interactive test: a given font will correctly render the characters needed for Kashmiri in the Perso-Arabic script.

When the text in the test is displayed, the glyph shapes should resemble those in the image just below. In particular: farsi yeh with small v above should join to the left; the 4 forms of kashmiri yeh should appear; hamzas should use the round form; the sukun over PA should be an inverted v.

Screenshot 2022-03-22 at 12 43 33

As of March 2022, the latest version of Noto Nastaliq Urdu supports the needed glyphs if the language is set to 'ks', and displays correctly on Windows 10. However, on macOS 12.2.1 the pre-installed version of the font cannot be overwritten and is used to display Kashmiri text in browsers, so there is no support on macOS at the time of writing.

SIL's Awami Nastaliq font correctly renders all but one feature: the hamza is s-shaped, as used for Urdu, rather than rounded. However, this is a Graphite font, and so currently works only in Gecko browsers.

The Gulmarg Nastaleeq font supports some features on Windows, but appears not to have glyphs for KASHMIRI YEH or for LETTER WAW WITH RING. It also doesn't work on macOS, presumably for the same reason as the Noto font.
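For anyone wanting to check another font for the two letters named above, the code points can be resolved from the full Unicode character names with Python's standard `unicodedata` module. This is only a sketch: `code_points` is a hypothetical helper, and comparing the results against a font's character map is left to whatever font tool is at hand.

```python
import unicodedata
from typing import Dict, List

# Full Unicode names of the two letters reported as missing from the font.
NAMES = [
    "ARABIC LETTER KASHMIRI YEH",
    "ARABIC LETTER WAW WITH RING",
]


def code_points(names: List[str]) -> Dict[str, int]:
    """Map each Unicode character name to its code point (hypothetical helper)."""
    return {name: ord(unicodedata.lookup(name)) for name in names}


if __name__ == "__main__":
    for name, cp in code_points(NAMES).items():
        print(f"U+{cp:04X} {name}")
```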

Action taken

Webkit

Outcomes

Versions 3.002 and above of Noto Nastaliq Urdu now support all characters needed for Kashmiri, and will also automatically provide the correct shape for things such as the sukun diacritic if the language of the text is set to Kashmiri.

A Unicode submission was approved by the Unicode Technical Committee that says that a word-final half-yeh should not be written using U+06CD ARABIC LETTER YEH WITH TAIL.
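A small Python sketch of how existing text could be screened for U+06CD in word-final position follows. The helper name `word_final_yeh_with_tail` is hypothetical, and the definition of "word-final" used here is an assumption made for the example: the letter, plus any combining marks attached to it, is followed by the end of the string or by a character that is not a letter. The sketch only reports positions and does not suggest a replacement character.

```python
import unicodedata
from typing import List

YEH_WITH_TAIL = "\u06CD"  # ARABIC LETTER YEH WITH TAIL


def word_final_yeh_with_tail(text: str) -> List[int]:
    """Return offsets of U+06CD that are not followed by another letter."""
    hits = []
    n = len(text)
    for i, ch in enumerate(text):
        if ch != YEH_WITH_TAIL:
            continue
        j = i + 1
        # Skip combining marks (general category M*) attached to this letter.
        while j < n and unicodedata.category(text[j]).startswith("M"):
            j += 1
        # Word-final (by this sketch's assumption) if nothing or a non-letter follows.
        if j == n or not unicodedata.category(text[j]).startswith("L"):
            hits.append(i)
    return hits


if __name__ == "__main__":
    sample = "\u0628\u06CD \u06CD\u0628"  # artificial string: final, then initial
    print(word_final_yeh_with_tail(sample))
```

Format characters such as ZERO WIDTH NON-JOINER are not letters, so this sketch treats a U+06CD before one as word-final; whether that is the right call depends on the text being checked.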

@r12a r12a added gap The first comment in this issue is read by the gap-analysis document. doc:arfa labels Mar 18, 2022
@r12a
Contributor Author

r12a commented Mar 18, 2022

The first comment in this issue contains text that will automatically appear in one or more gap-analysis documents as a subsection with the same title as this issue. Any edits made to that comment will be immediately available in the document. Proposals for changes or discussion of the content can be made in comments below this point.

Relevant gap analysis documents include:
Kashmiri

@r12a r12a added doc:arab_ks i:fonts Fonts & font styles p:basic The gap-analysis priority is Basic. and removed doc:arfa labels Mar 18, 2022
@xfq
Member

xfq commented Mar 23, 2022

The link to the relevant gap analysis document is broken. I guess it's because we haven't published it yet?

(Same for #250.)

@r12a r12a added the x:webkit label Mar 23, 2022
@r12a
Contributor Author

r12a commented Nov 6, 2023

Links fixed. Format updated. Added link to Unicode submission.

@r12a r12a added l:ks Kashmiri i:encoding Characters & encoding labels Jun 5, 2024
@r12a r12a moved this to Browser bug raised in Gap-analysis pipeline Jun 20, 2024
@r12a r12a added the s:aran Arabic nastaliq script style label Jul 5, 2024