Language detection is very strict in beta #143

Open
driehuis opened this issue Oct 14, 2021 · 3 comments

Comments

@driehuis

When looking at Dinish-Regular v2.009 in both the production and beta versions of Wakamai Fondue, I noticed that in the beta a bunch of languages disappeared from the list:

Release: 27 languages: Afrikaans, Albanian, Basque, Catalan, Danish, Dutch, English, Estonian, Faroese, Filipino, Finnish, French, Galician, German, Icelandic, Indonesian, Irish, Italian, Malay, Norwegian Bokmål, Polish, Portuguese, Spanish, Swahili, Swedish, Turkish, Zulu

Beta: Estonian, Irish, Italian, Norwegian Bokmål, Swahili and Zulu

The release in question can be downloaded from https://github.com/playbeing/dinish/releases/download/v2.009/dinish-otf.zip to reproduce the issue.

I have since checked DINish against the list from https://r12a.github.io/app-charuse/ and discovered that the missing glyphs were hyphentwo (U+2010), minute (U+2032) and second (U+2033), and I don't believe that these are sorely missed when rendering Dutch. After I added these glyphs (and a ton of others), Dutch was shown as supported in both production and beta Wakamai Fondue.
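To illustrate why three punctuation characters can knock a language off the list: a strict check is essentially a subset test. Below is a minimal sketch in Python, assuming a made-up required set for Dutch (`DUTCH_REQUIRED` is hypothetical, not the data the beta actually uses) and a font represented as a plain set of code points.

```python
# Hypothetical data: a tiny stand-in for a language's required characters.
# The real list used by the beta is not shown in this thread.
PUNCTUATION = {0x2010, 0x2032, 0x2033}
DUTCH_REQUIRED = set(map(ord, "abcdefghijklmnopqrstuvwxyz")) | PUNCTUATION


def strictly_supported(font_codepoints: set, required: set) -> bool:
    """Strict check: every required code point must be present in the font."""
    return required <= font_codepoints


# A font that has all the letters but lacks U+2010, U+2032 and U+2033.
font_before = DUTCH_REQUIRED - PUNCTUATION
# The same font after the three glyphs were added.
font_after = font_before | PUNCTUATION

print(strictly_supported(font_before, DUTCH_REQUIRED))  # False
print(strictly_supported(font_after, DUTCH_REQUIRED))   # True
```

With an all-or-nothing test like this, one missing code point has the same effect as a missing alphabet.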

It doesn't feel right to be that strict. Even English had dropped off the list (the mother tongue of the people that created the US-ASCII character set). I'm not debating the correctness of the result, and I realize just what kind of rabbit hole you go down if you want to qualify the results ("Dutch would be fully supported if you add hyphentwo, minute and second glyphs"), but it may reduce the Beta's appeal for casual font users.

It would be cool if Wakamai Fondue could show which characters are unsupported for any given language, but that sounds like a lot of work.
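A rough idea of what such a report could look like, sketched in Python. The required sets here are tiny, hand-made examples (an assumption for illustration, not real language data), and `missing_characters` is a hypothetical helper, not something Wakamai Fondue provides.

```python
import unicodedata

# Hypothetical required sets, for illustration only.
REQUIRED = {
    "Dutch": set(map(ord, "abcij")) | {0x2010, 0x2032, 0x2033},
    "Turkish": set(map(ord, "abc")) | {0x00E7, 0x011F, 0x0131, 0x015F},
}


def missing_characters(font_codepoints, required):
    """Return the required code points the font lacks, as readable labels."""
    missing = sorted(required - font_codepoints)
    return [f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}" for cp in missing]


# A font with the letters above but without U+2010, U+2032 and U+2033.
font = set(map(ord, "abcij")) | {0x00E7, 0x011F, 0x0131, 0x015F}

for language, required in REQUIRED.items():
    gaps = missing_characters(font, required)
    print(language, "->", gaps or "fully covered")
```

The set difference is the cheap part; the work is in maintaining trustworthy per-language character lists.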

Maybe just a blurb about how the detection works? "Language coverage is determined by checking against [definition X]. Sometimes, minor issues such as missing rarely used punctuation characters can cause a language to not show up as supported".
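One possible middle ground, sketched in Python: separate missing letters from missing punctuation using Unicode general categories, so a language can be reported as "letters only" instead of vanishing. The three status labels and the `classify_support` helper are assumptions made for this example, and which punctuation should count as essential would still have to be decided per language.

```python
import unicodedata


def classify_support(font_codepoints, required):
    """Split missing code points into essential ones and everything else.

    Returns (status, missing_essential, missing_other), where status is
    "full", "letters only" or "unsupported".
    """
    missing = required - font_codepoints
    # General categories starting with "L" are letters and "M" are
    # combining marks; this sketch treats both as essential.
    essential = {cp for cp in missing if unicodedata.category(chr(cp))[0] in "LM"}
    other = missing - essential
    if essential:
        status = "unsupported"
    elif other:
        status = "letters only"
    else:
        status = "full"
    return status, sorted(essential), sorted(other)


letters = set(map(ord, "abcdefghijklmnopqrstuvwxyz"))
required = letters | {0x2010, 0x2032, 0x2033}  # hypothetical Dutch requirement

print(classify_support(letters, required))
```

Here the font with only the letters comes back as "letters only", with U+2010, U+2032 and U+2033 listed as the non-essential gaps.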

@RoelN
Contributor

RoelN commented Aug 26, 2022

I tried to make the language detection better for the beta, but I might have made it worse >_<

I've come to realise you can never reliably say "this font supports that language". What about loanwords, foreign names, historical characters, etc.?

So instead of trying to determine for you which language support a font has, I want to give an indication about the support for a script. Then you can determine for yourself if that's adequate for your purposes.

I'd like to use something like https://github.com/bramstein/detect-writing-script for this.
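A rough sketch of what a per-script indication could look like. This does not use detect-writing-script; it is plain Python that guesses a code point's script from the first word of its Unicode name, which is an approximation chosen only because the standard library does not expose the Unicode Script property. The helper names are hypothetical.

```python
import unicodedata
from collections import Counter

KNOWN_SCRIPTS = {"LATIN", "GREEK", "CYRILLIC"}


def rough_script(cp):
    """Approximate the script from the first word of the Unicode name."""
    name = unicodedata.name(chr(cp), "")
    first = name.split(" ")[0]
    return first.capitalize() if first in KNOWN_SCRIPTS else None


def script_counts(font_codepoints):
    """Count how many of the font's code points fall in each known script."""
    scripts = (rough_script(cp) for cp in font_codepoints)
    return Counter(s for s in scripts if s is not None)


# Eight Latin letters, three Greek, one Cyrillic, plus two punctuation marks.
font = set(map(ord, "abcABC\u00e9\u00f1\u03b1\u03b2\u03b3\u0434-\u2032"))
print(script_counts(font))
```

Raw counts per script say nothing about adequacy on their own, but they leave that judgement to the user.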

@driehuis
Author

I think language support should remain separate from script support. There is value in knowing if the Turkish s-cedilla or the Romanian comma accents are supported. And of course, one can't properly render French without guillemets, so punctuation can't be left out of scope completely.
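To make the cedilla versus comma-below point concrete: these are separate code points, so a font can have one set without the other, and a script-level view would report both as Latin. A small Python sketch, where the two marker sets are hand-picked for illustration and are not complete alphabets:

```python
import unicodedata

# Hand-picked distinguishing characters, not complete alphabets.
TURKISH_MARKERS = {0x015E, 0x015F}                   # S with cedilla
ROMANIAN_MARKERS = {0x0218, 0x0219, 0x021A, 0x021B}  # S and T with comma below

for cp in sorted(TURKISH_MARKERS | ROMANIAN_MARKERS):
    print(f"U+{cp:04X}", unicodedata.name(chr(cp)))

# A font that only has the cedilla forms of S and T.
font = {0x015E, 0x015F, 0x0162, 0x0163}

print(TURKISH_MARKERS <= font)   # True
print(ROMANIAN_MARKERS <= font)  # False
```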

You're right about loan words etc. I'd advocate to exclude them from consideration. If you want a Vietnamese name to be rendered correctly in a Dutch text, just pick a font that supports both languages :-)

There's no easy solution I'm afraid.

@RoelN
Contributor

RoelN commented Aug 26, 2022

And there's even more complexity in this related issue! #29
