Hebrew and digits don't render ("Failed to get emoji face") #49

mattansb · 2024-10-28T06:51:51Z

It seems that AGG devices are unable to render Hebrew text with a numeral. When this is attempted, the following message appears in the console:

Failed to get emoji face: <seemingly random, non-reproducible number/string>

ragg version: 1.3.3
R version: 4.4.1
OS: Win 11

Here is some reproducible code (I'm using {ggplot2} here, but this is not specific to ggplot2, see tidyverse/ggplot2#6013 and rstudio/rstudio#15370):

library(ggplot2)

df <- data.frame(
  x = 1:3,
  y = 0,
  label = c("זה טקסט", 
            "this is text with number 1",
            "זה טקסט עם מספר 2")
)

p <- ggplot(df, aes(x, y)) + 
  geom_label(aes(label = label))

ggsave("plot.png", device = png) # works

ggsave("plot_agg.png", device = ragg::agg_png)
#> Saving 7 x 7 in image
#> Failed to get emoji face:  , 1763456416
#> Failed to get emoji face:  ��ֶ�, 1885696607
#> Failed to get emoji face:  ��ֶ�, -474210312
#> Failed to get emoji face:  ��ֶ�, -474210312
#> Failed to get emoji face:  [

These other devises produce the same pattern:

ggsave("plot.tiff", device = tiff)
ggsave("plot_agg.tiff", device = ragg::agg_tiff)

ggsave("plot.jpeg", device = jpeg)
ggsave("plot_agg.jpeg", device = ragg::agg_jpeg)

The text was updated successfully, but these errors were encountered:

mattansb · 2024-10-29T12:18:37Z

It seems this behavior is avoided if the string begins with Latin text + Latin text separates the non-Latin text and the number:

library(ggplot2)

df <- tibble::tibble(
  label = c(
    "هذا مكتوب باللغة العربية 1",
    "זה כתוב בעברית 2",
    "هي سنڌيءَ ۾ لکيو ويو آهي 3",
    "العربية English separates 4",
    "English before نڌيءَ 5",
    "English before עברית English separates 6"
  ),
  y = rev(seq_along(label))
)

ggplot(df) + 
  geom_label(
    aes(0, y, label = label)
  )

oriobr · 2024-12-08T09:13:08Z

If I understand correctly, this is a package https://github.com/r-lib/textshaping/tree/main related problem.
Because the warning appears there in the file https://github.com/r-lib/textshaping/blob/main/src/string_shape.cpp

thomasp85 · 2024-12-09T09:44:02Z

Thanks for digging into this. It is most likely a textshaping issue, though it baffles me somewhat

thomasp85 · 2024-12-09T09:45:02Z

With the just released version of textshaping I get better results (no warning and everything gets rendered but the digit is misplaced)

thomasp85 · 2024-12-09T10:46:56Z

This is due to some wrong expectations in the output of fribidi--I assumed that weak characters such as digits would take on whatever direction they were close to but that is not the case. Instead the shaping needs to take into account the overall direction of the text as deduced by fribidi.

This will take some effort to correct but I'm on it

oriobr · 2024-12-09T13:52:53Z

First, thank you for the great work you are doing.
I downloaded the new version but the results remained the same.
The warnings appear and only the last text is produced.

library(ggplot2)

df <- tibble::tibble(
  label = c(
    "هذا مكتوب باللغة العربية 1",
    "זה כתוב בעברית 2",
    "هي سنڌيءَ ۾ لکيو ويو آهي 3",
    "العربية English separates 4",
    "English before نڌيءَ 5",
    "English before עברית English separates 6"
  ),
  y = rev(seq_along(label))
)

ggplot(df) + 
  geom_label(
    aes(0, y, label = label)
  )

sessionInfo()
#> R version 4.4.1 (2024-06-14 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=Hebrew_Israel.utf8  LC_CTYPE=Hebrew_Israel.utf8   
#> [3] LC_MONETARY=Hebrew_Israel.utf8 LC_NUMERIC=C                  
#> [5] LC_TIME=Hebrew_Israel.utf8    
#> 
#> time zone: Asia/Jerusalem
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     fastmap_1.2.0     xfun_0.49         glue_1.8.0       
#>  [5] knitr_1.49        htmltools_0.5.8.1 rmarkdown_2.29    lifecycle_1.0.4  
#>  [9] cli_3.6.3.9001    reprex_2.1.1      withr_3.0.2       compiler_4.4.1   
#> [13] rstudioapi_0.17.1 tools_4.4.1       evaluate_1.0.1    yaml_2.3.10      
#> [17] rlang_1.1.4.9000  fs_1.6.5

^{Created on 2024-12-09 with reprex v2.1.1}

thomasp85 · 2024-12-09T15:26:21Z

Strange. However, I'm embarking on a larger internal change to the shaping engine that should hopefully resolve this completely

mattansb · 2024-12-11T10:54:51Z

Just to confirm that I get the same results as @oriobr with the new version of textshaping (0.4.1).

Thanks for looking into to the @thomasp85 !

thomasp85 · 2025-01-07T12:57:20Z

With the latest changes in main I now get:

When running the example from @oriobr.

Some notes on how it looks:
You will notice that the arrangement of glyphs in label 4 and 5 differs from how it is presented in the browser. I'm frankly surprised by the difference since textshaping is using the unicode bidi algorithm (as implemented by fribidi) and I would assume all others were doing that as well. The reason for the difference in 4) is that textshaping/fribidi detects the string as rtl (the bidi algorithm specifies that the first encountered strong character determines the global direction). 5) is a bit trickier as I think fribidi (and by extension textshaping) is doing the wrong thing. Weak types such as european numbers should take on the direction of the first strong preceding character (here arabic). I'm in contact with the fribidi developer about this

mattansb · 2025-01-07T13:14:50Z

Wow, thanks @thomasp85 for taking the time to work on this! I've actually learned so much just from trying to pin-point the bug and watching your iterations fixing the problem.
Thanks again!

thomasp85 · 2025-01-13T15:43:34Z

With the latest change (#60) 5) is now also correct

Moreover, I've now integrated unicodes test suite for bidi implementations (91707 test cases) and textshaping passes them all

mattansb mentioned this issue Oct 28, 2024

Hebrew and digits don't render (Failed to get emoji face) rstudio/rstudio#15370

Closed

4 tasks

thomasp85 transferred this issue from r-lib/ragg Dec 9, 2024

thomasp85 mentioned this issue Jan 7, 2025

Yet-another-shaping-rewirte #53

Merged

thomasp85 closed this as completed in #53 Jan 7, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hebrew and digits don't render ("Failed to get emoji face") #49

Hebrew and digits don't render ("Failed to get emoji face") #49

mattansb commented Oct 28, 2024

mattansb commented Oct 29, 2024

oriobr commented Dec 8, 2024

thomasp85 commented Dec 9, 2024

thomasp85 commented Dec 9, 2024 •

edited

Loading

thomasp85 commented Dec 9, 2024

oriobr commented Dec 9, 2024

thomasp85 commented Dec 9, 2024

mattansb commented Dec 11, 2024

thomasp85 commented Jan 7, 2025

mattansb commented Jan 7, 2025

thomasp85 commented Jan 13, 2025

Hebrew and digits don't render ("Failed to get emoji face") #49

Hebrew and digits don't render ("Failed to get emoji face") #49

Comments

mattansb commented Oct 28, 2024

mattansb commented Oct 29, 2024

oriobr commented Dec 8, 2024

thomasp85 commented Dec 9, 2024

thomasp85 commented Dec 9, 2024 • edited Loading

thomasp85 commented Dec 9, 2024

oriobr commented Dec 9, 2024

thomasp85 commented Dec 9, 2024

mattansb commented Dec 11, 2024

thomasp85 commented Jan 7, 2025

mattansb commented Jan 7, 2025

thomasp85 commented Jan 13, 2025

thomasp85 commented Dec 9, 2024 •

edited

Loading