Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalizing verbose messages in bmerge.R #6554

Open
rikivillalba opened this issue Sep 30, 2024 · 4 comments
Open

Normalizing verbose messages in bmerge.R #6554

rikivillalba opened this issue Sep 30, 2024 · 4 comments
Labels
translation issues/PRs related to message translation projects

Comments

@rikivillalba
Copy link
Contributor

rikivillalba commented Sep 30, 2024

Hello.
These are a some observations in verbose messages in bmerge.R I wish to share.

  • Some fragmented messages i.e. a " (which contains no fractions)" could be unified.
  • The class/data types are parameterized arguments in some cases or have been harcoded in others, leading to inconsistency if someone uses to translate the data types.
  • Hope if we parameterize some message the number of translatable strings may be reduced in two or three lines!

Here is the code with example proposals:

@@ -56,7 +56,7 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos
         next
       } else {
         if (xclass=="character") {
-          if (verbose) catf("Coercing factor column %s to type character to match type of %s.\n", iname, xname)
+          if (verbose) catf("Coercing %s column %s to type %s to match type of %s.\n", "factor", iname, "character", xname)
           set(i, j=ic, value=val<-as.character(i[[ic]]))
           set(callersi, j=ic, value=val)  # factor in i joining to character in x will return character and not keep x's factor; e.g. for antaresRead #3581
           next
@@ -93,26 +93,30 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos
       nm = c(iname, xname)
       if (xclass=="integer64") { w=i; wc=ic; wclass=iclass; } else { w=x; wc=xc; wclass=xclass; nm=rev(nm) }  # w is which to coerce
       if (wclass=="integer" || (wclass=="double" && !isReallyReal(w[[wc]]))) {
-        if (verbose) catf("Coercing %s column %s%s to type integer64 to match type of %s.\n", wclass, nm[1L], if (wclass=="double") " (which contains no fractions)" else "", nm[2L])
+        if (verbose)
+          if (wclass=="double")
+            catf("Coercing %s column %s (which contains no fractions) to type %s to match type of %s.\n", wclass, nm[1L], "integer64", nm[2L])
+          else
+            catf("Coercing %s column %s to type %s to match type of %s.\n", wclass, nm[1L], "integer64", nm[2L])
         set(w, j=wc, value=bit64::as.integer64(w[[wc]]))
-      } else stopf("Incompatible join types: %s is type integer64 but %s is type double and contains fractions", nm[2L], nm[1L])
+      } else stopf("Incompatible join types: %s is type %s but %s is type %s and contains fractions", nm[2L], "integer64", nm[1L], "double")
     } else {
       # just integer and double left
       if (iclass=="double") {
         if (!isReallyReal(i[[ic]])) {
           # common case of ad hoc user-typed integers missing L postfix joining to correct integer keys
           # we've always coerced to int and returned int, for convenience.
-          if (verbose) catf("Coercing double column %s (which contains no fractions) to type integer to match type of %s.\n", iname, xname)
+          if (verbose) catf("Coercing %s column %s (which contains no fractions) to type %s to match type of %s.\n", iclass, iname, "integer", xname)
           val = as.integer(i[[ic]])
           if (!is.null(attributes(i[[ic]]))) attributes(val) = attributes(i[[ic]])  # to retain Date for example; 3679
           set(i, j=ic, value=val)
           set(callersi, j=ic, value=val)       # change the shallow copy of i up in [.data.table to reflect in the result, too.
         } else {
-          if (verbose) catf("Coercing integer column %s to type double to match type of %s which contains fractions.\n", xname, iname)
+          if (verbose) catf("Coercing %s column %s to type %s to match type of %s which contains fractions.\n", xclass, xname, "double", iname)
           set(x, j=xc, value=as.double(x[[xc]]))
         }
       } else {
-        if (verbose) catf("Coercing integer column %s to type double for join to match type of %s.\n", iname, xname)
+        if (verbose) catf("Coercing %s column %s to type %s for join to match type of %s.\n", iclass, iname, "double", xname)
         set(i, j=ic, value=as.double(i[[ic]]))
       }
     }
@rikivillalba rikivillalba added the translation issues/PRs related to message translation projects label Sep 30, 2024
@aitap
Copy link
Contributor

aitap commented Oct 1, 2024

If leaving, e.g., factor and double untranslated, perhaps it might be better to use "Coercing '%s' column %s to type '%s' to match type of %s.\n", with extra quotes. Otherwise it might look a bit jarring in languages with non-Latin scripts.

@MichaelChirico
Copy link
Member

There is some improvement here after #6603. The only fragment-ish piece left is this one:

if (verbose) catf("Coercing %s column %s%s to type integer64 to match type of %s.\n", wclass, nm[1L], if (wclass=="double") " (which contains no fractions)" else "", nm[2L])

How doable is a translation with only one optional parenthesized clause being hidden in %s?

@rikivillalba
Copy link
Contributor Author

I think %s%s is ok, at least in spanish. Only that the parentherized explanation should be enclosed in gettext()
I cite the final version here:

if (verbose) catf("Coercing %s column %s%s to type integer64 to match type of %s.\n", wclass, nm[1L], if (wclass=="double") " (which has integer64 representation, e.g. no fractions)" else "", nm[2L])

Perhaps for aesthetics, all "coercing" messages could be pushed down to coerce_col, and only pass the required clarifications at the end of the message as a parameter, to append at the end. This would reduce the translation need.

@rikivillalba
Copy link
Contributor Author

rikivillalba commented Dec 21, 2024

I pushed a branch in which the "coercing" messages are moved to coerce_col. Some notes:

  • Currently, tests 2297.01 and 2297.02 silently match a "for join" wording after "to type double" which appears to be ok but in my branch the wording come before the type, because of the format template. This and other tests were adapted although.
  • Coerce_col does the coercions, but some coercions on bmerge were done "inline". So I moved them to coerce_col. There is the case of 'factor' to 'character' in which attributes were not copied (ok as it prevents 'levels' to go to the character column), so that logic was added to coerce_col also.

If you think it is of value i'll PR the proposal.

Besides that think this issue is closed for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
translation issues/PRs related to message translation projects
Projects
None yet
Development

No branches or pull requests

3 participants