-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
f58 not printing correctly on POSIX
locale
#3199
Comments
It looks like setting
|
We should be falling back to "f" when #include <stdlib.h>
#include <stdio.h>
#include <locale.h>
int main (int ac, char **av)
{
setlocale (LC_ALL, "");
printf ("MB_CUR_MAX=%ld\n", MB_CUR_MAX);
} (Edit: forgot that programs have to call It doesn't make sense to me that there is a difference between |
grondo@asp:~/git/flux-core.git$ locale
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
grondo@asp:~/git/flux-core.git$ ./mb
MB_CUR_MAX=1
grondo@asp:~/git/flux-core.git$ LANG=C.UTF-8 ./mb
MB_CUR_MAX=6 |
As seen above the idea was to solve this generally and it seems to at least nominally work on my system (and on TOSS systems IIRC). So I think this is a machine specific failure or some unanticipated corner case. If you are able to reproduce on another distro, please let me know. O/w, if we can start with the result of Thanks! |
Good to know. Thanks for that explanation!
Here is the output from Cori. It is running SLES15, so I wonder if it would reproduce on openSUSE 15. I might be able to give that a shot later this week.
It looks to me like the test script is behaving sanely. Hmm. |
Interesting! I can't get this to reproduce on TOSS3, so there must be some corner case we're hitting on that machine. I think we may have to do some debugging on Cori to figure out precisely where things are going wrong. We could start with the following simple patch, just to double check that we're getting the behavior we expect when diff --git a/src/common/libutil/fluid.c b/src/common/libutil/fluid.c
index d9f5fb1..d10897a 100644
--- a/src/common/libutil/fluid.c
+++ b/src/common/libutil/fluid.c
@@ -192,8 +192,12 @@ static int fluid_f58_encode (char *buf, int bufsz, fluid_t id)
}
/* Use alternate "f" prefix if locale is not multibyte */
- if (!is_utf8_locale())
+ if (!is_utf8_locale()) {
prefix = f58_alt_prefix;
+ fprintf (stderr, "Locale not multibyte capable, using 'f' FLUID prefix\n");
+ }
+ else
+ fprintf (stderr, "Using multibyte FLUID prefix\n");
if (bufsz <= strlen (prefix) + 1) {
errno = EOVERFLOW; |
Ah, I think we are getting closer:
|
Thanks! The static inline int is_utf8_locale (void)
{
/* Assume if MB_CUR_MAX > 1, this locale can handle wide characters
* and therefore will properly handle UTF-8.
*/
if (MB_CUR_MAX > 1)
return 1;
return 0;
} So |
Ah, I had a thought. I wonder if the Python version on Cori is initializing the locale differently when LANG/LC_ALL are not set vs the Python on my test systems. What does Looks like I've been testing with 3.6.8 and 3.6.9. |
A test that occurs to me is to use
|
I think you are on to something.
I'll try with their Python3.6 installation. |
I wonder if PEP538 has something to do with it? As far as I can gather, the coercion was introduced into Python 3.7.x. |
Looks like it isn't an issue with their Python3.6 installation:
I wonder if it is just a configure flag, or a version change (e.g., the PEP538 that you mentioned). |
Nevermind, it looks like the UCS 2 vs 4 configure option was really only relevant for python2. |
Can you try with the Python 3.7 version and set |
I don't understand why Python is detecting |
It appears that the shell I have on Cori cannot handle unicode properly at all:
Versus in the same terminal emulator but running locally on my mac:
So at the risk of stating the obvious, I'm guessing the issue is that Python3.7 is coercing the locale to a UTF-8 compatible one, so that Flux thinks it is safe to emit ƒ, but the underlying terminal truly cannot handle Unicode (it should stay POSIX/C). Anyone know what might be preventing my shell on Cori from properly handling Unicode? Maybe I just need to forward my EDIT: looks like I already had locale forwarding on, Nersc blocks the forwarding of environment variables. But that's just a convenience thing I guess. Still not sure where the Unicode breakage is coming from. EDIT2: It appears that it was a tmux-related issue. I was setting the locale within tmux to be UTF8 but the shell that the tmux server was spawned it was POSIX. After I added:
to my |
Would it make sense for the |
That is a good idea. However, remember that this will change the environment for all flux jobs as well, which is why we backed off of setting environment variables like this before in the I'm honestly not sure the best way to proceed. |
From coffee call we are going to try out the following solution:
|
@SteVwonder, want to see if the following patch works? Seems to at least make diff --git a/src/bindings/python/flux/util.py b/src/bindings/python/flux/util.py
index 6cb1ef5..d467799 100644
--- a/src/bindings/python/flux/util.py
+++ b/src/bindings/python/flux/util.py
@@ -17,6 +17,7 @@ import os
import math
import argparse
import traceback
+import locale
from datetime import timedelta
from string import Formatter
@@ -127,6 +128,10 @@ class CLIMain(object):
self.logger = logger
def __call__(self, main_func):
+ _, encoding = locale.getdefaultlocale()
+ if encoding in [ None, "C", "POSIX" ]:
+ os.environ["FLUX_F58_FORCE_ASCII"] = "1"
+
loglevel = int(os.environ.get("FLUX_PYCLI_LOGLEVEL", logging.INFO))
logging.basicConfig(
level=loglevel, format="%(name)s: %(levelname)s: %(message)s"
diff --git a/src/common/libutil/fluid.c b/src/common/libutil/fluid.c
index d9f5fb1..fb3b6c2 100644
--- a/src/common/libutil/fluid.c
+++ b/src/common/libutil/fluid.c
@@ -174,7 +174,7 @@ static inline int is_utf8_locale (void)
/* Assume if MB_CUR_MAX > 1, this locale can handle wide characters
* and therefore will properly handle UTF-8.
*/
- if (MB_CUR_MAX > 1)
+ if (MB_CUR_MAX > 1 && !getenv ("FLUX_F58_FORCE_ASCII"))
return 1;
return 0;
} |
Unfortunately, if
So for all of the previous outputs of
I don't understand why
Note: I added some extra logging to the util.py main decorator for the above code snippets: diff --git a/src/bindings/python/flux/util.py b/src/bindings/python/flux/util.py [15/124]
index 6cb1ef5cd..a2602d83d 100644
--- a/src/bindings/python/flux/util.py
+++ b/src/bindings/python/flux/util.py
@@ -17,6 +17,7 @@ import os
import math
import argparse
import traceback
+import locale
from datetime import timedelta
from string import Formatter
@@ -127,10 +128,18 @@ class CLIMain(object):
self.logger = logger
def __call__(self, main_func):
+ _, encoding = locale.getdefaultlocale()
+ print("Detected default locale: {}".format(encoding))
+ if encoding in [ None, "C", "POSIX" ]:
+ print("Forcing the use of ASCII for F58")
+ os.environ["FLUX_F58_FORCE_ASCII"] = "1"
+ else:
+ print("Using autodetection for unicode support for F58")
loglevel = int(os.environ.get("FLUX_PYCLI_LOGLEVEL", logging.INFO))
logging.basicConfig(
level=loglevel, format="%(name)s: %(levelname)s: %(message)s"
)
+
exit_code = 0
try:
main_func() |
Some more evidence in support of that theory:
|
Given that the issue appears to be on systems with absolutely no locale information set whatsoever (which is arguably a user/system configuration issue, not a Flux issue), how does this sound: We document this problem and potential solutions in our FAQ over on flux-docs. In particular, the preferred solution would be for the user to set Thoughts? |
Thanks @SteVwonder! Your observations above were not what I observed on TOSS3 with Python 3.7.2 loaded. In that case
I tried on a Mac with 3.7.3, but unfortunately calling
It seems like you should be able to detect the underlying, default locale from Python. I don't understand why this isn't working. At the very least I would like to get the output of Python commands ( I wonder if we should just test the environment variables ourselves, or write our own |
Maybe this simpler approach will work to detect the specific troublesome circumstance described here: diff --git a/src/bindings/python/flux/util.py b/src/bindings/python/flux/util.py
index 6cb1ef5..298d06a 100644
--- a/src/bindings/python/flux/util.py
+++ b/src/bindings/python/flux/util.py
@@ -119,6 +119,13 @@ def help_formatter(argwidth=40):
return lambda prog: FluxHelpFormatter(prog, max_help_position=argwidth)
+def defaultlocale():
+ for key in [ "LANG", "LC_ALL", "LC_CTYPE" ]:
+ if key in os.environ:
+ return os.environ[key]
+ return None
+
+
class CLIMain(object):
def __init__(self, logger=None):
if logger is None:
@@ -127,6 +134,10 @@ class CLIMain(object):
self.logger = logger
def __call__(self, main_func):
+ locale = defaultlocale()
+ if locale in [ None, "C", "POSIX" ]:
+ os.environ["FLUX_F58_FORCE_ASCII"] = "1"
+
loglevel = int(os.environ.get("FLUX_PYCLI_LOGLEVEL", logging.INFO))
logging.basicConfig(
level=loglevel, format="%(name)s: %(levelname)s: %(message)s"
diff --git a/src/common/libutil/fluid.c b/src/common/libutil/fluid.c
index d9f5fb1..fb3b6c2 100644
--- a/src/common/libutil/fluid.c
+++ b/src/common/libutil/fluid.c
@@ -174,7 +174,7 @@ static inline int is_utf8_locale (void)
/* Assume if MB_CUR_MAX > 1, this locale can handle wide characters
* and therefore will properly handle UTF-8.
*/
- if (MB_CUR_MAX > 1)
+ if (MB_CUR_MAX > 1 && !getenv ("FLUX_F58_FORCE_ASCII"))
return 1;
return 0;
} |
For future reference, on Python 3.7.2 I don't see any evidence that coercion of C locale to UTF-8 changes the environment of the Python process to set E.g. for 3.7.2:
|
Documented the FAQ workaround in flux-framework/flux-docs#61 |
For Python 3.7.4 from Anaconda on Cori:
|
Thanks! |
I'm running flux-core 0.19.0 on NERSC's Cori, and it is printing
_
rather thanf
orƒ
for the f58 prefix:The locale:
The text was updated successfully, but these errors were encountered: