Rationalize (unify if possible) retrieval of symbol names for error messages #65
Comments
I thought that symbols were required to be in the native encoding
(making CHAR(PRINTNAME(...)) the most reasonable option), but I cannot
find it documented anywhere. Is it really so?
|
I'm not sure. That's the question. That version is the most common, as we see
in the counts provided by Luke. The question is whether all the others can be
safely converted to it, or whether they are needed for some reason, and if so,
a) what those possible reasons are, and b) whether the alternate is actually
being used in all cases where it is needed.
…On Mon, Aug 28, 2023, 2:47 PM aitap ***@***.***> wrote:
I thought that symbols were required to be in the native encoding
(making CHAR(PRINTNAME(...)) the most reasonable option), but I cannot
find it documented anywhere. Is it really so?
|
That is part of the question, i.e. is
|
I can work on checking this. I think it's possible using a combination of static call graph analysis and adding a check to

On the other hand, nothing currently prevents the C code from creating a symbol in non-native encoding:

#define R_NO_REMAP
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rdynload.h>

/* TRUE when R is running in a UTF-8 locale */
LibExtern Rboolean utf8locale;

SEXP ohno(void) {
    /* Allocate a bare symbol instead of going through Rf_install() */
    SEXP sym = PROTECT(Rf_allocSExp(SYMSXP));
    /* Deliberately pick the encoding that is NOT the native one */
    SET_PRINTNAME(sym,
        utf8locale ? Rf_mkCharCE("\xff", CE_LATIN1)
                   : Rf_mkCharCE("\xc3\xbf", CE_UTF8)
    );
    SET_SYMVALUE(sym, R_UnboundValue);
    SET_DDVAL(sym, 0);
    UNPROTECT(1);
    return sym;
}

Depending on whether the current locale is UTF-8 or not, it breaks in different ways.
|
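To illustrate why the two branches fail differently, here is a minimal standalone sketch at the byte level. It does not use R's API at all; `is_valid_utf8` and `latin1_to_utf8` are hypothetical helpers written for this illustration, and the validity check is simplified (it only checks lead and continuation byte structure). The Latin-1 string "\xff" is not well-formed UTF-8, so printing it unconverted in a UTF-8 locale yields an invalid byte sequence, while the UTF-8 string "\xc3\xbf" read as Latin-1 shows up as two unrelated characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of R: returns 1 if the NUL-terminated
   byte string s has well-formed UTF-8 lead/continuation structure,
   0 otherwise. Simplified: surrogates and some overlong forms are
   not rejected. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    while (*p) {
        int extra;
        if (*p < 0x80) extra = 0;                       /* ASCII */
        else if (*p >= 0xC2 && *p <= 0xDF) extra = 1;   /* 2-byte lead */
        else if (*p >= 0xE0 && *p <= 0xEF) extra = 2;   /* 3-byte lead */
        else if (*p >= 0xF0 && *p <= 0xF4) extra = 3;   /* 4-byte lead */
        else return 0;  /* stray continuation byte or invalid lead byte */
        p++;
        for (; extra > 0; extra--, p++)
            if ((*p & 0xC0) != 0x80) return 0;  /* also stops at NUL */
    }
    return 1;
}

/* Hypothetical helper, not part of R: converts Latin-1 to UTF-8.
   out must have room for 2 * strlen(in) + 1 bytes.
   Returns the number of bytes written, excluding the NUL. */
size_t latin1_to_utf8(const char *in, char *out)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t n = 0;
    for (; *p; p++) {
        if (*p < 0x80) {
            out[n++] = (char) *p;
        } else {
            /* Latin-1 code points 0x80..0xFF map to two UTF-8 bytes */
            out[n++] = (char) (0xC0 | (*p >> 6));
            out[n++] = (char) (0x80 | (*p & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Both literals from the example above denote the same character, but only a conversion step that knows the declared encoding can turn one byte sequence into the other. Code that takes the raw bytes of a symbol name without consulting its encoding mark skips that step.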
R symbols begin their life by being allocated using
spatch --recursive-includes allocSExp.cocci --dir R-devel/src
(Coccinelle speaks in unified diffs, and it indicates matches by marking
Let's follow the functions that allocate symbols:
|
Symbols ought to be treated as immutable, but it's worth checking for uses of
The only two users are Searching for This was about encoding safety. What about other operations on The The two uses of
What about the majority of the |
A patch would be great. Happy to review it. |
Work in progress on R-devel, but involves a bit of yak shaving. The best way to present a symbol in an error message is Once this is done, I intend to implement the function |
Endorsed by Luke