-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle non-UTF-8 CLI args #7
Conversation
@@ -67,6 +68,9 @@ format_arg_extract_err = \err -> | |||
InvalidValue reason -> | |||
"The value provided to $(option_display_name option) was not a valid $(option_type_name option): $(reason)" | |||
|
|||
InvalidUnicode -> | |||
"The value provided to $(option_display_name option) was not valid UTF-8." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The err is InvalidUnicode
which tactfully sidesteps the u8/u16 thing, but the message here says utf-8, should we "invalid unicode"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably call this InvalidUtf8
, actually. The idea is that an Arg
is either a U8
or a U16
list, but Str
can only be valid UTF-8. We try to convert either the U8
list or the U16
list to UTF-8, and if that fails this error is raised.
@@ -75,6 +79,9 @@ format_arg_extract_err = \err -> | |||
InvalidValue reason -> | |||
"The value provided to the '$(param |> .name)' parameter was not a valid $(param |> .type |> full_type_name): $(reason)." | |||
|
|||
InvalidUnicode -> | |||
"The value provided to the '$(param |> .name)' parameter was not valid UTF-8." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
## Convert an [Arg] to a list of bytes. | ||
to_bytes : Arg -> List U8 | ||
to_bytes = \@Arg arg -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this used?
Your pickup on my PR was about these not being bytes for Windows. Not sure if that is relevant here, just making a note in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new Opt.bytes
function that converts Arg
to bytes if someone wants the underlying bytes from an Arg
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for Windows with its U16
codepoints instead of bytes, I convert to bytes which is a valid thing to do, it's just sidestepping the OS awareness
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly people should either use Opt.arg
for actual handling or Opt.str
for decoding to a proper string.
Properly handle the real encoding of command-line arguments which is:
In addition, provide
Param.bytes
/Opt.bytes
andParam.arg
/Opt.arg
to allow engagement with the proper encoding by Weaver users.