You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
It's present in the Go implementation and its documentation too, and I think it's a reasonable function. But when the spec in this repo was derived from the spec I wrote for the Go implementation, tricky parts were sadly just thrown out rather than flagged as differences.
In this particular case, the feature is tricky to implement in the Java implementation: it's technically easy to expose java.lang.String's codepoint iterator, but doing so would reveal that all Starlark strings in Bazel are actually mis-decoded (UTF-8 files and directories are decoded as Latin1). Hiding the codepoints iterator (and also the chr function and "%c" format) makes it impossible for Bazel users to notice.
This is an unfortunate case where an old mistake in Bazel (mine, BTW) makes it hard to "do the right thing" in Starlark.
Would the right thing to be to add it to Starlark, but make a caveat that the Java implementation isn't compliant here? If historical mistakes in Bazel prevent progress on Starlark there might be a number of issues where we are unable to make forward progress.
Yes, I think that's appropriate. We could even add the method to Java, but make it fail dynamically subject to a flag that is disabled in Bazel as long as Bazel suffers from bazelbuild/bazel#374 and related bugs.
The spec at https://github.com/google/skylark/blob/3705afa472e466b8b061cce44b47c9ddc6db696d/doc/spec.md#string%C2%B7codepoints has a function codepoints, which gives the string as a series of points. That seems pretty reasonable to provide. Why did it disappear? The Rust Starlark implementation provides it, based on the old spec.
The text was updated successfully, but these errors were encountered: