Code review #2

jgonggrijp · 2021-08-06T16:11:38Z

The purpose of this PR is not to actually merge the changes, but to illustrate my comments with code. It is a work in progress. For most convenient consumption, I suggest reading the commits one by one.

This comment also applies to the generated parser itself, but I'm illustrating with the JSON example runner because all the variations are visible within a short section of code: (1) const/let, (2) var and (3) the absence of any such keyword. Mixing (1) and (2) is frowned upon; it makes the scope of variables unpredictable. (1) is newer, hence often considered better. In this case, however, I would suggest that you consistently go with (2), at least in the generated parser module. There are two reasons for my recommendation: 1. var has roughly the same scope semantics as local variables in Python. Since you are transpiling from Python, this makes it easier to generate reliable code. 2. var is older, hence portable to more different JS environments. (3), while allowed, should be avoided at all cost, as it always creates global variables (which is not the same as var at module scope). The fact that this form is allowed at all should be considered a historical artifact.

You can often get away without the semicolons, but there are corner cases where it can cause the JS interpreter/compiler to do the wrong thing, especially after inadvertent subsequent edits. It is generally considered better style to finish every statement with a semicolon AND a line end. This also speeds up JS parsing in some cases. This comment only applies to the example runner; it appears that the generated parser consistently includes the semicolons. Nevertheless, including them in your example code makes a better impression and also reduces the probability that users will accidentally break the code through editing.

JS users are used to dense code, with only single blank lines between blocks or occasionally inside long blocks. You can stretch this a bit by using double blank lines before classes and standalone functions like in PEP8, but you will have to be very consistent so that it is clear what convention you're maintaining. JS users will get irritated seeing a blank line before a closing semicolon or a blank line separating a single line of code from the rest of a function body.

@param

Using multiline strings as documentation is not a thing in JS; you are expected to always use comments, which should precede the thing they document. The style I've demonstrated in the commit could be broadly referred to as a literate commenting style, which is for example practiced in Underscore and related libraries. It can be read as-is together with the code, or it can be used to present the comments and the code side by side, as illustrated below. Needless to say, this necessitates writing a separate manual by hand (second link). https://underscorejs.org/docs/underscore-esm.html https://underscorejs.org A possible alternative style, which is very popular, is the jsdoc format: /** * Short description * Possibly longer explanation with examples * @param name description * @returns description */ This is more geared to automatically generating reference documentation that is separate from the code. In order for such separate documentation to be useful, the comments need to be quite extensive, which can render the code itself less readable. It should also be noted that the generated reference documentation is generally still not sufficient to completely replace a purpose-written manual. https://github.com/lodash/lodash/blob/c84fe82760fb2d3e03a63379b297a1cc1a2fce12/lodash.js#L459

Two-space indents are often considered a convention in JavaScript. Even in the JS community, however, there is sizable subpopulation that appreciates larger indents for better readability. Ultimately, it is important not to do what is most popular but what you believe is best. If you subscribe to PEP8's four-space indent recommendation, there is no reason not to apply it in JS as well. (The reason two-space indents became fashion in JS is also outdated. When JS programmers were still writing module wrappers by hand and there was no async/await syntax yet, indentation tended to go very deep very quickly. This produced an incentive to sacrifice readability in exchange for needing to wrap lines less often. There is no longer a need to make this sacrifice.)

JS users expect the word "polyfill" to mean something very specific, i.e., code that emulates a builtin feature of *JavaScript* if it is not available by an outdated target environment. Using the word differently might cause confusion.

When possible, avoid abbreviated names because they cause cognitive strain. Highly idiomatic single-letter variables such as i/j/k in a loop can be acceptable. An additional remark on this snippet: it appears that the re_ parameter is just an alias of the Python re module emulation a few lines up. Avoid this pattern if possible; using re directly will make the code much easier to understand.

JavaScript's modern iterator support is of course convenient for transpiling Python generators. Unfortunately, they do have a performance cost. for-of loops also work on plain arrays, so when it is possible to replace a generator function by a normal function that returns a plain array, it is likely that this will make your parser faster. for-of loops on arrays, in turn, should be replaced by old-fashioned for-;; loops for raw performance. The for-of construct involves converting the array to iterator first through the Array[@@iterator] method and then calling iterator#next on each loop iteration, inducing considerable function call overhead. Of course, these considerations only apply to hot loops. An added bonus of reducing iterators and for-of constructs is that compatibility transpilers like Babel will produce less boilerplate code.

Introducing a library dependency like Underscore allows you to bring back much of the notational elegance that Python brings through itertools and functools, plus some JavaScript-specific extras. Pros: - Shorter, more elegant, more maintainable code. - Removes the need for many of the generic functions that you have written in lines 87-333. - Completely portable; removes need for many constructs that cause compatibility transpilers to generate boilerplate code or introduce polyfills. Cons: - Performance penalty is similar to for-of due to function call overhead. Nothing beats the raw performance of bare for-;; loops. - Having a dependency is perceived as a burden in the JS community. Underscore is however sufficiently small that its added weight can be completely offset by the reduced code size, especially after compatibility transpilation. - No support for iterators (although this can be easily patched). This commit serves to illustrate how Underscore could be used. Some of the code changes might be better left out. In general, I recommend to use Underscore where possible and for-;; where necessary. For completeness, I will mention that I maintain Underscore and that there are alternatives such as Lodash and Ramda.

Please ignore this commit.

Could use some polishing.

jgonggrijp added 11 commits August 6, 2021 14:27

"Polyfill"

1348cb0

JS users expect the word "polyfill" to mean something very specific, i.e., code that emulates a builtin feature of *JavaScript* if it is not available by an outdated target environment. Using the word differently might cause confusion.

Remove trailing spaces

073fb4e

Please ignore this commit.

Whitespace around comment blocks

586aa7c

Could use some polishing.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code review #2

Code review #2

jgonggrijp commented Aug 6, 2021

Code review #2

Are you sure you want to change the base?

Code review #2

Conversation

jgonggrijp commented Aug 6, 2021