Name spaces #36

Open
TelosTelos opened this issue Jan 21, 2021 · 2 comments

Comments

@TelosTelos
Collaborator

We've talked elsewhere about potentially introducing some analog of Python's namespaces. Currently, we do this only for the arguments of out-of-line functions, giving them mangled names to keep them distinct from like-named variables elsewhere. All other variables used within functions are effectively treated as global.

In Python, each call of a function generates its own distinct namespace, so, e.g., in recursive calls, each layer in the recursion can have different values for its (local) variables. To implement that, we'd need to use a stack and lots of push/pop operations, and we would likely find that the cost in processor cycles is too steep for whatever minor benefits this might give.
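To make concrete what we'd be giving up, here is a small plain-Python sketch (not Pyndustric output; `depth_local`, `depth_shared` and `slot` are names made up for this illustration). The first function relies on each call having its own `n`. The second simulates what happens when `n` lives in a single shared slot, as a mangled argument name would.

```python
def depth_local(n):
    """Each call gets its own n, so n is intact after the recursive call."""
    if n == 0:
        return 0
    below = depth_local(n - 1)
    return below + n  # n still holds this call's value


# Simulate one mangled slot that every call of the function shares.
slot = {"n": 0}


def depth_shared():
    """Same logic, but n lives in one shared slot instead of a per-call namespace."""
    if slot["n"] == 0:
        return 0
    slot["n"] -= 1  # "passing" n - 1 overwrites the caller's n
    below = depth_shared()
    return below + slot["n"]  # the inner calls have already clobbered n


slot["n"] = 3
print(depth_local(3), depth_shared())  # prints "6 0"
```

Without a stack to save and restore `n` around the recursive call, the two versions disagree, which is why recursion is set aside in what follows.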

If we set aside cases of recursion, we could get a decent approximation of Python namespaces by mangling any variable names that we want to treat as being "local" to some particular function. In Python, the global/local distinction is handled by explicit declaration, and by a rule that says that, in the absence of explicit guidance, every assignment/"write" output must be local, whereas all other (input/"read") uses of variables will go to the smallest scope in which they are defined. This means that same-looking variables may end up accessing quite different memory addresses at different points in a function, and which they access may depend upon conditions that can't be known at compile-time.

def foo():
    print(b)  # b is global
    if c: b = 0  # b may now be local, or may not, depending on c
    print(b)  # will access the local b, if we just made one, else the global b
    del b
    print(b)  # now b is global again

I don't think there's any good way to do all this with name-mangling. We can handle references to global b by leaving b unmangled, and we can handle references to a quasi-local b by using a mangled version like __foo_b. But whichever one of these we pick for the middle print(b) above, it won't always give the same behavior as Python does. (That might be a good thing, since Python's way of doing this confuses and frustrates many novices.)

If we want to implement some version of this, we'll need a further simplification away from Python (in addition to the no-new-namespace-upon-recursion simplification). That then raises the question: what rules should we use to decide which variables count as "local" in the absence of explicit declaration? One fairly plausible candidate rule would be to say that any variable that the compiler hasn't yet seen written to within the function defaults to being global/unmangled, but starting with the first line where it gets written to (even within an if statement) it is local/mangled for the rest of the function. Another plausible candidate would say that each variable is either global-throughout-the-function or local-throughout-the-function (no switching like you can have in Python), and that being written to anywhere in a function (even within an if statement) makes a variable count as local throughout the function. This latter option fits better with our "once type C, always type C" rule, but it would require mangling to be done on a later pass, not at initial compilation.
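As a rough sketch of how the two candidate rules differ, the snippet below applies each one to a small function using Python's standard `ast` module. This is only an illustration and not how the Pyndustric compiler is structured: `FirstWriteMangler`, `WholeFunctionMangler`, `written_names` and `mangle` are hypothetical names, the `__foo_b` naming just follows the example above, and `ast.unparse` assumes Python 3.9 or newer.

```python
import ast

SOURCE = """
def foo():
    print(b)
    if c:
        b = 0
    print(b)
"""


def written_names(func):
    """Names assigned anywhere in the function body (the test used by rule 2)."""
    return {n.id for n in ast.walk(func)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}


class FirstWriteMangler(ast.NodeTransformer):
    """Rule 1: a name is global until its first write, then local afterwards."""

    def __init__(self, func_name):
        self.prefix = f"__{func_name}_"
        self.local = set()

    def visit_Assign(self, node):
        # Visit the right-hand side first: it is evaluated before the write.
        node.value = self.visit(node.value)
        node.targets = [self.visit(t) for t in node.targets]
        return node

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Store):
            self.local.add(node.id)
        if node.id in self.local:
            node.id = self.prefix + node.id
        return node


class WholeFunctionMangler(ast.NodeTransformer):
    """Rule 2: a name written anywhere in the function is local throughout."""

    def __init__(self, func_name, local):
        self.prefix = f"__{func_name}_"
        self.local = local

    def visit_Name(self, node):
        if node.id in self.local:
            node.id = self.prefix + node.id
        return node


def mangle(source, rule):
    """Return the function's source after applying candidate rule 1 or 2."""
    func = ast.parse(source).body[0]
    if rule == 1:
        mangler = FirstWriteMangler(func.name)  # single pass, in source order
    else:
        mangler = WholeFunctionMangler(func.name, written_names(func))  # needs a pre-scan
    for stmt in func.body:
        mangler.visit(stmt)
    return ast.unparse(func)


print(mangle(SOURCE, 1))
print(mangle(SOURCE, 2))
```

Under rule 1 the first `print(b)` stays unmangled and only the later uses become `__foo_b`; under rule 2 all three uses become `__foo_b`. The pre-scan in `written_names` is the extra pass that the second rule requires.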

My own inclination here is to just say "Everything but function arguments is global, so be careful about using the same variable for different things!" But anyway, I've been reworking name-mangling to fit better with the new type-detection system, so now would be a good point to add some further approximation of namespaces, if we want it. This may also make a difference to whether I do name-mangling in the first pass, which can be slightly more efficient, or wait to do mangled substitutions later, as would be needed if we opted for something like the latter candidate rule above.

@TelosTelos
Collaborator Author

Well, I've tentatively implemented this using the first candidate above: within functions each variable is treated as global until the first instance where it gets overwritten by some operation; from that point on it is then treated as "local" -- i.e. it thereafter is replaced with a mangled version of its name (regardless of whether that operation was inside an if clause). I still need to add handling for the explicit global declaration to veto automatic localization of set variables.

@Lonami
Owner

Lonami commented Jan 24, 2021

I've said this a few times before but I'll repeat it here, now that it has a proper issue.

Python also has this problem with scopes, and if it detects "read global, set global", it will fail:

>>> x = 0
>>> def f():
...     x += 1
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
UnboundLocalError: local variable 'x' referenced before assignment

The correct solution is to declare it with global:

>>> def f():
...     global x
...     x += 1
...
>>> f()
>>> x
1
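This can be checked without calling anything, because CPython decides which names are local when it compiles the function body, not when it runs it. A minimal check (the function names `without_global` and `with_global` are made up for this example):

```python
def without_global():
    x += 1  # assignment makes x local to the whole function


def with_global():
    global x
    x += 1  # x refers to the module-level name


# co_varnames lists the names the compiler treats as locals.
print(without_global.__code__.co_varnames)
print(with_global.__code__.co_varnames)
# co_names lists global and attribute names the code refers to.
print(with_global.__code__.co_names)
```

In the first function `x` shows up among the locals; in the second it does not and appears in `co_names` instead.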

nonlocal also exists, and it's for nested functions. Pyndustric should do the same. This is the least surprising (and most sane) behaviour. No guessing.

Implementation-wise it's trivial: everything in a function is namespaced and local to that function, unless there is a global statement for it in the function body. It is probably enough to scan the entire body for these global statements when we find a def and store those special names somewhere (a "do not namespace these" set).
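A minimal sketch of that scan, using Python's standard `ast` module rather than the real compiler. `Namespacer` and `namespace_function` are hypothetical names, the `__f_` prefix is just one possible mangling scheme, leaving the called function's name (e.g. `print`) unmangled is an assumption of this sketch, and `ast.unparse` assumes Python 3.9 or newer.

```python
import ast

SOURCE = """
def f():
    global x
    x += 1
    y = x * 2
    print(y)
"""


class Namespacer(ast.NodeTransformer):
    """Prefix every name in a function except those declared global."""

    def __init__(self, func_name, keep_global):
        self.prefix = f"__{func_name}_"
        self.keep_global = keep_global

    def visit_Call(self, node):
        # Assumption: the callee's own name is left alone; only arguments are visited.
        node.args = [self.visit(a) for a in node.args]
        node.keywords = [self.visit(k) for k in node.keywords]
        return node

    def visit_Name(self, node):
        if node.id not in self.keep_global:
            node.id = self.prefix + node.id
        return node


def namespace_function(source):
    """Return the function's source with non-global names namespaced."""
    func = ast.parse(source).body[0]
    # The "do not namespace these" set: names listed in any global statement.
    keep_global = {name
                   for node in ast.walk(func) if isinstance(node, ast.Global)
                   for name in node.names}
    namespacer = Namespacer(func.name, keep_global)
    for stmt in func.body:
        namespacer.visit(stmt)
    return ast.unparse(func)


print(namespace_function(SOURCE))
```

Here `y` becomes `__f_y` everywhere in the body, while `x` is left untouched because it appears in the global statement.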

Note: this should be implemented in at least two commits (one for the scopes, another to support global).

2 participants