
Function returns get muddled when an op calls two functions #26

Open
TelosTelos opened this issue Jan 12, 2021 · 3 comments

@TelosTelos (Collaborator)

Consider the following source code:

def f(i): return i
x = f(1)+f(2)

This currently compiles to instructions that conclude with op add x __pyc_ret __pyc_ret, which sets x equal to something plus itself, when it should clearly be adding two different values together.

The problem is that the current code uses a single dummy variable __pyc_ret for all function returns. An instruction like this one needs the values returned from two separate calls, but since both were stored in the same variable, the second return overwrites the first.
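To make the collision concrete, here is a minimal Python simulation of the shared return register (the dictionary and names are illustrative, not the compiler's actual internals):

```python
# Minimal simulation of a single shared return register
# (illustrative only, not the compiler's real internals).
registers = {}

def call_f(i):
    # Every call writes its result into the same cell, __pyc_ret.
    registers["__pyc_ret"] = i

# Naive compilation of x = f(1) + f(2): both calls run first,
# then the add reads __pyc_ret twice.
call_f(1)   # __pyc_ret = 1
call_f(2)   # __pyc_ret = 2, overwriting the first return
x = registers["__pyc_ret"] + registers["__pyc_ret"]  # op add x __pyc_ret __pyc_ret
print(x)    # 4, but the source expression f(1) + f(2) should give 3
```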

A general (slightly overkill, slightly inefficient) solution would be to always copy the returned value into a unique dummy variable right when it is returned, and then use that rather than __pyc_ret. More efficiently (in instruction/cycle count, but much harder for us to write), we could try to detect cases where a collision would occur, and add only as many dummy variables as it takes to avoid the collision. Detecting these cases gets even thornier once we allow complex expressions: e.g., x = f(1)+0+f(2) is perfectly fine, whereas x = 0+f(1)+f(2) requires that one of the returned values be cached in another variable to get the right answer.
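The overkill approach can be sketched in Python like this (the __pyc_temp_N naming and the fresh_temp helper are hypothetical, just to show the copy-after-call rule):

```python
# Sketch of the "overkill" fix: copy every return into a fresh temp
# right after the call, then use the temps in the expression.
# (__pyc_temp_N and fresh_temp are illustrative names, not the real compiler's.)
registers = {}
temp_counter = 0

def fresh_temp():
    """Allocate a unique dummy variable name for one returned value."""
    global temp_counter
    temp_counter += 1
    return f"__pyc_temp_{temp_counter}"

def call_f(i):
    registers["__pyc_ret"] = i

# Compilation of x = f(1) + f(2) with the copy-after-call rule:
call_f(1)
t1 = fresh_temp()
registers[t1] = registers["__pyc_ret"]   # set __pyc_temp_1 __pyc_ret
call_f(2)
t2 = fresh_temp()
registers[t2] = registers["__pyc_ret"]   # set __pyc_temp_2 __pyc_ret
x = registers[t1] + registers[t2]        # op add x __pyc_temp_1 __pyc_temp_2
print(x)  # 3, the correct result, at the cost of one extra set per call
```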

It might help somewhat to give each function its own unique variable to return values in, e.g. f() gets __f_return_value and g() gets __g_return_value. This would keep g's returns from overwriting f's and vice versa. But this doesn't help in the case where f gets called twice, nor in the case where g calls f again before we got around to using f's last returned value... Quite a mess!

@TelosTelos (Collaborator, Author)

Maybe cases of overkill wouldn't be that hard to detect in an optimization pass. First compile with the overkill method above, which copies every function return to a unique temp variable. Then, in an optimization pass, look for sequences of the form "temp22 = return_variable_for_function_f" followed later by "any op involving temp22", with no other uses of temp22 and no intervening calls to function f (or to anything else that might have jumped to f). In any such sequence, temp22 was an unnecessary middleman, so we could cut it out of the process and thereby trim an instruction. Depending on how closely you inspect intervening jumps, this might not catch quite all cases of overkill (e.g., some jumps might appear suspicious when actually they wouldn't have altered f's return value). But this would give the "optimal" result in many cases (though not nearly so optimal as in-lining functions would be!) and fall only one instruction short in the rest, which isn't bad.
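That peephole pass could be sketched roughly as follows, assuming a simplified tuple-based instruction list rather than the compiler's real internal representation (the instruction format and the _ret suffix convention here are assumptions). It conservatively treats any intervening call as something that might clobber the return register:

```python
# Sketch of the proposed peephole pass: fold "set tempN __f_ret" into
# the single later instruction that reads tempN, provided no intervening
# call could have overwritten __f_ret. Instruction format is illustrative:
# ("set", dst, src), ("op", opname, dst, a, b), ("call", fn).

def fold_return_copies(instructions):
    out = list(instructions)
    for i, ins in enumerate(out):
        if ins[0] != "set" or not ins[2].endswith("_ret"):
            continue
        temp, ret = ins[1], ins[2]
        # Indices of every later occurrence of the temp as an argument.
        uses = [j for j in range(i + 1, len(out))
                for arg in out[j][1:] if arg == temp]
        # Fold only if the temp is read exactly once and no intervening
        # call could jump back into f and clobber the return register.
        if len(uses) == 1 and not any(out[k][0] == "call"
                                      for k in range(i + 1, uses[0])):
            j = uses[0]
            out[j] = tuple(ret if a == temp else a for a in out[j])
            out[i] = None  # the middleman copy is no longer needed
    return [ins for ins in out if ins is not None]

prog = [
    ("call", "f"),
    ("set", "temp22", "__f_ret"),
    ("call", "g"),                      # blocks folding temp22 (g might call f)
    ("set", "temp23", "__g_ret"),
    ("op", "add", "x", "temp22", "temp23"),
]
print(fold_return_copies(prog))
```

On this input, only temp23 gets folded: the copy to temp22 must stay because the call to g sits between it and its use, which matches the "no intervening calls" caveat above.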

@Lonami (Owner) commented Jan 12, 2021

Yeah, the more things we uncover, the more I feel the optimization-pass machinery will have to be set up. Much like there's now a _Jump instruction with a specific field for the label to jump to, we might want specific _Set and _Op instructions so the optimization pass can easily analyze their parameters (otherwise we need to parse the emitted mlog to find the variable names, and changing them gets messier).

@TelosTelos (Collaborator, Author)

My approach to complex expressions automatically incorporated the "overkill" solution to this problem. So now at least we get code that performs as expected, though as mentioned above, an optimization pass could often trim one instruction per call. Since functions are far from optimized in other ways, this bit of optimization may not be all that high a priority.
