Function returns get muddled when an op calls two functions #26

TelosTelos · 2021-01-12T08:00:15Z

Consider the following source code:

def f(i): return i
x = f(1)+f(2)

This currently compiles to instructions that conclude with op add x __pyc_ret __pyc_ret, which you'll notice is setting x equal to something plus itself, whereas this clearly should instead be adding two different things together.

The problem is that the current code uses a single dummy variable __pyc_ret for all function returns, whereas an instruction like this that is trying to take into account values returned from two function calls needs these returned values not to have been stored in the same variable, with the second return overwriting the first.

A general (slightly overkill, slightly inefficient) solution would be to always copy the returned value into a unique dummy variable right when it is returned and then to use that rather than __pyc_ret. A bit more efficiently (in instruction/cycle count, but much harder for us to write), we could try to detect cases where a collision would occur, and only add just as many dummy variables as it takes to avoid the collision. Detecting these cases gets even thornier once we allow complex expressions: e.g., x= f(1)+0+f(2) is perfectly fine, whereas x = 0+f(1)+f(2) requires that one of the returned values be cached in another variable to get the right answer.

It might help somewhat to give each function its own unique variable to return values in, e.g. f() gets __f_return_value and g() gets __g_return_value. This would keep g's returns from overwriting f's and vice versa. But this doesn't help in the case where f gets called twice, nor in the case where g calls f again before we got around to using f's last returned value... Quite a mess!

The text was updated successfully, but these errors were encountered:

TelosTelos · 2021-01-12T08:38:06Z

Maybe cases of overkill wouldn't be that hard to detect in an optimization pass. First compile with the overkill method above that copies every function return to a unique temp variable. Then on an optimization pass, look for sequences of the form "temp22 = return_variable_for_function_f" then later "any op involving temp22" with no other uses of temp22, and no intervening calls to function f (or to anything else that might have jumped to f). In any such sequence, temp22 was an unnecessary middleman, so we could cut it out of the process and thereby trim an instruction. Depending on how closely you inspect intervening jumps, this might not catch quite all cases of overkill (e.g., some jumps might appear suspicious when actually they wouldn't have altered f's return value). But this would give the "optimal" result in many cases (though not nearly so optional as in-lining functions would be!) and only fall one instruction short in the rest, which isn't bad.

Lonami · 2021-01-12T19:12:16Z

Yeah the more things we uncover the more I feel the optimization pass machinery will have to be setup. Much like there's now a _Jump instruction with a specific field for the label to jump to, which might want specifics _Set and _Op instructions to easily analyze their parameters in the optimization pass (otherwise we need to parse mlog to find out the variable names and changing them gets messier).

TelosTelos · 2021-01-13T14:34:37Z

My approach to complex expressions automatically incorporated the "overkill" solution to this problem. So now at least we get code that performs as expected, though as mentioned above, an optimization pass could often trim one instruction per call. Since functions are far from optimized in other ways, this bit of optimization may not be all that high a priority.

TelosTelos mentioned this issue Jan 13, 2021

Supporting complex expressions like (1+2)+3 #2

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Function returns get muddled when an op calls two functions #26

Function returns get muddled when an op calls two functions #26

TelosTelos commented Jan 12, 2021

TelosTelos commented Jan 12, 2021

Lonami commented Jan 12, 2021

TelosTelos commented Jan 13, 2021

Function returns get muddled when an op calls two functions #26

Function returns get muddled when an op calls two functions #26

Comments

TelosTelos commented Jan 12, 2021

TelosTelos commented Jan 12, 2021

Lonami commented Jan 12, 2021

TelosTelos commented Jan 13, 2021