I did a bit of analysis based on the Faster CPython team's current benchmarking results. It's clear from this that many benchmarks run more times than necessary to obtain a consistent result. Reducing the number of processes spawned for some of them would recover a large fraction of the overall runtime. In most cases that saving is doubled, because you need to benchmark both a head and a base commit.
The easy part is to pass processes= to the Runner constructor of individual benchmarks based on this analysis. But we also want to continuously confirm that the analysis remains correct. Obviously, if a benchmark (or one of its dependencies) changes, that invalidates the analysis, but that happens fairly infrequently and we try to revalidate the benchmarks when it does. It's more likely that a change in the Python runtime makes a benchmark more or less stable, so we need to detect that automatically.
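For reference, here is a minimal sketch of what that would look like in a benchmark script, assuming the standard pyperf API; the workload and the value processes=5 are placeholders that the analysis would supply:

```python
# Minimal sketch of a pyperf benchmark that caps the number of worker
# processes. Only the processes= argument matters here; the workload and
# the value 5 are illustrative placeholders.
import pyperf


def bench_workload(loops):
    # Hypothetical inner loop standing in for a real benchmark workload.
    t0 = pyperf.perf_counter()
    for _ in range(loops):
        sum(range(10_000))
    return pyperf.perf_counter() - t0


if __name__ == "__main__":
    # pyperf spawns 20 worker processes by default; the analysis tells us
    # how many are actually needed for a stable result.
    runner = pyperf.Runner(processes=5)
    runner.bench_time_func("bench_workload", bench_workload)
```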
Therefore, I propose:
Adding a new message for when a benchmark runs too many times:
Benchmark ran more times than was necessary to obtain a consistent result. Consider passing processes=N to the Runner constructor.
Adding a message for when a benchmark runs too few times. This could probably piggyback on the existing message that warns about a high standard deviation, just with a slightly different calculation for when it is displayed. We can't really determine how many additional loops are needed, so the existing advice there of "Try to rerun the benchmark with more runs, values and/or loops" wouldn't change. (A rough sketch of both checks follows this list.)
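Here is a rough sketch of one way the two checks could work, given the per-process values collected for a benchmark. This is not existing pyperf code; the thresholds (a 1% convergence tolerance, 10% relative standard deviation, a slack of 2 processes) are assumptions for illustration:

```python
# Illustrative sketch only -- not existing pyperf code. Given the list of
# per-process values for a benchmark, it flags runs that were either too
# short (high variability) or longer than needed (the mean had already
# converged). All thresholds are assumed values.
import statistics


def minimal_processes(values, rel_tolerance=0.01):
    """Smallest n such that every running mean over values[:k] for k >= n
    stays within rel_tolerance of the mean of all the values."""
    final_mean = statistics.mean(values)
    needed = len(values)
    for n in range(len(values), 0, -1):
        if abs(statistics.mean(values[:n]) - final_mean) > rel_tolerance * final_mean:
            break
        needed = n
    return needed


def check_process_count(values, rel_tolerance=0.01, slack=2, max_rel_stdev=0.10):
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)

    if stdev > max_rel_stdev * mean:
        # Too few runs: piggyback on the existing high-stdev warning.
        print("The standard deviation is high; try to rerun the benchmark "
              "with more runs, values and/or loops.")
        return

    needed = minimal_processes(values, rel_tolerance)
    if needed + slack < len(values):
        # Too many runs: the result had converged well before the end.
        print("Benchmark ran more times than was necessary to obtain a "
              f"consistent result. Consider passing processes={needed} "
              "to the Runner constructor.")
```

For example, if a benchmark produces 20 values whose running mean has already converged after the first 5, this would suggest processes=5, cutting that benchmark's process-spawning time to roughly a quarter per commit.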
Does this make sense to others, particularly @vstinner, who worked on the dynamic loop determination in the past? (That is related, but not the same -- everything here concerns the outermost process-spawning loop.)