-
Notifications
You must be signed in to change notification settings - Fork 212
Add experimental compilation options tuners in Python #580
base: master
Are you sure you want to change the base?
Conversation
3cacf60
to
caf5130
Compare
This function might be useful in the future for compilation options search. One of the available options allows us to set the maximum shared memory used.
This function gives the time of execution of a given kernel with an input and specified options. Useful for benchmarking purposes.
Allow the user to set manually the new option privateDepth.
Given a CudaMappingOptions instance, return a Python dictionary that represents all the given options and their values.
caf5130
to
978b4de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good until 58caf8a (I've reviewed that code before in different PRs).
This review is on 58caf8a only. Consider calling your file utils.py
instead of my_utils.py
, it's a shared repository ;) Or chose another meaningful name.
Generally, I advise against using a flow based on global variables. From this commit alone it is not possible to see who else uses them and whether this use is consistent, so you probably want to split your PR into commits differently.
There are several places where sequential positions in a feature vector get translated to mapping options and back. It would be nice to have that localized in a single place, like a special bidirectional map stored somewhere. This would avoid problems when you want to change the list and forget one of the "translation" places.
I guess my last question is to @nicolasvasilache , what is the purpose if this code? Is it a one-off throw-away experiment? Do we expect to reuse it? Knowing this would help me adjust the expected code quality.
|
||
def get_convolution_example(size_type="default", inp_sz_list=[]): | ||
global INIT_INPUT_SZ | ||
INIT_INPUT_SZ = 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't it assigned 7
in line 10 anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but the idea is to have other types of examples. And so the INIT_INPUT_SZ would change.
print(options.tolist()) | ||
|
||
def set_tc(tc_code_arg, tc_name_arg): | ||
global tc_code, tc_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see these variables declared in this file. Also, do you really want to hide global variable assignment behind a function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's the (almost the) simplest way to do.
One a bit simpler way would be to call these two functions in the example generation function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Global variables are never a good idea. Especially if you want to run stuff in parallel eventually. I suggest creating a simple class Config
, putting these variables inside and passing instances of this class around. The flow of data is clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Global variables are gone. Thank you for the advice!
tc_name = tc_name_arg | ||
|
||
def set_inp(inp_arg): | ||
global inp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idem
inp = inp_arg | ||
|
||
def set_vars(tc_prog_arg, inp_arg, cat_val_arg, cat_sz_arg): | ||
global tc_prog, inp, cat_val, cat_sz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idem
def catVec_to_optVec(catVec): | ||
global cat_val | ||
opt = [cat_val[i][catVec[i]] for i in range(NB_HYPERPARAMS)] | ||
#if(opt[18] * opt[17] > 1024) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we prefer not to commit commented-out code, especially full of magic constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code is obsolete, but it was supposed to help handling the constraints on blocks and threads.
I can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's obsolete, definitely not commit it
def getAllDivs(inp, maxp2=8): | ||
p2 = [] | ||
pp=1 | ||
for i in range(maxp2+1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a list comprehension instead of the loop and mutable temporaries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
for sz in elem.shape: | ||
l += computeDivs(sz) | ||
my_list = list(set(l+p2)) | ||
return np.sort(my_list).tolist() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels like a standard sorted(my_list)
would have worked just fine here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
return np.sort(my_list).tolist() | ||
|
||
def computeCat(inp_arg): | ||
global cat_sz, cat_val, inp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these global variables are not defined in this file
cat_val = [] | ||
|
||
divs = getAllDivs(inp) | ||
#divs2 = getAllDivs([np.array([tc.tclib.shared_memory_size()])]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commented-out code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uncommenting this line and one below has the effect of enabling maxsharedoption.
I could certainly add a facultative (global) boolean variable in the initialization.
divs = getAllDivs(inp) | ||
#divs2 = getAllDivs([np.array([tc.tclib.shared_memory_size()])]) | ||
|
||
cat_val.append([0,1,2]) #0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the comments in this function relate the index of the respective value in the vector of tunable parameters. Would be nice to have at least a comment about that.
More generally, I would suggest having a bi-directional mapping between indices and names in one place and just using it everywhere else.
I find this much more readable
allowed_values = [[] for i in range(...)]
allowed_values[MapOptIdx.nTiledDims] = [i + 1 for i in range(6)]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will work on that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. What do you think?
Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours has expired. Before we can review or merge your code, we need you to email [email protected] with your details so we can update your status. |
978b4de
to
5aa8ad7
Compare
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
034a112
to
dc73868
Compare
Also introduce a new folder where will be all the files related to experimentingnew ways of tuning options.
generate_genetic_options will save a file containg the best found options by the genetic algorithm. eval_genetic_options would benchmark these options from the file.
These options must be given in their vector format. See my_utils for more info
Generate a graph of the performance of a random search for a given kernel and inputs sizes.
Generates a set of random options and tries to predict the execution time.
This algorithm uses a variant of experience replay, and one policy per option to predict.
Generate options using UCT algorithm. Useful for benchmarking.
dc73868
to
90baabe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trying to unsubscribe, don't see a way other than approving
These commits implement more functions in the python binders,
and also add different options tuning experiments in Python.