-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial implementation of sequential batch optimization #74
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #74 +/- ##
==========================================
- Coverage 99.8% 98.07% -1.73%
==========================================
Files 17 18 +1
Lines 1013 1041 +28
==========================================
+ Hits 1011 1021 +10
- Misses 2 20 +18
Continue to review full report at Codecov.
|
Great job, this is great to start from. I think we can do some reshaping to make these features more part of the framework and less specific to the qEI implementation which is otherwise great. I'll add some notes inline. For the troubles you encounter, 1-2 aren't something to be super worried about, 3 is concerning. This can usually be remedied by adding some priors on the hyperparameters (or some things we are discussing in #73 ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets consider this first, afterwards we might even consider moving the qEI into the normal EI class?
Y = np.vstack((self.Y_,) + (self.fmin.value,)*len(args)) | ||
self.set_data(X, Y) | ||
|
||
def __enter__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two notes:
- I dislike the old style of doing contexts and prefer the contextmanager decorator (see https://jeffknupp.com/blog/2016/03/07/python-with-context-managers/ fun with contextlib).
- I think, with contextlib, this should be moved to Acquisition as the batch context is something which is potentially applicable to any acquisition implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about the following in acquisition:
class SequentialBatch(object):
def __init__(acquisition, size):
self.acquisition = acquisition
self.size = size
def optimize(optimizer):
points = []
def inverse_acquisition(X):
....
for j in range(self.acquisition.batch_size):
result = optimizer.optimize(inverse_acquisition)
batch.append(result.x)
self.acquisition.set_batch(np.concatenate(points))
return np.concatenate(points)
...
class Acquisition(Parameterized):
....
def set_batch(self, batch_points):
pass # To be implemented in subclasses, e.g. qEI
@contextmanager
def batch(size):
X_orig, Y_orig = self.data
yield SequentialBatch(self, size)
self.set_batch(np.empty((0, self.domain.size)))
self.set_data(X_orig, Y_orig)
Then in BayesianOptimizer you could do:
if self.batch_size > 1:
with self.acquisition.batch(self.batch_size) as batch:
points = batch.optimize(self.optimizer)
self._update_model_data(points)
with some further thought I think you could even see the one point strategy here as a batch with size 1 here and simply do this by default in BayesianOptimizer. What about my proposal for set_batch? It seems intuitive to simply pass in an array which is the current batch. Then if you pass a 0 x dim it signals clearing. This way has some advantages I think
- The interface for adding batch support to an acquisition implementation is limited to implementing set_batch.
- if we'd ever have asynchronous batches, once a point was evaluate you could simply perform
set_batch(2d array \ evaluated point)
and optimize. The great thing is that given this angle, all that would be needed to expand towards such approach is another object which is yielded from the context, e.g. AsyncSequentialBatch or something. Just a motivation for the design in the snippet)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing I just realized, with the pseudocode of the previous reply, the logic of selecting the points relays to a different policy object (the SequentialBatch). It would be perfectly possible to add a TrueBatch
which duplicates the domain and selects the points at once. Swtiching between both approaches would simply require to instantiate BayesianOptimizer with a different object in its constructor. This would be a very simple and clean way to make both approaches co-exist if we merge this branch with the other batch approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, thanks for the very detailed response! Sorry the delay from my part; the past week was busy.
Having a batch
contextmanager
in Acquisition
assumes that the initialization and finilization for the batch is irrespective of the batch policy. I think this is restrictive, particularly due to the fact that it cannot be "overwritten" by the SequentialBatch
class. This is why I tried to add the context in the final class itself. Note that, to my understanding, this cannot be done with contextlib
decorators. On the other hand, I agree that set_batch
should be included in the base Acquisition
class as it can be overwritten by other child classes.
If we only plan to use the contextmanager
to call optimize
, as you suggest doing in BayesianOptimizer
:
with self.acquisition.batch(self.batch_size) as batch:
points = batch.optimize(self.optimizer)
then there is no need to use the context, at least in BayesianOptimizer
. The function batch.optimize() should be responsible for initializing/clean up (either with a help of a context internally or by itself).
But I do believe that a context
is useful. It gives us safe access to penalised/modified acquisition functions via set_batch
, which would be useful for researching the effect of the batch heuristics used.
I could add a function optimise
in the batch acquisition class. It would be something like:
def optimize(optimizer):
points = []
def inverse_acquisition(X):
....
# Use the context here
with self:
# Build the batch...
return points
and the BayesianOptimiser
would simply do:
if self.batch_size > 1:
# No need for context here, it is used inside acquisition.optimize
points = acquisition.optimize(self.optimizer)
Finally, this might be on the pedantic side, but I would call the function get_suggestion
instead of optimize
. I think optimise
is not necessarily descriptive of the process done and it would make me wonder why not directly use optimizer.optimize
.
Looking forward to hearing your thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have discussed this today and considered some scenario's. I think we won't be able to solve everything in one PR unfortunately, we are proposing the following for now to get started which should be a nice compromise:
(1) do not implement a context for now
(2) Class wise, we'd like to go for the following
I favor your idea of naming (i.e. get_suggestion
) as optimize
would be confusing with GPflow, and confusing with optimizer.optimize
.
i.e., the get_suggestion
in sequential batch looks a lot like the sequential batch above and also contains the operation of the context currently. If a sequential strategy really diverts from the default it can override get_suggestion. Furthermore, the methods get_suggestion relies on go in the interface of the derived class (i.e. in SequentialBatchAcquisition, a set_batch is required, this method is missing from the other classes). In Acquisition get_suggestion
simply optimizes, and BayesianOptimizer calls acquisition.get_suggestion, it doesn't even need to verify batch size.
I think this is a good design to start with, its also a plain OO solution to solve this without extra complexity. We also added for instance an ABAcquisition, which could be used to implement for instance https://arxiv.org/pdf/1704.03651.pdf which isn't a true batch approach but it shares some semantics. Unfortunately, this currently isn't compatible with the aggregation mechanics and MCMC components we have in place currently. I'm still in the process of discussing this with @icouckuy but we should be able to resolve this later on it will however involve some reshuffling but right now I think its fully manageable.
What do you think. You could add SequentialBatchAcquisition, I'll do ParallelBatchAcquisition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I think it's a good plan. I will try to update the PR accordingly for SequentialBatchAcquisition by early next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, to give you an idea: this is where we are heading at the moment:
still discussion ongoing though. Ultimately the idea is that Acquisition/SequentialBatchAcquisition become the same thing (set_batch by default doing nothing, for batch support it is sufficient to override). The design of the UML also takes into account the current capabilities of the framework to add and multiply, use MCMC etc. Your code will be easy to fit in so don't worry about this for now.
Kriging Is Well-Suited to Parallelize Optimization. | ||
""" | ||
|
||
def __init__(self, model, batch_size): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking to derive batch_size in acquisition each time set_batch is called, so we can keep batch_size in BayesianOptimizer. This would also permit changing the size of the batches during the optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
This is a first attempt to partially address #71
Minimal example (modified version of this):
In my setup, this PR and the original
master
code occasionally exhibits: