-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model 7: Add unincorporate_reference method. #204
Model 7: Add unincorporate_reference method. #204
Conversation
@@ -132,7 +132,6 @@ class CleanRelation : public Relation<T> { | |||
// It's safe to unincorporate this element since no other data point | |||
// refers to it. | |||
data_r.at(name).erase(items[i]); | |||
domains[i]->unincorporate(items[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this is a bug, since domains can be shared across relations (so just because the domain doesn't take a given value in this relation doesn't mean it can't in other relations). (We had resurrected the unincorporate
code from an obsolete commented-out version.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Do we want to add a TODO here or create an issue to come up with some sort of scheme for garbage collecting domains when they become completely unused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The garbage collection is now done in irm.incorporate. I added a comment to Relation::unincorporate
to explain.
I disabled pclean_lib_test because there's a bug where make_pclean_samples isn't sampling unique entity vectors (and incorporating the same set of entity vectors twice should raise an assertion -- it wasn't doing that earlier, and this PR fixes that). I confirmed that it isn't sampling unique entity vectors at head. |
@@ -132,7 +132,6 @@ class CleanRelation : public Relation<T> { | |||
// It's safe to unincorporate this element since no other data point | |||
// refers to it. | |||
data_r.at(name).erase(items[i]); | |||
domains[i]->unincorporate(items[i]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Do we want to add a TODO here or create an issue to come up with some sort of scheme for garbage collecting domains when they become completely unused?
cxx/gendb.cc
Outdated
@@ -165,4 +165,128 @@ T_items GenDB::sample_class_ancestors(std::mt19937* prng, | |||
return items; | |||
} | |||
|
|||
// This function walks the tree of reference indices to populate the "items" | |||
// vector. It should be called with ind = items.size() - 1 and class_item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have ind as a parameter then? Or do you mean ind <= items.size() - 1 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the top-level call (updated the comment).
|
||
// If this relation doesn't contain the relevant class/reference field, skip | ||
// it. | ||
if (domain_inds.size() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there non-error reasons why this can happen? If not, should we print a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes -- I expanded the comment.
@@ -25,14 +26,36 @@ void PCleanSchemaHelper::compute_domains_for(const std::string& name) { | |||
std::vector<std::string> annotated_ds; | |||
PCleanClass c = schema.classes[name]; | |||
|
|||
for (const auto& v: c.vars) { | |||
// Recursively maps the indices of class "name" (and ancestors) in relation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can all of this new code go into a new method, compute_ref_indices() ? It would be great if compute_domains_cache() could stick to its original mission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (not sure which I prefer here, combining them saved some duplicate code and ensured that the domains cache was populated first, which the caller now has to ensure).
No description provided.