-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/attachments diff #807
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I'm not completely done with this review yet, but I wanted to put down these comments before calling it quits for today.
I wouldn't have attacked the logic in this way, and I'm not sure it's exactly robust, but great work in getting this to work at all.
I think the attachments in the diff are a bit confusing with the different kinds of titles and lack of spacing. Perhaps rather than two titles right after each other we could just have one title containing "Documents for the proposal as a whole", "Documents for trajectory 1" etc...
Also something went wrong with the merge and this branch still references has_legacy_documents
which causes an error.
proposals/utils/pdf_diff_sections.py
Outdated
if slot.attachment in self.overlap_atts: | ||
master_dict[relevant_owner].append( | ||
( | ||
self._create_attachment_section(slot), | ||
self._create_attachment_section(slot), | ||
) | ||
) | ||
# if the attachment is unique to the new p, add None as the AttachmentSection for the old_p | ||
if slot in self.uniq_new_slots: | ||
master_dict[relevant_owner].append( | ||
( | ||
None, | ||
self._create_attachment_section(slot), | ||
) | ||
) | ||
# if there is a parent, use the parent as the attachment for the old slot | ||
if slot.attachment.parent: | ||
master_dict[relevant_owner].append( | ||
( | ||
self._create_attachment_section(slot, slot.attachment.parent), | ||
self._create_attachment_section( | ||
slot, | ||
), | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you should be using elif
here. An attachment can both be in the overlap, but also have a parent (from an even earlier revision). In such cases an attachment would be appended twice.
proposals/utils/pdf_diff_sections.py
Outdated
# if the owner is not a proposal, and therefore a study, and there is only one study in the new_p, add it to that study | ||
elif len(new_p_owners) == 2: | ||
relevant_owner = new_p_owners[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, but why? Doesn't this mean if my old_p
had four trajectories, and my new_p
only has one, that all previous attachments will be crammed beneath that single new trajectory?
proposals/utils/pdf_diff_sections.py
Outdated
self.uniq_new_slots = [ | ||
slot | ||
for slot in self.new_slots | ||
if slot.attachment not in self.overlap_atts and not slot.attachment.parent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's 100% safe to assume unique new slots don't have a parent. The secretary could have detached an older attachment through the special edit link.
Recursive slot matching algorithm. Less concise than a for loop, but I liked making it. No need to use this, just wanted to share def get_order(slot):
if type(slot.attached_object) is Proposal:
return 0
return slot.attached_object.order
def insert_into_matches(old_slot, new_slot, matches):
if new_slot:
order = get_order(new_slot)
else:
order = get_order(old_slot)
if order not in matches.keys():
matches[order] = []
matches[order].append(
(old_slot, new_slot),
)
def match_slot(slot, slots):
if slots == []:
# If we've run out of slots, no match was found
return False
potential_match = slots[0]
if slot.attachment is potential_match.attachment:
return potential_match
if slot.attachment.parent is potential_match.attachment:
return potential_match
# Continue with the next potential match
return match_slot(
slot, slots[1:]
)
def get_matches_from_slots(old_slots, new_slots, matches=dict()):
"""
Tail recursive function that returns a dictionary of integers to tuples
of (old, new) attachment slot sets, either of which may be None, but not
both.
"""
if new_slots == []:
# If we've run out of new slots, start going through the yet
# unmatched old slots.
if old_slots == []:
# If no old slots remain, return the matches
return matches
unmatched_old_slot = old_slots[0]
# We've already run out of new slots, so these must be
# unmatched slots and we can just insert them.
insert_into_matches(
unmatched_old_slot, None, matches,
)
# Continue until old slots run out
return get_matches_from_slots(
old_slots[1:], new_slots, matches=matches,
)
current_slot = new_slots[0]
match = match_slot(current_slot, old_slots)
if match:
# If we have a match, insert it and remove the matched
# old slot from the pool
insert_into_matches(
match, current_slot,
)
old_slots.remove(match)
# Continue with the next new slot
return get_matches_from_slots(
old_slots, new_slots[1:], matches=matches,
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one major flaw in the match function
proposals/utils/pdf_diff_sections.py
Outdated
potential_match = slots[0] | ||
if slot.attachment == potential_match.attachment: | ||
return potential_match | ||
if slot.attachment.parent.get_correct_submodel() == potential_match.attachment: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, if there's no parent then this raises an AttributeError
.
Example fix:
def _match_slot(self, slot, slots):
"""
Attempts to match an new slot to an old slot by recursively looping
over the old_slots. It returns a match if the attachment is unchanged
or if the slot's attachment is a child of an attachment in the old_slots
"""
if slots == []:
# If we've run out of slots, no match was found
return False
potential_match = slots[0]
# We match against the exact same attachment or its parent,
# if it has one
targets = [slot.attachment]
if slot.attachment.parent:
targets.append(slot.attachment.parent.get_correct_submodel())
if potential_match.attachment in targets:
# Success, return the match
return potential_match
# Else, continue with the next potential match
return self._match_slot(slot, slots[1:])
unmatched_old_slot = old_slots[0] | ||
# We've already run out of new slots, so these must be | ||
# unmatched slots and we can just insert them. | ||
self._insert_into_matches( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently I couldn't decide on whether this should be a stateful function or not. Best practice is to choose either of these two signatures and stick with it:
Stateful:
self._insert_into_matches(matches)
Stateless:
matches = self._insert_into_matches(matches)
But I've allowed for and used both in this case. Please make all occurrences the stateless variant.
proposals/utils/pdf_diff_sections.py
Outdated
return KindRow(self.obj, self.obj.attachment, field) | ||
else: | ||
obj = self.obj.attachment | ||
|
||
if field in self.get_row_fields(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to wake sleeping dogs here, but why do we check if the requested field is in self.get_row_fields()
here? Is this method called anywhere else than from self.make_rows()
, which just loops over those exact fields?
@@ -554,14 +556,22 @@ def get_row_fields(self): | |||
return rows | |||
|
|||
|
|||
|
|||
|
|||
class AttachmentSection(BaseSection): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think creating a subclass here is a good way of doing things. You want subclasses to be stable at the interface, which is the case here.
This isn't as clean as it could be because of the Row
class. BaseSection
is nice to subclass because it has a generic interface. But Row
is kind of annoying because it's purpose-built for Django models, without a generic interface, which makes you work around it.
So in the future my advice would be to keep the generic interface in mind from the start. So you would have a BaseRow
where you define how other classes should talk to it, and then a DjangoModelRow
that implements that interface. Even if you only ever use django model rows, your classes still become easier to talk to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
…opment/ethics into feat/attachments_diff
So this is the diff part of incorporating the attachments into the pdf/diff system.
It build upon how the pdf is created, however it gets a lot more complicated with matching attachments together.
I think this solution is ok ... but it is very mad scientist-y ... But it was quite a logical challenge to match attachments to a study.
This PR also assumes some knowledge of how Diffsections work (which is also quite mad scientist-y. A basic rundown:
DiffSections receive two sections, the first for the old_p, the second for the new_p. If either of these objects is missing, it gets replaced with None.
Good luck with this review as well!