Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/attachments diff #807

Merged
merged 23 commits into from
Dec 18, 2024
Merged

Conversation

EdoStorm96
Copy link
Contributor

So this is the diff part of incorporating the attachments into the pdf/diff system.

It build upon how the pdf is created, however it gets a lot more complicated with matching attachments together.

I think this solution is ok ... but it is very mad scientist-y ... But it was quite a logical challenge to match attachments to a study.

This PR also assumes some knowledge of how Diffsections work (which is also quite mad scientist-y. A basic rundown:

DiffSections receive two sections, the first for the old_p, the second for the new_p. If either of these objects is missing, it gets replaced with None.

Good luck with this review as well!

@EdoStorm96 EdoStorm96 requested a review from miggol December 9, 2024 11:10
Copy link
Contributor

@miggol miggol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I'm not completely done with this review yet, but I wanted to put down these comments before calling it quits for today.

I wouldn't have attacked the logic in this way, and I'm not sure it's exactly robust, but great work in getting this to work at all.

I think the attachments in the diff are a bit confusing with the different kinds of titles and lack of spacing. Perhaps rather than two titles right after each other we could just have one title containing "Documents for the proposal as a whole", "Documents for trajectory 1" etc...

Also something went wrong with the merge and this branch still references has_legacy_documents which causes an error.

Comment on lines 1008 to 1032
if slot.attachment in self.overlap_atts:
master_dict[relevant_owner].append(
(
self._create_attachment_section(slot),
self._create_attachment_section(slot),
)
)
# if the attachment is unique to the new p, add None as the AttachmentSection for the old_p
if slot in self.uniq_new_slots:
master_dict[relevant_owner].append(
(
None,
self._create_attachment_section(slot),
)
)
# if there is a parent, use the parent as the attachment for the old slot
if slot.attachment.parent:
master_dict[relevant_owner].append(
(
self._create_attachment_section(slot, slot.attachment.parent),
self._create_attachment_section(
slot,
),
)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you should be using elif here. An attachment can both be in the overlap, but also have a parent (from an even earlier revision). In such cases an attachment would be appended twice.

Comment on lines 1042 to 1044
# if the owner is not a proposal, and therefore a study, and there is only one study in the new_p, add it to that study
elif len(new_p_owners) == 2:
relevant_owner = new_p_owners[1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, but why? Doesn't this mean if my old_p had four trajectories, and my new_p only has one, that all previous attachments will be crammed beneath that single new trajectory?

self.uniq_new_slots = [
slot
for slot in self.new_slots
if slot.attachment not in self.overlap_atts and not slot.attachment.parent
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's 100% safe to assume unique new slots don't have a parent. The secretary could have detached an older attachment through the special edit link.

@miggol
Copy link
Contributor

miggol commented Dec 17, 2024

Recursive slot matching algorithm. Less concise than a for loop, but I liked making it. No need to use this, just wanted to share

def get_order(slot):
    if type(slot.attached_object) is Proposal:
        return 0
    return slot.attached_object.order

def insert_into_matches(old_slot, new_slot, matches):
    if new_slot:
        order = get_order(new_slot)
    else:
        order = get_order(old_slot)
    if order not in matches.keys():
        matches[order] = []
    matches[order].append(
        (old_slot, new_slot),
    )

def match_slot(slot, slots):
    if slots == []:
        # If we've run out of slots, no match was found
        return False
    potential_match = slots[0]
    if slot.attachment is potential_match.attachment:
        return potential_match
    if slot.attachment.parent is potential_match.attachment:
        return potential_match
    # Continue with the next potential match
    return match_slot(
        slot, slots[1:]
    )

def get_matches_from_slots(old_slots, new_slots, matches=dict()):
    """
    Tail recursive function that returns a dictionary of integers to tuples
    of (old, new) attachment slot sets, either of which may be None, but not
    both.
    """
    if new_slots == []:
        # If we've run out of new slots, start going through the yet
        # unmatched old slots.
        if old_slots == []:
            # If no old slots remain, return the matches
            return matches
        unmatched_old_slot = old_slots[0]
        # We've already run out of new slots, so these must be
        # unmatched slots and we can just insert them.
        insert_into_matches(
            unmatched_old_slot, None, matches,
        )
        # Continue until old slots run out
        return get_matches_from_slots(
            old_slots[1:], new_slots, matches=matches,
        )
    current_slot = new_slots[0]
    match = match_slot(current_slot, old_slots)
    if match:
        # If we have a match, insert it and remove the matched
        # old slot from the pool
        insert_into_matches(
            match, current_slot,
        )
        old_slots.remove(match)
    # Continue with the next new slot
    return get_matches_from_slots(
        old_slots, new_slots[1:], matches=matches,
    )

@EdoStorm96 EdoStorm96 requested a review from miggol December 17, 2024 14:45
Copy link
Contributor

@miggol miggol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one major flaw in the match function

potential_match = slots[0]
if slot.attachment == potential_match.attachment:
return potential_match
if slot.attachment.parent.get_correct_submodel() == potential_match.attachment:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, if there's no parent then this raises an AttributeError.

Example fix:

    def _match_slot(self, slot, slots):
        """
        Attempts to match an new slot to an old slot by recursively looping
        over the old_slots. It returns a match if the attachment is unchanged
        or if the slot's attachment is a child of an attachment in the old_slots
        """
        if slots == []:
            # If we've run out of slots, no match was found
            return False
        potential_match = slots[0]
        # We match against the exact same attachment or its parent,
        # if it has one
        targets = [slot.attachment]
        if slot.attachment.parent:
            targets.append(slot.attachment.parent.get_correct_submodel())
        if potential_match.attachment in targets:
            # Success, return the match
            return potential_match
        # Else, continue with the next potential match
        return self._match_slot(slot, slots[1:])

unmatched_old_slot = old_slots[0]
# We've already run out of new slots, so these must be
# unmatched slots and we can just insert them.
self._insert_into_matches(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I couldn't decide on whether this should be a stateful function or not. Best practice is to choose either of these two signatures and stick with it:

Stateful:

self._insert_into_matches(matches)

Stateless:

matches = self._insert_into_matches(matches)

But I've allowed for and used both in this case. Please make all occurrences the stateless variant.

return KindRow(self.obj, self.obj.attachment, field)
else:
obj = self.obj.attachment

if field in self.get_row_fields():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to wake sleeping dogs here, but why do we check if the requested field is in self.get_row_fields() here? Is this method called anywhere else than from self.make_rows(), which just loops over those exact fields?

@@ -554,14 +556,22 @@ def get_row_fields(self):
return rows




class AttachmentSection(BaseSection):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think creating a subclass here is a good way of doing things. You want subclasses to be stable at the interface, which is the case here.

This isn't as clean as it could be because of the Row class. BaseSection is nice to subclass because it has a generic interface. But Row is kind of annoying because it's purpose-built for Django models, without a generic interface, which makes you work around it.

So in the future my advice would be to keep the generic interface in mind from the start. So you would have a BaseRow where you define how other classes should talk to it, and then a DjangoModelRow that implements that interface. Even if you only ever use django model rows, your classes still become easier to talk to.

Copy link
Contributor

@miggol miggol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@EdoStorm96 EdoStorm96 merged commit 4bde745 into feat/attachments_pdf Dec 18, 2024
2 of 3 checks passed
@EdoStorm96 EdoStorm96 deleted the feat/attachments_diff branch January 16, 2025 09:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants