Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FC-0004][RG]/edxoldmng 218/Signals to save/delete certificate configuration for a course. #158

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 54 additions & 0 deletions openedx_events/content_authoring/data.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
(see deprecation proposal at https://github.com/openedx/public-engineering/issues/160)
"""
from datetime import datetime
from typing import BinaryIO, List

import attr
from opaque_keys.edx.keys import CourseKey, UsageKey
Expand Down Expand Up @@ -82,3 +83,56 @@ class DuplicatedXBlockData(XBlockData):
"""

source_usage_key = attr.ib(type=UsageKey)


@attr.s(frozen=True)
class CertificateSignatoryData:
"""
Attributes defined for Open edX CertificateSignatory data object.

Arguments:
image (BinaryIO): certificate signature image.
name (str): name of signatory.
organization (str): organization that signatory belongs to.
title (int): signatory title.
"""

# Note: Please take care that the image field is BinaryIO, which means
# that a file can be passed as an array of bytes. Watch the size of this file.
# It can potentially be large, making it difficult to pass this data structure through the Event Bus
# (CloudEvent messages, which should be 64K or less) and store it on disk space.
# We suggest referring to MAX_ASSET_UPLOAD_FILE_SIZE_IN_MB, i.e. restriction in the Studio for such cases.
image = attr.ib(type=BinaryIO)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a URL instead (image_url)? Signals will get translated to CloudEvent messages, which should be 64K or less. There's no guarantee that a professor's signature will fit in that limit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good question and I thought about it.
Why did I do this?
First, this is not the final version yet. What we are doing is an intermediate stage of moving towards where the configuration of certificates, the certificates themselves will be located in Credentials service.
Now the signature files are saved in MongoDB as an asset and passed to Credentials via HTTP. Also, it is passed as an ordinary multipart request where files are sent as bytes. This is the usual approach for sending files. Yes, now we use Event and look to the future, we are thinking about the future architecture, where there is a message Event Bus and events. We are doing our best to make it easy to switch to this, but for now, we are temporarily passing the certificate configuration through the HTTP API. So far, as we agreed before.

For this to be image_url, then, when saving the configuration from Studio UI, you first need to save the file somewhere on a file system or in a cloud and pass it to the Credentials service, and this service itself will also have access to this shared storage. When we do not have Mongo DB for cert configs and signatures, and we keep this in mind, then these certificate configuration files will not be saved there and they need to be passed immediately to Credentials. Then in order for it to be image_url, it must first be saved somewhere in intermediate public storage. But the problem is that I do not yet know what it will be and what architectural solution will be applied. This is not known. Since I wrote at the very beginning that we are at an intermediate stage of transition. 


What you're talking about, where size matters is easy to solve with server-side validation, and avoid where you need to transfer a large file. Or do post-processing of the file before passing this information through the bus. But all this will be decided and a decision will be made on how to do it a little later, at the next stage, so to speak, when more becomes known and when the transfer of the certificate configuration will be completely through EvenBus, and not imitation as we are doing now. That's why for now I propose to leave it as it is, but keep in mind those solutions and problems that I described.

What do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comments had two assumptions:

  1. Backwards compatibility concerns will require us to continue to support whatever fields are added over the long term. In order to not break plugins in the future, openedx-events opts for a pretty strict conventions around schema evolution.
  2. There is already a publicly accessible image URL representing the signature, since we need to display that signature to users–so we wouldn't have to save the file anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dave, I suggest we take a step back if you don't mind. I want us to be on the same page and I'm afraid to take sharp, thoughtless steps.
Since we are talking about implementation, I suggest going to this PR openedx/edx-platform#31446. This is where the implementation is. You will be able to understand what was done and how. We will approve the final decision, and then we will continue our work here.

What do you say?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, let's just see how this goes. You can keep the BinaryIO image, but please log how large they get in the code that constructs these things so that we can examine that data after edx.org runs it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do.

# end Note
name = attr.ib(type=str)
organization = attr.ib(type=str)
title = attr.ib(type=str)


@attr.s(frozen=True)
class CertificateConfigData:
"""
Attributes defined for Open edX CertificateConfig data object.

Arguments:
certificate_type (str): certificate type. Possible types are certificate relevant course modes:
- credit,
- verified,
- professional,
- no-id-professional,
- executive-education,
- paid-executive-education,
- paid-bootcamp,
- masters.
course_key (CourseKey): identifier of the Course object.
title (str): certificate title.
signatories (List[CertificateSignatoryData]): contains a collection of signatures
that belong to the certificate configuration.
is_active (bool): indicates whether the certifivate configuration is active.
"""

certificate_type = attr.ib(type=str)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docstring, please either enumerate the types possible, or describe where the type field data comes from and give examples.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good. I'll do it. Thank you.

course_key = attr.ib(type=CourseKey)
title = attr.ib(type=str)
signatories = attr.ib(type=List[CertificateSignatoryData], factory=list)
is_active = attr.ib(type=bool, default=False)
32 changes: 31 additions & 1 deletion openedx_events/content_authoring/signals.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,12 @@
docs/decisions/0003-events-payload.rst
"""

from openedx_events.content_authoring.data import CourseCatalogData, DuplicatedXBlockData, XBlockData
from openedx_events.content_authoring.data import (
CertificateConfigData,
CourseCatalogData,
DuplicatedXBlockData,
XBlockData,
)
from openedx_events.tooling import OpenEdxPublicSignal

# .. event_type: org.openedx.content_authoring.course.catalog_info.changed.v1
Expand Down Expand Up @@ -62,3 +67,28 @@
"xblock_info": DuplicatedXBlockData,
}
)


# .. event_type: org.openedx.content_authoring.course.certificate_config.changed.v1
# .. event_name: COURSE_CERTIFICATE_CONFIG_CHANGED
# .. event_description: Fired when a course certificate configuration changes in Studio.
# Warning: This event is currently incompatible with the event bus, list/dict cannot be serialized yet
# .. event_data: CertificateConfigData
COURSE_CERTIFICATE_CONFIG_CHANGED = OpenEdxPublicSignal(
event_type="org.openedx.content_authoring.course.certificate_config.changed.v1",
data={
"certificate_config": CertificateConfigData,
}
)

# .. event_type: org.openedx.content_authoring.course.certificate_config.deleted.v1
# .. event_name: COURSE_CERTIFICATE_CONFIG_DELETED
# .. event_description: Fired when a course certificate configuration deletes in Studio.
# Warning: This event is currently incompatible with the event bus, list/dict cannot be serialized yet
# .. event_data: CertificateConfigData
COURSE_CERTIFICATE_CONFIG_DELETED = OpenEdxPublicSignal(
event_type="org.openedx.content_authoring.course.certificate_config.deleted.v1",
data={
"certificate_config": CertificateConfigData,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to pass all this information if it's been deleted? Can we just pass the course_key instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should. Why did I do this? It's all about the low coupling of the two services Studio and Credentials. We have to support it and not depend on each other. How can we efficiently delete resources in general? How do we do it with a regular http request in monolithic systems, i.e. explicitly pass an entity identifier, which is usually the primary key in a database table, or pass some unique field at the database level. Then the removal will be accurate and correct.

It’s not about our case now. Studio does not store the ID and does not know about the model from the Credentials database. It's good. course_key may or may not be a unique field. Studio doesn't know this for sure and shouldn't know. Only Credentials knows this. What can we do then when we keep services loosely coupled? Pass on as much meaningful information as possible and ask for Credentials to be removed. And Credentials itself will decide which record in the table needs to be deleted, since it is an information expert here. Which is what we do.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

course_key may or may not be a unique field.

But there's only ever one credentials configuration for a given course_key, right? When would we ever have more than one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not certainly in that way. Look at this model please https://github.com/openedx/credentials/blob/master/credentials/apps/credentials/models.py#L194-L230
Here course_id is just a CharField and not unique. And here is our constraint unique_together = (("course_id", "certificate_type", "site")),) which explains the behavior.

And in any case, it's not so important. After all, as I wrote above, Studio is not an information expert for Credentials and does not know the exact URI of this resource. It is important for us to maintain low coupling and sending more data will not be superfluous here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I misunderstood in that I thought CertificateConfigData was the representation of all certificates for a given course (i.e. a cleaner version of the entire certificates attribute off of the CourseBlock), rather than being the configuration for one particular certificate type for a course.

}
)
6 changes: 5 additions & 1 deletion openedx_events/event_bus/avro/tests/test_avro.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,11 @@

# If a signal is explicitly not for use with the event bus, add it to this list
# and document why in the event's annotations
KNOWN_UNSERIALIZABLE_SIGNALS = ["org.openedx.learning.discussions.configuration.changed.v1"]
KNOWN_UNSERIALIZABLE_SIGNALS = [
"org.openedx.learning.discussions.configuration.changed.v1",
"org.openedx.content_authoring.course.certificate_config.changed.v1",
"org.openedx.content_authoring.course.certificate_config.deleted.v1",
]


def generate_test_event_data_for_data_type(data_type):
Expand Down