Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require a channel ID before sushi chef can run #203

Closed
kollivier opened this issue Mar 19, 2019 · 7 comments
Closed

Require a channel ID before sushi chef can run #203

kollivier opened this issue Mar 19, 2019 · 7 comments
Assignees
Milestone

Comments

@kollivier
Copy link
Contributor

kollivier commented Mar 19, 2019

  • ricecooker version:
  • Python version:
  • Operating System:

Description

The generation of channel IDs from channel information can lead to issues when that information changes, or when two channels have the same source fields but are different. (e.g. cheffer A doesn't know about chef B's source info and uses the same thinking it is not taken.) When this happens, it leads to weird errors like a new chef run creating a whole new channel, or permissions errors when trying to create a channel. For an example, see:

https://sentry.io/organizations/learningequality/issues/943392969/?environment=master&project=1252819&referrer=alert_email&statsPeriod=14d

An alternative that I think will fix this is to, when creating a channel, always generate a fully random UUID, and the chef cannot run until it is given this UUID. The UUID should be committed in the chef's code so that all chef runs will use it. In the case that a chef with a blank ID is run, it will spit out a UUID and ask the user to paste that into their chef code.

This ensures that no two chefs can ever try to update each other accidentally. We can keep source ID and source domain for now, but I would recommend a plan to migrate to always using the UUID as the source of truth for pointing to the channel.

@kollivier kollivier self-assigned this Mar 19, 2019
@kollivier
Copy link
Contributor Author

@ivanistheone Thoughts on this proposal?

@ivanistheone
Copy link
Contributor

I don't have anything against this proposal and I don't think anything will break if we switch to truly random UUIDs (e.g. uuid.uuid4), but at the same time I don't see much benefits either as both the unintentional-source_id reuse and source_id changing are pretty rare cases.

@jayoshih
Copy link
Contributor

If the main use case is to avoid people overwriting each other's channels, there are two potential things that are being discussed on Studio that might help with this problem:

  1. Have separate roles for "owning" a channel vs. just an editor. Right now, ricecooker only checks if the chef token is one of the channel's editor's. We could be stricter on how we select channel "owners"
  2. Match chef run authentication details as suggested here. However, transferring chef access might be a little trickier than the first potential solution

@kollivier
Copy link
Contributor Author

Overwriting isn't my concern here as permissions protects against that, and it's clear what to do when we want to add a new editor to a channel. I'm concerned more that this leads to unexpected behavior / results when cheffing.

A recent example of this was where a chef had an incorrect source_id value put in programmatically, and correcting it created a whole new channel. The fact that a new channel got created wasn't immediately realized, however, and QA happened on the old channel as a result. Fortunately the two channels were very similar and could easily be switched, but if they weren't (e.g. a major update), it could have led to hours of wasted work and probably some not-too-happy QA folk as we would need to re-QA the entire channel.

I think part of the reason I'm more concerned about this is that sushi cheffing code is probably the code most likely to be used by numerous third-parties, some with limited coding experience, so I worry more about expected behavior and predictability among the code. The source_id + source_domain = UUID approach breaks a couple very common assumptions that coders are used to:

  1. Once an ID for an object (Channel) is created, it will not change and is unique among the data set.
  2. Changing object properties won't alter its ID.

We know source_id + source_domain is special and ties into ID creation, so this is no big deal for us. For people not aware of that, the behavior is both surprising and unexpected. They may accidentally create a second channel and not even realize it, and spend hours trying to figure out why their changes aren't showing up on their channel. (And may even file a ticket about it, that doesn't mention that they changed source_id or domain as they don't see the relevance.) I know we can and do document these things, but people don't always read docs (or forget what they read if it didn't seem relevant at the time) and I think it's always best to match common best practices whenever possible to avoid surprises.

Also, we offer no tools for a developer to check what source_id + source_domain combos are already in use, nor how to generate sufficiently unique ids. Consider, particularly, that we have a lot of YouTube chefs so 'youtube.com' may (legitimately) be the source_domain for a lot of sources going forward, and if cheffers create IDs based on non-unique values like playlist name or the desired output channel (e.g. "grade12_math"), I think the chances of conflicts will increase as the number of chefs increase. If you try to create a chef and get back the error 'you do not have permissions to edit this channel' (there are Sentry reports of this with people try to run tutorial code), I think you will be legitimately confused.

Having written all this :), I want to be clear that I don't see this as a pressing or urgent matter, but I do feel it is the sort of problem that, if not addressed, could lead to a lot of wasted time and effort down the road, so I wanted to put it on the agenda.

@kollivier
Copy link
Contributor Author

Overwriting isn't my concern here as permissions protects against that, and it's clear what to do when we want to add a new editor to a channel.

Actually, I think this statement is incorrect. Permissions reduces the likelihood of overwriting, but doesn't prevent it from happening. It can still happen in the case where you have editing permissions for both channels, so that is one more case that would be addressed by having truly unique IDs.

@kollivier kollivier added this to the 0.7 milestone Dec 26, 2019
@kollivier
Copy link
Contributor Author

kollivier commented Dec 26, 2019

I'm going to prioritize this for 0.7 as I just accidentally created a new channel while updating the Blockly Games URL domain as it moved. (The old domain still had the content on it, but not sure how long that will last.) This is another confusing case as if content moves to a different domain and you update the domain, you must keep the content's domain to the original value to keep the same channel ID.

@rtibbles
Copy link
Member

Closing in favour of #323

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants