Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task service consolidation + S3/SES #74

Merged
merged 5 commits into from
Jul 13, 2017

Conversation

dcgoss
Copy link
Member

@dcgoss dcgoss commented Jul 12, 2017

Motivation

This PR merges the relevant functionality from task-service into core-service, leaving Cognoma with one primary backend server: core-service (resolves #73, #30). Much of the code in this PR is ported from task-service.

core-service must fit in with Cognoma's data processing pipeline, specifically the ml-workers repo. This necessitates:

  • Providing queue functionality so that ml-workers can pull classifier tasks off the queue.
  • Endpoints for ml-worker to update the status of a classifier, ex. "failed". Also an endpoint for releasing the classifier if ml-worker cannot finish processing it for whatever reason.
  • Storing notebook files on classifier objects and providing an endpoint where a worker can upload a completed notebook to be attached to a classifier. This should also update the classifier status to "completed".

Changes

  • Lots of new columns on Classifier objects, ported over from the Task and TaskDef objects from task-service. See the updates to api.md for more info.
  • Storing files (including notebooks) in S3 (resolves Serving Static Files #69)
  • Sending SES email updates to users when their classifier finishes processing (resolves Sending email after notebook upload to classifier #71)
  • POST /classifiers/id/upload enables ml-worker to upload completed notebooks. API looks for the notebook in the FILES section of the request under the key notebook_file (resolves Upload endpoint to store completed notebook on classifier object #70). This sets the classifier status to "completed".
  • POST /classifiers/id/fail enables ml-worker to update the classifier status to "failed" when processing fails.
  • POST /classifier/id/release enables ml-worker to put a classifier back on the queue (sets status to "queued") if it cannot finish processing for whatever reason.

Functional Tests

  • Queue tests: pulling from queue, queue ordering, pull limit parameter
  • Updating classifier status (imitating an ml-worker)
  • POST /classifiers/id/upload both as a non internal service and as internal service (to verify that only an internal service can perform this action)

Credits

This PR expands upon the work from:

dcgoss added 4 commits July 10, 2017 10:29
* Testing CircleCI

* Testing new AWS IAM credentials/permissions

* Increased deployment timeout threshold (cognoma#64)

* Increased ecs_deploy timeout threshold to 180 seconds

The most recent deploy took about 120 seconds, but the current
threshold for the ecs_deploy script timeout is 90 seconds. This causes
builds on CircleCI to fail with the red X - not a good look. 180
seconds plays it on the safe side without being too long.

* Create task after classifer

* Task creation working

* Expand task on get

* Expand on classifier post

* Starting task creation tests

* Expanding task(s)

* Fixed tests

The unique ID of a task was causing problems in the testing environment
when integrating with a task-service container. This commit resolves
that problem by detecting if tests are running and generating a random
unique ID if that is the case.

* Forgot to import Gene

* Fixed task-def creation request data

In accordance with commits made in the task-service repository

* Added endpoint for completed notebook upload to classifier (for ml-worker)

When ml-worker completes running a notebook, it needs to upload the
completed notebook to core-service so that core-service can send an
email to the user with a link to download their completed notebook.

This commit enabled that functionality by adding:
- New authentication permission designed to only allow an internal
service to upload a notebook
- notebook_file attribute to Classifier model and serializer
- rudimentary file storage logic (stores files locally under
/media_files/notebook/classifier_<id>.ipynb, no S3 integration yet)
- Tests for notebook uploads, which include uploading a real notebook
- Whenever a test is ran that creates a file, you are always left with
the directory still on your filesystem after the test. I added a test
runner file which will delete the media_files directory after testing

* Make classifiers write-once only

For now, we will assume classifiers cannot be updated.
* Added sending email upon notebook upload

* Added S3 integration with django-storages

Followed this guide:
https://www.caktusgroup.com/blog/2014/11/10/Using-Amazon-S3-to-store-you
r-Django-sites-static-and-media-files/

* Fixed isses with sending email
All of the task-service functionality is now ported over into
core-service, including queueing, serialization, views, etc. All of the
relevant columns that used to be stored on Task and TaskDef objects in
task-service are now stored directly on the classifier in core-service.
This greatly simplifies Cognoma’s overall architecture and codebase.
@dcgoss dcgoss changed the title Task service consolidation Task service consolidation + S3/SES Jul 12, 2017
dcgoss added a commit to dcgoss/ml-workers that referenced this pull request Jul 12, 2017
PR cognoma/core-service#74 will consolidate
task-service and core-service, leaving just core-service. That PR also
changes terminology from “task” to “classifier”, as the classifier
object will store all of the relevant Task & TaskDef columns.

So, this commit removes interfaces to task-service and replaces them
with interfaces to core-service.
@dcgoss
Copy link
Member Author

dcgoss commented Jul 12, 2017

Hey @awm33 not sure if you have time to review this but since you're basically the father of these repos figured I'd ask

@dcgoss
Copy link
Member Author

dcgoss commented Jul 13, 2017

Since this PR is blocking progress and Cognoma isn't live yet, we are going to go ahead with merging this PR and reviewing while the rest of the stack is being deployed. Dongbo did a preliminary check and we will go into deeper review later.

@dcgoss dcgoss merged commit 3fe8eb8 into cognoma:master Jul 13, 2017
@dcgoss dcgoss deleted the task-service-consolidation branch July 13, 2017 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant