Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update API to deploy separate endpoints by run ID #2

Merged
merged 13 commits into from
May 3, 2024

Conversation

jeancochrane
Copy link
Contributor

@jeancochrane jeancochrane commented Apr 23, 2024

This PR updates api.R to deploy multiple /predict endpoints, one for each of a set of valid run IDs. The new API structure looks like this:

  • POST /predict: Predicts with a default model, configured via the AWS_S3_MODEL_RUN_ID environment variable
  • POST /predict/<run_id>: Predicts with a model determined by the run ID passed to the endpoint

Maintaining a POST /predict endpoint that falls back to a default model should allow us to deploy this change without disrupting the operation of the workbooks that Valuations is currently using for desk review.

Closes #1.

Deployment steps

  • Notify Valuations and IT that we are deploying changes to the API
  • Merge the PR
  • Pull the latest image on the server
  • Make sure the env vars are up to date
    • Rename AWS_S3_MODEL_RUN_ID to AWS_S3_DEFAULT_MODEL_RUN_ID
    • Remove AWS_S3_DVC_BUCKET
    • Remove AWS_S3_MODEL_YEAR
  • Try removing privileged: true
  • Restart the compose process and see if it works
  • Modify and reupload any API workbooks that need to change

@jeancochrane jeancochrane linked an issue Apr 23, 2024 that may be closed by this pull request
api.R Outdated
Comment on lines 14 to 17
} else {
} else if (file.exists("secrets/ENV_FILE")) {
readRenviron("secrets/ENV_FILE")
} else {
readRenviron(".env")
}
readRenviron(".env")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here was a bit confusing for local development, where neither /run/secrets/ENV_FILE nor secrets/ENV_FILE exist. I think the new conditional structure should make local development easier, but let me know if I'm misinterpreting something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is pretty confusing and underdocumented. If I recall, the file secrets/ENV_FILE is mounted to /run/secrets/ENV_FILE when using compose. This file contains AWS creds specific to the API. During development, these creds would be unnecessary as the user would likely have active AWS creds via aws-mfa.

The .env file is separate and not related. It contains the rest of the setup vars used by compose (API_PORT, CCAO_REGISTRY_URL, etc.) and the final model ID and year. In other words, it's just config stuff, not actually secret. This file is necessary to load during development, but NOT when deployed via compose (compose adds all vars in a .env file to the deployed container). So, the logic here makes sense if you remove the change on line 15.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, I refactored in abe5403 to always load this file and only load secrets/ENV_FILE if it exists (similar to /run/secrets/ENV_FILE).

api.R Show resolved Hide resolved
api.R Outdated
api_port <- as.numeric(Sys.getenv("API_PORT", unset = "3636"))
default_run_id_var_name <- "AWS_S3_MODEL_RUN_ID"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to change the name of this env var to something that more clearly marks it as the default (like AWS_S3_DEFAULT_MODEL_RUN_ID), but keeping the same name for now means one fewer thing we have to change during deployment. I'm open to changing it now if you feel strongly about it, however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change it now and make a list of things we actually need to change when redeploying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in abe5403, with an updated list of deploy steps in the PR body.

api.R Outdated
Comment on lines 30 to 31
"2024-02-06-relaxed-tristan",
"2024-03-17-stupefied-maya"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any other run IDs that we should include in this vector for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's actually include the final 2022 and 2023 models. We can just reproduce and replace the old workbooks for those years.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, done in abe5403. Are the old workbooks even still operational, given that the model version in the (currently static) API has changed since 2022 and 2023?

api.R Outdated Show resolved Hide resolved
renv.lock Outdated Show resolved Hide resolved
renv.lock Show resolved Hide resolved
renv.lock Show resolved Hide resolved
renv.lock Show resolved Hide resolved
renv.lock Outdated Show resolved Hide resolved
@jeancochrane jeancochrane marked this pull request as ready for review April 23, 2024 17:54
@jeancochrane jeancochrane requested a review from dfsnow April 23, 2024 17:54
Copy link
Member

@dfsnow dfsnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeancochrane Like the simplicity here, this looks good but for the env management stuff. Let's add an endpoint for all final models we have the necessary artefacts for, that should be 2022 onward.

We'll also need to update the API workbooks to handle the change (pointing to the endpoint with run_id as a param, rather than the default).

api.R Outdated
Comment on lines 14 to 17
} else {
} else if (file.exists("secrets/ENV_FILE")) {
readRenviron("secrets/ENV_FILE")
} else {
readRenviron(".env")
}
readRenviron(".env")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is pretty confusing and underdocumented. If I recall, the file secrets/ENV_FILE is mounted to /run/secrets/ENV_FILE when using compose. This file contains AWS creds specific to the API. During development, these creds would be unnecessary as the user would likely have active AWS creds via aws-mfa.

The .env file is separate and not related. It contains the rest of the setup vars used by compose (API_PORT, CCAO_REGISTRY_URL, etc.) and the final model ID and year. In other words, it's just config stuff, not actually secret. This file is necessary to load during development, but NOT when deployed via compose (compose adds all vars in a .env file to the deployed container). So, the logic here makes sense if you remove the change on line 15.

api.R Show resolved Hide resolved
api.R Outdated
api_port <- as.numeric(Sys.getenv("API_PORT", unset = "3636"))
default_run_id_var_name <- "AWS_S3_MODEL_RUN_ID"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change it now and make a list of things we actually need to change when redeploying.

api.R Outdated
Comment on lines 30 to 31
"2024-02-06-relaxed-tristan",
"2024-03-17-stupefied-maya"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's actually include the final 2022 and 2023 models. We can just reproduce and replace the old workbooks for those years.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember why we're running privileged: true here, but we should try to remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in d8bf46f, let's see what happens!

renv.lock Outdated Show resolved Hide resolved
api.R Outdated Show resolved Hide resolved
api.R Outdated
all_endpoints[[i]] <- list(
path = glue::glue("{base_url_prefix}/{run$run_id}"),
model = model,
is_default = run$run_id == default_run$run_id
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a similar vein, it would make a lot more sense to just append another entry to all_endpoints for the default run rather than have to check the is_default bool in all of the iteration blocks that follow, but I couldn't figure out a good way to do this given the way append operations in R seem to require index references (i.e. the all_endpoints[[i]] assignment on line 167).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use

all_endpoints <- c(all_endpoints, default_endpoint)`

or:

all_endpoints <- append(all_endpoints, default_endpoint)`

If that simplifies things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, that's much clearer! I made this change in 2a770c0. Sadly I needed to upgrade the entire set of dependencies in order to get the environment working with R 4.4.0 and test this, which blew up the diff in bc8c7f5; I'll go in and comment the extra changes I made to make it a bit clearer.

api.R Show resolved Hide resolved
api.R Outdated Show resolved Hide resolved
api.R Outdated Show resolved Hide resolved
generics.R Outdated Show resolved Hide resolved
@jeancochrane jeancochrane requested a review from dfsnow April 25, 2024 22:19
@jeancochrane
Copy link
Contributor Author

Ready for another look @dfsnow!

Copy link
Member

@dfsnow dfsnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @jeancochrane. One potential simplification and then this is set to deploy.

api.R Outdated Show resolved Hide resolved
api.R Outdated
all_endpoints[[i]] <- list(
path = glue::glue("{base_url_prefix}/{run$run_id}"),
model = model,
is_default = run$run_id == default_run$run_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to use

all_endpoints <- c(all_endpoints, default_endpoint)`

or:

all_endpoints <- append(all_endpoints, default_endpoint)`

If that simplifies things.

api.R Outdated Show resolved Hide resolved
Comment on lines +223 to +224
use_path_in_nav_bar = TRUE,
show_method_in_nav_bar = "as-plain-text"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the rapidoc UI layout changed with the version update, so we need these two params in order to properly display all of the available endpoints (docs).

Comment on lines -13 to -15
handler_startup._lgb.Booster <- function(vetiver_model) {
attach_pkgs(vetiver_model$metadata$required_pkgs)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out vetiver_create_description and vetiver_create_meta are in fact necessary for calling vetiver_model on the model, so I added them back in and only removed handler_startup, which we no longer call after moving from vetiver to raw Plumber.

renv.lock Outdated Show resolved Hide resolved
@@ -1,7 +1,7 @@
sandbox/
library/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renv/.gitignore, renv/activate.R, and renv/settings.dcf all got automatically updated as part of the upgrade to renv 1.0.x and R 4.4.x.

@jeancochrane
Copy link
Contributor Author

@dfsnow I think we should get one last look at this, unfortunately I had to make a lot of changes since the last round of review!

@jeancochrane jeancochrane requested a review from dfsnow May 1, 2024 20:33
Copy link
Member

@dfsnow dfsnow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @jeancochrane. I'm excited to see if we can recreate workbooks/create tools to cover all the previous years.

Comment on lines +16 to +17
dvc_bucket_pre_2024 <- "s3://ccao-data-dvc-us-east-1"
dvc_bucket_post_2024 <- "s3://ccao-data-dvc-us-east-1/files/md5"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoa, I didn't realize that DVC had changed the file locations :/

default_run_id <- Sys.getenv(default_run_id_var_name)

# The list of runs that will be deployed as possible model endpoints
valid_runs <- rbind(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (non-blocking): It might be worth just updating the model.final_model dbt seed to capture this list and its attributes. That seed only gets updated at the end of each year, so we could just basically down > up the compose stack to get the new endpoint, rather than modifying the code.

If this sounds like a good idea, let's push it to another separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, issue opened here: ccao-data/data-architecture#428

@jeancochrane jeancochrane merged commit 36b21d4 into master May 3, 2024
2 checks passed
@jeancochrane jeancochrane deleted the jeancochrane/1-fixup-existing-res-api-backend branch May 3, 2024 17:41
dfsnow added a commit to ccao-data/model-res-avm that referenced this pull request May 14, 2024
See ccao-data/api-res-avm#2 for actual API
refactor. Workbooks now select their run ID/target endpoint from cell
B1.
dfsnow added a commit to ccao-data/model-res-avm that referenced this pull request May 14, 2024
See ccao-data/api-res-avm#2 for actual API
refactor. Workbooks now select their run ID/target endpoint from cell
B1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fixup existing res API backend
2 participants