Add jobs that trigger on file upload (#527)

* Refactor service name and bucket name to env-config module
* Add file_upload_jobs config to env-config module
* Update service module to add event rule and target to trigger jobs based on file uploads
* Add role and policy that EventBridge will use to run ECS tasks
* Document background job functionality

Example app changes

* Add etl command to example Flask app
* Add etl job to file_upload_jobs config for platform-test

CI changes

* Bump terraform version in ci-infra-service workflow

Fixes

* Fixed storage access permissions to have kms:Decrypt

Unrelated changes

* Moved feature flags config to separate file in same module

## Context

Many applications need a basic ETL system that ingests files that are uploaded to a bucket. This change adds that functionality.

Co-authored-by: Daphne Gold <[email protected]>
1 parent c803535 · commit 164e7e4 · 23 changed files with 312 additions and 28 deletions.
@@ -0,0 +1,16 @@ (new file: background jobs documentation)

# Background jobs

The application may have background jobs that support the application. Types of background jobs include:

* Jobs that run on a fixed schedule (e.g. every hour or every night). This type of job is useful for ETL jobs that can't be event-driven, such as ETL jobs that ingest source files from an SFTP server or from an S3 bucket managed by another team that we have little control or influence over. **This functionality has not yet been implemented.**
* Jobs that trigger on an event (e.g. when a file is uploaded to the document storage service). This type of job can be processed by two types of tasks:
  * Tasks that spin up on demand to process the job. This type of task is appropriate for low-frequency ETL jobs. **This is currently the only type that's supported.**
  * Worker tasks that run continuously, waiting for jobs to enter a queue that the worker then processes. This type of task is ideal for high-frequency, low-latency jobs such as processing user uploads or submitting claims to an unreliable or high-latency legacy system. **This functionality has not yet been implemented.**

## Job configuration

Background jobs for the application are configured via the application's `env-config` module. The current infrastructure supports jobs that spin up on-demand tasks when a file is uploaded to the document storage service. These are configured in the `file_upload_jobs` configuration.

## How it works

File upload jobs use AWS EventBridge to listen for "Object Created" events when files are uploaded to S3. An event rule is created for each job configuration, and each event rule has a single event target that targets the application's ECS cluster. The task uses the same container image as the service, and the task's configuration matches the service's configuration except for the entrypoint, which is specified by the job configuration's `task_command` setting. The command can reference the bucket and path of the file that triggered the event using the template values `<bucket_name>` and `<object_key>`.
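As a hypothetical sketch of the task side (not the actual example app, which wires this up as a Flask CLI command), an entrypoint receiving the substituted `<object_key>` value might look like the following; the function name and output format are illustrative assumptions:

```python
import sys
from pathlib import PurePosixPath


def etl(object_key: str) -> str:
    """Hypothetical ETL entrypoint.

    The service module substitutes the key of the uploaded S3 object for
    <object_key> in task_command, so the task receives it as a plain CLI
    argument, e.g. "etl/input/2024-01-01-data.csv".
    """
    filename = PurePosixPath(object_key).name
    # Real processing (download, transform, load) would happen here.
    return f"processed {filename}"


if __name__ == "__main__":
    print(etl(sys.argv[1]))
```

Because the command is an ordinary container entrypoint, the same image can serve both the web service and the job task.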
@@ -0,0 +1,14 @@ (new file)

```terraform
locals {
  # Configuration for default jobs to run in every environment.
  # See the description of the `file_upload_jobs` variable in the service module
  # (infra/modules/service/variables.tf) for the structure of this configuration
  # object. One difference is that `source_bucket` is optional here. If
  # `source_bucket` is not specified, the source bucket will be set to the
  # storage bucket's name.
  file_upload_jobs = {
    # Example job configuration
    # etl = {
    #   path_prefix  = "etl/input",
    #   task_command = ["python", "-m", "flask", "--app", "app.py", "etl", "<object_key>"]
    # }
  }
}
```
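One way the optional `source_bucket` default could be applied is sketched below; this is an illustration of the described behavior, not the actual module code, and `local.bucket_name` is assumed to hold the storage bucket's name:

```terraform
locals {
  # For each configured job, fill in source_bucket with the storage bucket's
  # name when the job config does not specify one.
  file_upload_jobs_with_defaults = {
    for job_name, job_config in local.file_upload_jobs :
    job_name => merge(job_config, {
      source_bucket = coalesce(lookup(job_config, "source_bucket", null), local.bucket_name)
    })
  }
}
```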
@@ -0,0 +1,8 @@ (new file)

```terraform
locals {
  # The prefix is used to support Terraform workspaces, which is useful for
  # projects with multiple infrastructure developers. By default, Terraform
  # uses a workspace named "default", in which case the prefix is empty;
  # in any other workspace the prefix is the workspace name plus a hyphen.
  prefix = terraform.workspace == "default" ? "" : "${terraform.workspace}-"

  # Include the project name in the bucket name since bucket names must be
  # globally unique across AWS.
  bucket_name = "${local.prefix}${var.project_name}-${var.app_name}-${var.environment}"
}
```
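The resulting naming convention can be illustrated with a small sketch (Python is used only for illustration; the real logic lives in the Terraform locals above, and the project/app/environment names are hypothetical):

```python
def bucket_name(workspace: str, project: str, app: str, environment: str) -> str:
    # Mirrors the Terraform locals: non-default workspaces get a
    # "<workspace>-" prefix so per-developer resources don't collide.
    prefix = "" if workspace == "default" else f"{workspace}-"
    return f"{prefix}{project}-{app}-{environment}"


# Default workspace: no prefix
print(bucket_name("default", "myproj", "myapp", "dev"))   # myproj-myapp-dev
# Temporary workspace (e.g. CI): workspace name is prepended
print(bucket_name("t-pr-527", "myproj", "myapp", "dev"))  # t-pr-527-myproj-myapp-dev
```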
@@ -1,3 +1,7 @@

```diff
+variable "project_name" {
+  type = string
+}
+
 variable "app_name" {
   type = string
 }
```
@@ -0,0 +1,3 @@ (new file)

```terraform
locals {
  feature_flags = ["foo", "bar"]
}
```
```diff
@@ -22,27 +22,18 @@ data "aws_subnets" "private" {
 }

 locals {
-  # The prefix key/value pair is used for Terraform Workspaces, which is useful for projects with multiple infrastructure developers.
-  # By default, Terraform creates a workspace named "default." If a non-default workspace is not created this prefix will equal "default",
-  # if you choose not to use workspaces set this value to "dev"
-  prefix = terraform.workspace == "default" ? "" : "${terraform.workspace}-"
-
   # Add environment specific tags
   tags = merge(module.project_config.default_tags, {
     environment = var.environment_name
     description = "Application resources created in ${var.environment_name} environment"
   })

-  service_name = "${local.prefix}${module.app_config.app_name}-${var.environment_name}"
-
   is_temporary = startswith(terraform.workspace, "t-")

-  # Include project name in bucket name since buckets need to be globally unique across AWS
-  bucket_name = "${local.prefix}${module.project_config.project_name}-${module.app_config.app_name}-${var.environment_name}"
-
   environment_config = module.app_config.environment_configs[var.environment_name]
   service_config     = local.environment_config.service_config
   database_config    = local.environment_config.database_config
   storage_config     = local.environment_config.storage_config
   incident_management_service_integration_config = local.environment_config.incident_management_service_integration
 }
```

```diff
@@ -112,7 +103,7 @@ data "aws_security_groups" "aws_services" {
 module "service" {
   source = "../../modules/service"
-  service_name = local.service_name
+  service_name = local.service_config.service_name
   image_repository_name = module.app_config.image_repository_name
   image_tag = local.image_tag
   vpc_id = data.aws_vpc.network.id
```

```diff
@@ -125,7 +116,7 @@ module "service" {
   aws_services_security_group_id = data.aws_security_groups.aws_services.ids[0]

-  is_temporary = local.is_temporary
+  file_upload_jobs = local.service_config.file_upload_jobs

   db_vars = module.app_config.has_database ? {
     security_group_ids = data.aws_rds_cluster.db_cluster[0].vpc_security_group_ids
```

```diff
@@ -142,12 +133,14 @@ module "service" {
   extra_environment_variables = [
     { name : "FEATURE_FLAGS_PROJECT", value : module.feature_flags.evidently_project_name },
-    { name : "BUCKET_NAME", value : local.bucket_name }
+    { name : "BUCKET_NAME", value : local.storage_config.bucket_name }
   ]
   extra_policies = {
     feature_flags_access = module.feature_flags.access_policy_arn,
     storage_access       = module.storage.access_policy_arn
   }
+
+  is_temporary = local.is_temporary
 }

 module "monitoring" {
```

```diff
@@ -156,18 +149,18 @@ module "monitoring" {
   #email_alerts_subscription_list = ["[email protected]", "[email protected]"]

   # Module takes service and ALB names to link all alerts with corresponding targets
-  service_name = local.service_name
+  service_name = local.service_config.service_name
   load_balancer_arn_suffix = module.service.load_balancer_arn_suffix
   incident_management_service_integration_url = module.app_config.has_incident_management_service ? data.aws_ssm_parameter.incident_management_service_integration_url[0].value : null
 }

 module "feature_flags" {
   source = "../../modules/feature-flags"
-  service_name = local.service_name
+  service_name = local.service_config.service_name
   feature_flags = module.app_config.feature_flags
 }

 module "storage" {
   source = "../../modules/storage"
-  name = local.bucket_name
+  name = local.storage_config.bucket_name
 }
```
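The service module passes the bucket name to the running task through the `BUCKET_NAME` environment variable (and the feature flags project through `FEATURE_FLAGS_PROJECT`). A hypothetical sketch of how application code might read these settings follows; the function name and fallback values are assumptions for running outside the deployed environment:

```python
import os


def get_storage_settings() -> dict:
    # BUCKET_NAME and FEATURE_FLAGS_PROJECT are injected by the service
    # module's extra_environment_variables; the fallbacks are only for
    # local development and tests.
    return {
        "bucket_name": os.environ.get("BUCKET_NAME", "local-bucket"),
        "feature_flags_project": os.environ.get("FEATURE_FLAGS_PROJECT", "local"),
    }
```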