- Table of Contents
- About
- Prerequisites
- Module Inputs
- Resources
- How to Run
- Post-Run Steps
- Known Issues
This directory contains a Terraform module that deploys the infrastructure resources necessary to run the EPA's Drupal 8 WebCMS. Due to security constraints across various AWS environments, this module assumes it will be provided certain prerequisites before it can be deployed (see the next section).
This module assumes it will be deployed into a pre-existing VPC and will be the only module deployed into that VPC. The VPC must have, at a minimum, two types of subnets (public and private), and security groups for each of the deployed resources. We maintain a reference module that shows an example set up, as well as the outputs it is expected to create (see Parameter Store under "Module Inputs" below).
This module does not automatically provision TLS certificates. Since they are required for TLS traffic with the load balancer, these must be created beforehand and made accessible in either AWS Certificate Manager (ACM) or AWS IAM. Record the ARN(s) of any certificates to associate with the load balancer and provide them to the module.
See the parent directory's README for instructions on using a backend for remote state and locking.
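For orientation, a remote state backend with locking typically looks something like the following sketch. The bucket, key, and table names are placeholders; defer to the parent README for the actual configuration.

```hcl
# Placeholder names only; see the parent directory's README for the real
# backend configuration used by this project.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "webcms/infrastructure.tfstate"
    region         = "us-east-1"
    dynamodb_table = "example-terraform-locks"
    encrypt        = true
  }
}
```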
This module expects inputs from two sources: Terraform variables and Parameter Store. The values in Parameter Store are not expected to change often (if at all), and are kept there (instead of in variables) so that they can be generated by build automation rather than having to be manually curated in a `.tfvars` file or script. An example `.tfvars` file appears after the variable list below.
- Provider variables
  - `aws_region`: Tells the provider which region this module is being deployed in.
- Naming variables
  - `iam_prefix`: The name prefix of all IAM resources. This defaults to the string `"WebCMS"` but can be changed if a different convention is desired (or required).
- Module variables
  - `environment`: The name of the environment this module covers. The word "environment" here means something like pre-production or production. A single environment may hold multiple sites (indeed, it will always have at least two: one each for English and Spanish).
  - `sites`: A `list(string)` of sites this environment will be running. This is most likely going to be `["dev", "stage"]` for a pre-production environment, and `["prod"]` for a production environment.
  - `tags`: A `map(string)` of tags to apply to all AWS resources created by this module.
- Load balancer variables
  - `lb_default_certificate`: The ARN of a certificate (in IAM or ACM) to attach to the load balancer.
  - `lb_extra_certificates`: An optional list of additional certificate ARNs to attach to the load balancer. This supports adding extra TLS authentication while not disrupting requests being sent to the NLB.
  - `lb_internal`: Whether or not this environment's NLB is internal. This should be `true` in ordinary AWS deployments, but can be `false` if there is some other public ingress to the WebCMS.
- RDS & Aurora variables
db_instance_type
: A supported instance type for the Aurora reader and writer instances.db_instance_count
: Number of reader/writer instances. In a production environment, this should be greater than one in order to support fast failover.regional_cluster_endpoint
: The regional Aurora endpoint that the application will point to
- Elasticsearch variables
  - `search_instance_type`: The instance type of the Elasticsearch data nodes.
  - `search_instance_count`: The number of data nodes to deploy into the Elasticsearch cluster.
  - `search_instance_storage`: Since storage in Amazon's Elasticsearch implementation does not autoscale, the value must be determined in advance. See Sizing Amazon ES Domains in the AWS documentation for more info on storage requirements and limits.
  - `search_dedicated_node_type`: The instance type to use for dedicated master nodes. If this feature is not being used (i.e., the count of nodes will be zero), set this to the empty string.
  - `search_dedicated_node_count`: The number of dedicated master nodes to deploy into this cluster. The AWS documentation recommends this for improved stability; it is likely to be unnecessary in a pre-production environment.
  - `search_availability_zones`: Set this to a value above 1 to enable Multi-AZ support for the Amazon ES domain.
- ElastiCache variables
  - `cache_instance_type`: The instance type of the ElastiCache memcached nodes to deploy. See the list of supported node types and the node size guide in the AWS docs.
  - `cache_instance_count`: The number of memcached nodes to deploy. In a production environment, this should be greater than 1 in order to improve the cache's resiliency in the case of outages. This value may be freely changed without reconfiguring the WebCMS task; we use ElastiCache's auto-discovery feature.
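As a quick reference, a hypothetical `terraform.tfvars` for a pre-production environment might look like the sketch below. Every value (instance types, counts, the certificate ARN, and the endpoint) is an illustrative placeholder, and the exact types accepted are defined by the module's variable declarations.

```hcl
# Illustrative placeholder values only; adjust for your environment.
aws_region  = "us-east-1"
iam_prefix  = "WebCMS"
environment = "preproduction"
sites       = ["dev", "stage"]
tags        = { "Project" = "WebCMS" }

lb_default_certificate = "arn:aws:acm:us-east-1:123456789012:certificate/example"
lb_extra_certificates  = []
lb_internal            = true

db_instance_type          = "db.r5.large"
db_instance_count         = 2
regional_cluster_endpoint = "example.cluster-abc123.us-east-1.rds.amazonaws.com"

search_instance_type        = "r5.large.elasticsearch"
search_instance_count       = 2
search_instance_storage     = 100
search_dedicated_node_type  = ""
search_dedicated_node_count = 0
search_availability_zones   = 1

cache_instance_type  = "cache.t3.medium"
cache_instance_count = 2
```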
Parameters are namespaced under `/webcms/${var.environment}/`; we make use of Parameter Store's hierarchy in order to make it easier to provide least-privilege access to build automation (an IAM policy can restrict `ssm:GetParameter` to a wildcard).
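As an example of such a policy, the following sketch grants read access to only this environment's parameters. The region and account ID are placeholders, and the policy document name is hypothetical.

```hcl
# Sketch of a least-privilege read policy for one environment's parameters.
# The region and account ID below are placeholders.
data "aws_iam_policy_document" "parameter_read" {
  statement {
    actions = [
      "ssm:GetParameter",
      "ssm:GetParameters",
      "ssm:GetParametersByPath",
    ]

    resources = [
      "arn:aws:ssm:us-east-1:123456789012:parameter/webcms/${var.environment}/*",
    ]
  }
}
```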
- From the VPC, the module reads the following parameters under the `/webcms/${var.environment}/vpc/` path:
  - `/webcms/${var.environment}/vpc/id`: The ID of the VPC. This is used to identify the VPC for the load balancer's target groups.
  - `/webcms/${var.environment}/vpc/public-subnets`: A comma-separated list of the IDs (`subnet-<hex>`) of the VPC's public subnets. We provide these to the load balancer since it's public-facing.
  - `/webcms/${var.environment}/vpc/private-subnets`: A list of subnet IDs corresponding to the VPC's private subnets. Most resources are deployed here.
  - `/webcms/${var.environment}/vpc/public-cidrs`: A comma-separated list of the CIDRs (e.g., `10.0.0.0/24`) of the VPC's public subnets. The Traefik router uses these for respecting the load balancer's PROXY protocol headers.
  - `/webcms/${var.environment}/vpc/private-cidrs`: A comma-separated list of the CIDRs of the VPC's private subnets.
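To illustrate how such parameters can be consumed, here is a minimal sketch of reading two of them and splitting the comma-separated subnet list. The data source and local names are hypothetical; the module's actual identifiers may differ.

```hcl
# Hypothetical illustration of reading the VPC parameters; the names here are
# not necessarily the ones used by this module.
data "aws_ssm_parameter" "vpc_id" {
  name = "/webcms/${var.environment}/vpc/id"
}

data "aws_ssm_parameter" "public_subnets" {
  name = "/webcms/${var.environment}/vpc/public-subnets"
}

locals {
  vpc_id         = data.aws_ssm_parameter.vpc_id.value
  public_subnets = split(",", data.aws_ssm_parameter.public_subnets.value)
}
```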
- The module also depends on pre-existing security groups:
  - `/webcms/${var.environment}/security-groups/database`: The security group for the Aurora cluster
  - `/webcms/${var.environment}/security-groups/proxy`: The security group for the RDS proxy
  - `/webcms/${var.environment}/security-groups/elasticsearch`: The security group for the Elasticsearch domain
  - `/webcms/${var.environment}/security-groups/memcached`: The security group for the ElastiCache cluster
  - `/webcms/${var.environment}/security-groups/alb`: The security group for the ALB
  - `/webcms/${var.environment}/security-groups/drupal`: The security group for the Drupal ECS tasks
  - `/webcms/${var.environment}/security-groups/terraform-database`: The security group for the database initialization task
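These parameters are expected to be written by whatever provisions the VPC and security groups (the reference module mentioned above, or build automation). A sketch of publishing one of them follows; the resource names are hypothetical.

```hcl
# Hypothetical example of how a VPC-provisioning module could publish a
# security group ID for this module to consume.
resource "aws_ssm_parameter" "drupal_security_group" {
  name  = "/webcms/${var.environment}/security-groups/drupal"
  type  = "String"
  value = aws_security_group.drupal.id # created by the VPC/reference module
}
```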
The majority of the resources created by this module are deployments of AWS services consumed by the Drupal 8 WebCMS. Most of the resources in this category are fairly straightforward:
- A pair of load balancers: an NLB and its ALB target (load_balancer.tf)
- An Aurora cluster (rds.tf)
- An ElastiCache cluster (cache.tf)
- An Elasticsearch domain (search.tf)
- An ECS cluster (cluster.tf)
Special mention is due to a few files. We use an RDS proxy (proxy.tf) to manage connection pooling externally: due to PHP's request-scoped nature, database connections are difficult to persist, so we rely on the proxy to handle connections to the Aurora cluster from multiple containers as well as to mitigate transient connection failures due to, e.g., failover during reader-to-writer promotion.
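For orientation, an RDS proxy attached to an Aurora cluster looks roughly like the sketch below. All names, ARNs, and IDs are placeholders, not the module's exact configuration.

```hcl
# Sketch only: names, ARNs, and IDs are placeholders; in the real module these
# come from IAM, Secrets Manager, and the Parameter Store values described above.
resource "aws_db_proxy" "webcms" {
  name          = "webcms-preproduction"
  engine_family = "MYSQL"
  require_tls   = true

  role_arn               = "arn:aws:iam::123456789012:role/example-proxy-role"
  vpc_subnet_ids         = ["subnet-aaaa1111", "subnet-bbbb2222"]
  vpc_security_group_ids = ["sg-cccc3333"]

  auth {
    auth_scheme = "SECRETS"
    iam_auth    = "DISABLED"
    secret_arn  = "arn:aws:secretsmanager:us-east-1:123456789012:secret:example-db-credentials"
  }
}

resource "aws_db_proxy_default_target_group" "webcms" {
  db_proxy_name = aws_db_proxy.webcms.name

  connection_pool_config {
    max_connections_percent = 100
  }
}

resource "aws_db_proxy_target" "webcms" {
  db_proxy_name         = aws_db_proxy.webcms.name
  target_group_name     = aws_db_proxy_default_target_group.webcms.name
  db_cluster_identifier = "webcms-aurora-cluster" # placeholder cluster identifier
}
```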
Resources in this category are not systems that the WebCMS connects to (or receives connections from), but are instead somewhat lower-level elements that hold supporting data and configuration. For example:
- CloudWatch log groups (logging.tf)
- Secrets Manager secrets (secrets.tf)
- ECR repositories (ecr.tf)
- An EventBridge cron schedule (cron.tf)
We mirror public images from the Docker Hub in AWS. This meets two use cases:
- Fargate tasks do not have arbitrary outbound HTTPS access in some AWS environments, so pulling directly from the Docker Hub is not supported, and
- Security requirements dictate that all images be made available for static analysis.
The simplest way to meet these requirements is to copy images from the Docker Hub into ECR.
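One plausible shape for such a mirror repository is sketched below, with image scanning enabled on push. The repository name is illustrative, and scanning-on-push is an assumption rather than a statement of how ecr.tf is actually written.

```hcl
# Illustrative only: the repository name and scan-on-push setting are
# assumptions, not necessarily what ecr.tf declares.
resource "aws_ecr_repository" "mirror" {
  name = "webcms-${var.environment}/traefik"

  image_scanning_configuration {
    scan_on_push = true
  }
}
```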
We are in the process of migrating from Traefik to an ALB. In order to ease migration, both the legacy Traefik service and the new ALB are deployed by this module. The Traefik resources will be removed in a later update.
This module can be run in any environment that has permission to modify the relevant AWS account's infrastructure resources.
Before proceeding with deploying Drupal, be sure that users, databases, grants, and passwords have been set up appropriately for the environment.
There are a number of sensitive values that must be populated by an administrator before the WebCMS can be deployed. We list them below. Note that all of the secrets here are simple strings and not JSON-formatted. If updating these values in the AWS console, ensure that you are editing "Plaintext" instead of "Secret key/value".
- The Drupal hash salt must be generated and saved in `/webcms/${var.environment}/${site}/${lang}/drupal-hash-salt`. This value must differ between site/language combinations in order to prevent one-time tokens such as password resets from being reused. Use a secure random number generator such as the `openssl rand` utility to generate a large number of bytes (at least 32).
- For email, the SMTP password must be saved in the secret `/webcms/${var.environment}/${site}/${lang}/mail-password`.
- An x509 certificate and private key will need to be generated for each Drupal site and language. The private key needs to be set in `/webcms/${var.environment}/${site}/${lang}/saml-sp-key`.
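To make the naming concrete, here is how a deployment might look up one of these secrets for the English "dev" site. This is a hypothetical illustration; the module's actual data sources may be named and organized differently.

```hcl
# Hypothetical lookup of the hash salt for the "dev" site in English;
# the data source names are illustrative.
data "aws_secretsmanager_secret" "dev_en_hash_salt" {
  name = "/webcms/${var.environment}/dev/en/drupal-hash-salt"
}

data "aws_secretsmanager_secret_version" "dev_en_hash_salt" {
  secret_id = data.aws_secretsmanager_secret.dev_en_hash_salt.id
}
```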
- Initial deployments may fail due to a race condition between the load balancer and the Traefik service: Terraform will attempt to register the service with ECS before the listeners are fully ready. The race condition is transient; a second `terraform apply` will successfully deploy Traefik.
- Similarly, this race condition can affect the ALB target. It is also transient, and another apply will resolve the issue.
- If you haven't created a Secrets Manager secret before in this region, the lookup of the KMS alias `alias/aws/secretsmanager` will fail. The solution is to create a throwaway secret and delete it immediately. The default KMS key will be created automatically, and the Terraform run will proceed.
- Due to hashicorp/terraform-provider-aws#17010, we cannot upgrade the provider from 3.21.0 until the underlying API issue is resolved.
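If it is helpful, pinning the provider at that version looks like the following sketch; the module's own version constraint block may be written differently.

```hcl
# Sketch of a provider pin at 3.21.0; the module's actual constraint may differ.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "3.21.0"
    }
  }
}
```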