Pipelines section 1 : Changes to Pipelines/Concepts and Pipelines/Architecture files (#2307)

* Update overview.md

* Update cloud-auth.md

* Update drift-detection.md

* Update hcl-config-language.md

* Update index.md

* Update components.md

* Update actions.md

* Update security-controls.md

* Update audit-logs.md

* Update audit-logs.md

* Update audit-logs.md

* Update github-workflows.md

* Update usage-data.md

* Update docs/2.0/docs/pipelines/architecture/github-workflows.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update docs/2.0/docs/pipelines/architecture/index.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update docs/2.0/docs/pipelines/architecture/index.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update docs/2.0/docs/pipelines/architecture/index.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update github-workflows.md

* Update index.md

* Update docs/2.0/docs/pipelines/architecture/index.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update docs/2.0/docs/pipelines/architecture/index.md

Co-authored-by: Yousif Akbar <[email protected]>

* Update destroying-infrastructure.md

* Update destroying-infrastructure.md

* Update destroying-infrastructure.md

* Update destroying-infrastructure.md

* Update overview.md

* Update cloud-auth.md

* Update components.md

* Update security-controls.md

* Update audit-logs.md

* Update github-workflows.md

* Update usage-data.md

* Fix pipelines architecture page

* Updates

* Updates

* stuff

* asd

* Fix

---------

Co-authored-by: Yousif Akbar <[email protected]>
Co-authored-by: Zach Goldberg <[email protected]>
3 people authored Jan 27, 2025
1 parent a316024 commit ae5a185
Showing 11 changed files with 188 additions and 232 deletions.

docs/2.0/docs/pipelines/architecture/actions.md (4 changes: 2 additions & 2 deletions)

@@ -10,10 +10,10 @@ When a pull request is created, Pipelines will automatically execute `terragrunt

When a pull request is merged, Pipelines will automatically execute either `terragrunt apply` or `terragrunt destroy` on every infrastructure change, depending on the type of infrastructure change. For example, if the pull request deletes a `terragrunt.hcl` file, Pipelines will run `terragrunt destroy`.

- ## Skipping Runs
+ ## Skipping runs

Sometimes you find it necessary to make a change without going through the full pipelines process. This can be accomplished using GitHub's built in method for skipping workflow runs [by adding [skip ci] to your commit message](https://docs.github.com/en/actions/managing-workflow-runs-and-deployments/managing-workflow-runs/skipping-workflow-runs).
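
For example, a commit created like the one below would be skipped by the workflow (a minimal illustration of the built-in mechanism linked above; the commit message itself is hypothetical):

```
git commit -m "Fix typo in module README [skip ci]"
```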

- ## Other Actions
+ ## Other actions

If you'd like to request a new Pipelines action, please email us at [[email protected]](mailto:[email protected]).

docs/2.0/docs/pipelines/architecture/audit-logs.md (37 changes: 21 additions & 16 deletions)

@@ -1,18 +1,18 @@
- # Audit logs
+ # Audit Logs

- Gruntwork Pipelines provides an audit log of which GitHub user performed _what_ operation in your AWS accounts as a result of a [Pipelines Action](/2.0/docs/pipelines/architecture/actions.md).
+ Gruntwork Pipelines provides an audit log that records which GitHub user performed specific operations in your AWS accounts as a result of a [Pipelines Action](/2.0/docs/pipelines/architecture/actions.md).

- Accessing AWS environments from a CI/CD system is commonly done by assuming temporary credentials using [OpenID Connect (OIDC)](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services). Usage of shared credentials complicates the task of maintaining an accurate log of who did what in your AWS accounts. Gruntwork Pipelines solves this issue by leveraging [AWS CloudTrail](https://aws.amazon.com/cloudtrail/) and a naming scheme based on context from the triggering Pipelines Action in GitHub. This setup associates every single API operation performed by Gruntwork Pipelines with a GitHub username and a pull request or branch. This allows your security team to quickly triage access related issues and perform analytics on usage patterns of individual users from your version control system in your AWS accounts.
+ Accessing AWS environments from a CI/CD system often involves assuming temporary credentials using [OpenID Connect (OIDC)](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services). Shared credentials can complicate tracking who performed specific actions in AWS accounts. Gruntwork Pipelines addresses this challenge by using [AWS CloudTrail](https://aws.amazon.com/cloudtrail/) with a naming convention that includes context from the triggering Pipelines Action in GitHub. This approach associates every API operation performed by Pipelines with a GitHub username and a specific pull request or branch, enabling your security team to efficiently investigate access-related issues and analyze individual user actions.

## How it works

- Gruntwork Pipelines provides an audit log of which user performed what action in which account. To accomplish this, Pipelines sets the [AWS STS](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html) session name using a combination of the initiating GitHub user, the name of Pipelines itself, and the pull request or branch from which the action was triggered. All logging for Gruntwork Pipelines AWS access is uses [AWS CloudTrail](https://aws.amazon.com/cloudtrail/). Session names are set in the `User name` field in CloudTrail, allowing those searching the data to clearly identify which user performed an action. For more information on querying the logs see [where you can find logs](#where-you-can-find-logs) and [querying data](#querying-data).
+ Gruntwork Pipelines creates an audit log that tracks which user performed what action in which AWS account. It does this by setting the [AWS STS](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html) session name to include the initiating GitHub username, the Pipelines name, and the pull request or branch that triggered the action. Logging is handled through [AWS CloudTrail](https://aws.amazon.com/cloudtrail/), where session names appear in the `User name` field, making it easy to identify which user performed an action. For information on locating logs, see [where you can find logs](#where-you-can-find-logs) and [querying data](#querying-data).
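
For intuition only: Pipelines authenticates through OIDC rather than static credentials, so it does not literally call `assume_role` this way. The sketch below (with a hypothetical role ARN and account ID) simply illustrates where a session name enters the credential chain and, in turn, CloudTrail's `User name` field:

```python
import boto3

sts = boto3.client("sts")

# Hypothetical role ARN; the session name follows the format described in
# "Who gets logged" below.
credentials = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/pipelines-plan",
    RoleSessionName="SomeUserInYourOrg-via-GWPipelines@PR-123",
)["Credentials"]

# Any API call made with these temporary credentials is recorded by CloudTrail
# with the session name shown in the `User name` field.
```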

### What gets logged

- Logs are available for all operations performed in every AWS account by Gruntwork Pipelines. Gruntwork Pipelines takes advantage of [AWS STS](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html) session names to clearly label all sessions with the GitHub user who requested the change and the pull request or branch that triggered the change.
+ Logs are generated for all operations performed by Gruntwork Pipelines across every AWS account. These logs leverage [AWS STS](https://docs.aws.amazon.com/STS/latest/APIReference/welcome.html) session names to clearly label sessions with the GitHub username that requested the change and the associated pull request or branch.

- The `userIdentity` field in each CloudTrail event associated with API calls performed by Pipelines [Actions](/2.0/docs/pipelines/architecture/actions.md) will contain the session name. For example, if the GitHub user `SomeUserInYourOrg` made a pull request that was the 123rd pull request in your repository, an associated CloudTrail event would have information that looked like the following in the `userIdentity` field.
+ Each CloudTrail event linked to API calls from Pipelines [Actions](/2.0/docs/pipelines/architecture/actions.md) includes the session name in the `userIdentity` field. For example, if the GitHub user `SomeUserInYourOrg` initiated the 123rd pull request in your repository, the `userIdentity` field in a corresponding CloudTrail event would provide details such as the following.

```json
{
@@ -45,26 +45,31 @@ The `userIdentity` field in each CloudTrail event associated with API calls perf
}
```

- Due to the 64 character limit for STS session names, we do not include the originating repository. To determine the originating repo, you can search repositories for commits containing the user and branch/PR-number combination, or reach out to the GitHub user directly.
+ Due to the 64-character limit for STS session names, the originating repository is not included in the session name. To identify the originating repository, you can search through repositories for commits matching the user and branch/PR-number combination or contact the GitHub user directly.

- Combined with a [query service](#querying-data), you can use data from CloudTrail to perform analytics on usage of Gruntwork Pipelines in your AWS accounts.
+ By combining this data with a [query service](#querying-data), you can analyze the usage patterns of Gruntwork Pipelines in your AWS accounts using CloudTrail logs.

### Who gets logged

- Pipelines uses a naming scheme that combines the GitHub user who triggered the Pipelines [Action](/2.0/docs/pipelines/architecture/actions.md) and the pull request or branch from which the action was initiated. Pipelines will set the AWS STS session name according to the following format: `<GitHubUserName>-via-GWPipelines@(PR-<PullRequestNumber>|<branch name>)`.
+ Pipelines employs a naming scheme that integrates the GitHub user who triggered the Pipelines [Action](/2.0/docs/pipelines/architecture/actions.md) along with the pull request or branch that initiated the action. The AWS STS session name is formatted as follows:
+ `<GitHubUserName>-via-GWPipelines@(PR-<PullRequestNumber>|<branch name>)`.

- Pipelines runs Pipelines Actions when a pull request is opened, updated, re-opened, or merged to the deploy branch (e.g., main). The naming scheme will use different values for pull request events and pushes to the deploy branch (e.g. merged PRs).
+ #### For pull request events
+ When Pipelines runs in response to a pull request event (opened, updated, or reopened), the session name includes the user who made the most recent commit on the branch and the pull request number assigned by GitHub. For instance:
+ - If the user `SomeUserInYourOrg` created pull request number `123`, the session name would be:
+ `SomeUserInYourOrg-via-GWPipelines@PR-123`.

- When Pipelines is run in response to a pull request event, the user that created the most recent commit on the branch will be used in the log, in addition to the pull request number assigned by GitHub. For example, if the user `SomeUserInYourOrg` created a pull request that was given the number `123` by GitHub, the resulting session name would be `SomeUserInYourOrg-via-GWPipelines@PR-123`.

- When Pipelines is run in response to a pull request being merged, the user that performed the merge (e.g., the user who clicked the `merge` button on the pull request) will be used in the log in addition the to deploy branch (e.g., `main`). For example, if the user `SomeUserInYourOrg` merged a pull request to your deploy branch named `main` the session name would be `SomeUserInYourOrg-via-GWPipelines@main`
+ #### For merged pull requests
+ When Pipelines runs after a pull request is merged, the session name reflects the user who performed the merge (e.g., clicked the `merge` button) and the deploy branch name (e.g., `main`). For example:
+ - If the user `SomeUserInYourOrg` merged a pull request to the branch `main`, the session name would be:
+ `SomeUserInYourOrg-via-GWPipelines@main`.
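
To make the attribution concrete, here is a small illustrative Python sketch (the event fragment is abbreviated and the role name is hypothetical, not taken from Pipelines itself) that recovers the GitHub user and the originating pull request or branch from the session name embedded in a CloudTrail `userIdentity` block:

```python
import re

# Abbreviated, hypothetical CloudTrail event fragment; real events carry many
# more fields. For assumed roles, principalId is "<role id>:<session name>".
event = {
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROAEXAMPLEID:SomeUserInYourOrg-via-GWPipelines@PR-123",
        "arn": "arn:aws:sts::123456789012:assumed-role/pipelines-plan/SomeUserInYourOrg-via-GWPipelines@PR-123",
    }
}

# Matches "<GitHubUserName>-via-GWPipelines@(PR-<PullRequestNumber>|<branch name>)"
SESSION_RE = re.compile(r"^(?P<user>.+)-via-GWPipelines@(?:PR-(?P<pr>\d+)|(?P<branch>.+))$")

def attribute(event: dict) -> dict:
    """Return the GitHub user and the PR number or branch behind a Pipelines API call."""
    session_name = event["userIdentity"]["principalId"].split(":", 1)[1]
    match = SESSION_RE.match(session_name)
    if not match:
        raise ValueError(f"Not a Gruntwork Pipelines session name: {session_name}")
    return {
        "github_user": match.group("user"),
        "pull_request": match.group("pr"),  # None for pushes to the deploy branch
        "branch": match.group("branch"),    # None for pull request events
    }

print(attribute(event))
# {'github_user': 'SomeUserInYourOrg', 'pull_request': '123', 'branch': None}
```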

## Where you can find logs

- Gruntwork Pipelines leverages AWS CloudTrail to log all actions taken by Pipelines in your AWS account. Due to our naming scheme, identifying operations performed in your AWS account by Gruntwork Pipelines are clearly identified.
+ Gruntwork Pipelines uses AWS CloudTrail to log all actions performed in your AWS accounts. Thanks to the naming scheme, actions initiated by Gruntwork Pipelines are clearly identified in CloudTrail.

- Accessing CloudTrail and querying data is dependent on your organization's policies and settings. If you are a Gruntwork Account Factory customer, see the documentation on [logging](/2.0/docs/accountfactory/architecture/logging) for information on how to access and query your CloudTrail data.
+ Accessing and querying CloudTrail data depends on your organization's specific policies and configurations. If you are a Gruntwork Account Factory customer, refer to the documentation on [logging](/2.0/docs/accountfactory/architecture/logging) for details on accessing and querying CloudTrail data.

## Querying data

- CloudTrail can be configured to automatically store events in an S3 bucket of your choosing. Storing data in S3, when correctly configured, allows you to limit access to only the users that need it. Data in stored in S3 can be searched directly using a query service like [Amazon Athena](https://aws.amazon.com/athena/) or forwarded to other logging systems.
+ CloudTrail can be configured to automatically store events in an S3 bucket of your choice. When stored in S3 with proper configuration, access can be restricted to authorized users only. You can query stored data using services such as [Amazon Athena](https://aws.amazon.com/athena/) or forward the data to other logging systems for further analysis.
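
As a sketch of what such a query might look like, assuming you have already created an Athena table over your CloudTrail bucket using AWS's documented CloudTrail schema (the table name `cloudtrail_logs`, database, and results bucket below are placeholders):

```python
import boto3

athena = boto3.client("athena")

# Find recent API calls made by Gruntwork Pipelines, identified by the
# "-via-GWPipelines@" marker in the STS session name.
QUERY = """
SELECT useridentity.principalid, eventsource, eventname, eventtime
FROM cloudtrail_logs
WHERE useridentity.principalid LIKE '%-via-GWPipelines@%'
ORDER BY eventtime DESC
LIMIT 100
"""

response = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://your-athena-results-bucket/pipelines-audit/"},
)
print("Started Athena query:", response["QueryExecutionId"])
```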

docs/2.0/docs/pipelines/architecture/components.md (18 changes: 9 additions & 9 deletions)

@@ -1,23 +1,23 @@
# Components Architecture

- There are two major components in pipelines - the orchestrator and the executor. The orchestrator determines the jobs to be executed and the executor actually executes the jobs and makes any necessary updates in AWS.
+ Pipelines consists of two main components: the orchestrator and the executor. The orchestrator identifies necessary jobs, while the executor performs those tasks and updates AWS resources accordingly.

## Orchestrator

- The orchestrator identifies each infrastructure change in a pull request or git commit, classifies the type of infrastructure change it is (e.g. `AccountsAdded`, `ModuleChanged`, `EnvCommonChanged`), and determines the right pipelines actions (e.g., `terragrunt plan`, `apply`, or `destroy`) to run based on that infrastructure change.
+ The orchestrator analyzes each infrastructure change in a pull request or git commit, categorizes the type of change (e.g., `AccountsAdded`, `ModuleChanged`, `EnvCommonChanged`), and identifies the appropriate pipelines actions (e.g., `terragrunt plan`, `apply`, or `destroy`) to execute based on the type of change.

## Executor

- The executor takes a pipeline action and infra change as input then performs the action on the infra change. In some cases, like responding to `AccountsAdded` events on merge, the executor will create a subsequent pull request in the `infrastructure-live-root` repository with additional IaC code to baseline the account(s) that were just added.
+ The executor receives a pipeline action and an infrastructure change as input and executes the specified action on the change. For example, when responding to `AccountsAdded` events on merge, the executor may create a follow-up pull request in the `infrastructure-live-root` repository to include additional IaC code for baselining the newly added accounts.

- ## Execution Flow
+ ## Execution flow

- Pipelines starts with an event in GitHub - a pull request being created, updated, or re-opened, and a PR being merged to `main` (or any other push to `main`). From there, the orchestrator determines the infra-change set and appropriate action to run for each infra-change. For each change in the infra-change set, the executor runs the appropriate action. The executor then logs the results of the action back in GitHub on the pull request that initiated the flow.
+ Pipelines begins with an event in GitHub, such as the creation, update, or reopening of a pull request, or a push to `main` (e.g., merging a pull request). The orchestrator determines the set of infrastructure changes (`infra-change set`) and selects the appropriate action for each change. For every change in the set, the executor performs the necessary action and logs the results in GitHub, attaching them to the pull request that triggered the workflow.
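
As a rough mental model of the flow described above (illustrative only; the names and structure below are assumptions, not Gruntwork's implementation), the orchestrator maps each change to an action and the executor carries it out and reports back:

```python
from dataclasses import dataclass

@dataclass
class InfraChange:
    path: str          # e.g. "prod/us-east-1/vpc/terragrunt.hcl"
    change_type: str   # e.g. "ModuleChanged", "EnvCommonChanged", "AccountsAdded"

# Illustrative mapping from change type to the action run on merge.
ACTION_FOR_CHANGE = {
    "ModuleChanged": "terragrunt apply",
    "EnvCommonChanged": "terragrunt apply",
    "AccountsAdded": "open follow-up baseline pull request",
}

def orchestrate(changes: list[InfraChange]) -> list[tuple[InfraChange, str]]:
    """Orchestrator: classify each infrastructure change and pick an action."""
    return [(change, ACTION_FOR_CHANGE[change.change_type]) for change in changes]

def execute(change: InfraChange, action: str) -> str:
    """Executor: perform the action in AWS and report the result on the pull request."""
    return f"{action} -> {change.path}: success"

for change, action in orchestrate([InfraChange("prod/us-east-1/vpc/terragrunt.hcl", "ModuleChanged")]):
    print(execute(change, action))
```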

- ## Trust Boundaries
+ ## Trust boundaries

- Vital to the architecture of Pipelines is understanding of the trust model inherent to the system. Pipelines is designed to be run in a CI/CD system, which means that it has privileged access to your AWS accounts.
+ A critical aspect of Pipelines' architecture is understanding its trust model. Since Pipelines runs within a CI/CD system, it has privileged access to your AWS accounts.

- One principle to keep in mind is that anyone with trust to edit code on the `main` branch of your repositories is trusted to make the corresponding changes in your AWS accounts. This is why we recommend that you follow the [Repository Access](/2.0/docs/pipelines/installation/viamachineusers#repository-access) guidelines to ensure that only the right people have the right access to your repositories.
+ Anyone with the ability to edit code in the `main` branch of your repositories inherently has the authority to make corresponding changes in your AWS accounts. For this reason, it is important to follow the [Repository Access](/2.0/docs/pipelines/installation/viamachineusers#repository-access) guidelines to ensure appropriate access control.

- In addition, each AWS IAM role provisioned as part of DevOps Foundations only trusts a single repository (and for apply roles, only a single branch). If you find that you are expanding the permissions of a given role too wide, you should consider creating a new role that has more granular permissions for the specific use-case. Utilize the `infrastructure-live-access-control` repository to support that need.
+ Additionally, each AWS IAM role provisioned through DevOps Foundations is configured to trust a single repository (and, for apply roles, a single branch). If a role's permissions become overly broad, consider creating a new role with more granular permissions tailored to the specific use case. Use the `infrastructure-live-access-control` repository to define and manage these roles.