feat: enable additional metadata collection (under feature flag) #33232

Merged
GavinZZ merged 9 commits into main from yuanhaoz/feat/metadata-collection-final on Feb 1, 2025

Conversation

GavinZZ
Contributor

@GavinZZ GavinZZ commented Jan 30, 2025

Issue # (if applicable)

Closes #33260

Relevant discussion #33198

Note

Most of the code changes are auto-generated, so you'll see hundreds of addConstructMetadata method calls across different L2 resources.

This method comes from this change: https://github.com/aws/aws-cdk/pull/33232/files#diff-81f821b1205e7040fc3103bf7c0114060a6d5c43ebd2994aa4ed5906e42c9c5fR33. The main code changes that need review are in packages/aws-cdk-lib/core and tools/@aws-cdk/construct-metadata-updater.
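To make the shape of the generated edits easier to picture, here is a minimal sketch of an idempotent insertion step. It uses plain string manipulation rather than whatever the real construct-metadata-updater does, and the import path, the `addMetadataCall` helper, and the single-line `super(` heuristic are all assumptions made for illustration.

```typescript
// Hypothetical import path; the real location of addConstructMetadata may differ.
const IMPORT_LINE = "import { addConstructMetadata } from './metadata-resource';";
// The statement the sketch inserts right after the super(...) call.
const CALL_LINE = '    addConstructMetadata(this, props);';

// Returns the source with the import and the metadata call added.
// Running it again on its own output changes nothing.
function addMetadataCall(source: string): string {
  const lines = source.split('\n');
  const hasCall = lines.some((line) => line.trim() === CALL_LINE.trim());

  if (!hasCall) {
    // Simplification: only single-line super(...) calls are recognized.
    const superIndex = lines.findIndex((line) => line.trim().startsWith('super('));
    if (superIndex === -1) {
      return source; // no constructor to instrument, leave the file untouched
    }
    lines.splice(superIndex + 1, 0, CALL_LINE);
  }

  // Add the import only if the exact line is not already present.
  if (lines.indexOf(IMPORT_LINE) === -1) {
    lines.unshift(IMPORT_LINE);
  }
  return lines.join('\n');
}
```

A tool that works on the syntax tree would handle multi-line `super` calls and existing imports from the same module more robustly; the sketch only shows why repeated runs do not pile up duplicate statements.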

Reason for this change

This discussion aims to expand the scope of usage data collected by the AWS CDK to better inform CDK development and improve communication for customer-impacting topics. Currently, for those that opt in, the CDK collects usage data on your CDK version and which L2 constructs you use.

Description of changes

  1. Update the CDK synthesis code to also handle resource metadata. When the feature flag is set to true, synthesis injects not only the existing usage metadata (version and construct name) but also any construct, method, or feature flag metadata recorded during resource creation. When the feature flag is set to false, behavior is the same as before.
  2. Add a metadata-updater tool that automatically finds the right classes and adds the import statements and metadata statements. The tool can be run multiple times and should not add duplicate import or metadata statements to files that have already been updated. An action item is to link the tool to a GHA so that it runs periodically.
  3. Build a workflow (to be linked to a GHA in the future) that drives redaction. On each run, the workflow parses all files in the aws-cdk repository and builds a JSON file that contains all constructs and their loggable properties. When redacting, a property key is logged only if the key exists in the JSON file, and its value is logged only if the value is not a * in the JSON file. Everything else is redacted for safety.
  4. Build a JSON blueprint of the ENUM values and do not redact ENUM values.
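The redaction rules in items 3 and 4 can be sketched roughly as follows. The blueprint shape, the `redact` function, and the enum member strings are assumptions for illustration only; the JSON files generated by the PR may be structured differently.

```typescript
// Hypothetical blueprint shape (not necessarily the format used by the PR):
//   '*'        -> the key may be logged, the value is always redacted
//   'boolean'  -> boolean values are logged as-is
//   string[]   -> enum blueprint: the value is logged only if it is a listed member
//   Blueprint  -> nested object (or array of objects), redacted recursively
type BlueprintEntry = '*' | 'boolean' | string[] | Blueprint;
interface Blueprint {
  [key: string]: BlueprintEntry;
}

const REDACTED = '*';

function redact(value: unknown, entry: BlueprintEntry): unknown {
  if (entry === '*') return REDACTED;
  if (entry === 'boolean') return typeof value === 'boolean' ? value : REDACTED;
  if (Array.isArray(entry)) {
    // Enum blueprint: keep only known enum members.
    return typeof value === 'string' && entry.indexOf(value) !== -1 ? value : REDACTED;
  }
  // From here on, entry is a nested blueprint.
  if (Array.isArray(value)) return value.map((item) => redact(item, entry));
  if (typeof value === 'object' && value !== null) {
    const input = value as Record<string, unknown>;
    const output: Record<string, unknown> = {};
    for (const key of Object.keys(input)) {
      // Keys that are missing from the blueprint are dropped entirely.
      if (Object.prototype.hasOwnProperty.call(entry, key)) {
        output[key] = redact(input[key], entry[key]);
      }
    }
    return output;
  }
  return REDACTED;
}

// Illustrative blueprint for a bucket-like construct.
const bucketBlueprint: Blueprint = {
  bucketName: '*',
  versioned: 'boolean',
  removalPolicy: ['destroy', 'retain', 'snapshot'],
  lifecycleRules: { expirationDate: '*', objectSizeLessThan: '*', objectSizeGreaterThan: '*' },
};

const redactedProps = redact(
  {
    bucketName: 'my-cdk-example-bucket',
    versioned: true,
    removalPolicy: 'destroy',
    notInBlueprint: 'dropped',
    lifecycleRules: [{ objectSizeLessThan: 600 }],
  },
  bucketBlueprint,
);
console.log(JSON.stringify(redactedProps));
```

The sketch defaults to redaction: anything the blueprint does not explicitly allow is either replaced with * or omitted, which matches the "everything else is redacted" rule.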

Consider the following example:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3 from 'aws-cdk-lib/aws-s3';

class MyStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // Create an S3 bucket (L2 construct)
    const myBucket = new s3.Bucket(this, 'MyBucket', {
      bucketName: 'my-cdk-example-bucket', // String type
      versioned: true,                    // Boolean type
      removalPolicy: cdk.RemovalPolicy.DESTROY, // ENUM type
      lifecycleRules: [{                  // Array of object type
        expirationDate: new Date('2019-10-01'),
        objectSizeLessThan: 600,
        objectSizeGreaterThan: 500,
      }],
    });

    // Use a method of the L2 construct to define additional properties
    myBucket.addLifecycleRule({
      id: 'ExpireOldObjects',
      enabled: true, // Boolean
      expiration: cdk.Duration.days(90), // Expire objects after 90 days
    });
  }
}

// Define the CDK app and stack
const app = new cdk.App();
new MyStack(app, 'MyStack');
app.synth();

At synthesis, usage data is collected, compressed, and stored in the AWS::CDK::Metadata resource. Based on current behavior, the following is an example of the usage data that will be collected from our example app:

{
    "fqn": "aws-cdk-lib.aws_s3.Bucket",
    "version": "2.170.0"
}

With this proposal, the following usage data will be collected. The * value replaces property values that will be redacted from data collection:

{ 
    "fqn": "aws-cdk-lib.aws_s3.Bucket", 
    "version": "2.170.0", 
    "metadata": [ 
        { 
            "type": "aws:cdk:analytics:construct", 
            "data": { 
                "bucketName": "*", 
                "versioned": true, 
                "removalPolicy": "cdk.RemovalPolicy.DESTROY", 
                "lifecycleRules": [ 
                    { 
                        "expirationDate": "*", 
                        "objectSizeLessThan": "*", 
                        "objectSizeGreaterThan": "*" 
                    } 
                ] 
            } 
        }, 
        { 
            "type": "aws:cdk:analytics:method", 
            "data": { 
                "name": "addLifecycleRule", 
                "prop": { 
                    "id": "*", 
                    "enabled": true, 
                    "expiration": "*"
                } 
            } 
        } 
    ] 
}
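As a rough illustration of how the two metadata entries above could arise, the sketch below records one entry at construction time and one per method call, gated by a flag. `SketchBucket`, its `collect` parameter, and the `record` helper are hypothetical stand-ins, not the real s3.Bucket or the PR's implementation, and the values are passed in already redacted to keep the example short.

```typescript
type AnalyticsType = 'aws:cdk:analytics:construct' | 'aws:cdk:analytics:method';

interface MetadataEntry {
  type: AnalyticsType;
  data: unknown;
}

// Hypothetical stand-in for an L2 construct; it is not the real s3.Bucket.
class SketchBucket {
  public readonly metadata: MetadataEntry[] = [];
  private readonly collect: boolean;

  constructor(collect: boolean, props: Record<string, unknown>) {
    // `collect` plays the role of the feature flag in this sketch.
    this.collect = collect;
    this.record('aws:cdk:analytics:construct', props);
  }

  public addLifecycleRule(rule: Record<string, unknown>): void {
    this.record('aws:cdk:analytics:method', { name: 'addLifecycleRule', prop: rule });
  }

  private record(type: AnalyticsType, data: unknown): void {
    if (!this.collect) return; // flag off: nothing extra is collected
    this.metadata.push({ type, data });
  }
}

const sketchBucket = new SketchBucket(true, { bucketName: '*', versioned: true });
sketchBucket.addLifecycleRule({ id: '*', enabled: true, expiration: '*' });
console.log(JSON.stringify(sketchBucket.metadata, null, 2));
```

With the flag off, the `metadata` array stays empty, which corresponds to the "same as before" behavior described for the disabled feature flag.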

Describe any new or updated permissions being added

No

Description of how you validated changes

Many new unit tests were added to verify the behaviour of the functions and methods introduced. One integ test file was added to test deployability.

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@GavinZZ GavinZZ requested a review from a team as a code owner January 30, 2025 03:14
@aws-cdk-automation aws-cdk-automation requested a review from a team January 30, 2025 03:14
@github-actions github-actions bot added the p2 label Jan 30, 2025
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Jan 30, 2025
@aws-cdk-automation aws-cdk-automation added the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Jan 30, 2025
@moelasmar
Contributor

are you going to add the GH workflow in a separate PR ?

@GavinZZ
Contributor Author

GavinZZ commented Jan 31, 2025

are you going to add the GH workflow in a separate PR ?

Yes, it's going to be a separate PR.

Contributor

@xazhao xazhao left a comment

Looks good to me overall.

@GavinZZ GavinZZ added pr-linter/exempt-codecov The PR linter will not require codecov checks to pass and removed pr-linter/exempt-codecov The PR linter will not require codecov checks to pass labels Jan 31, 2025
Contributor

mergify bot commented Jan 31, 2025

Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation aws-cdk-automation removed the pr/needs-maintainer-review This PR needs a review from a Core Team Member label Jan 31, 2025
@GavinZZ
Contributor Author

GavinZZ commented Jan 31, 2025

@mergify update

Contributor

mergify bot commented Jan 31, 2025

update

✅ Branch has been successfully updated

@github-actions github-actions bot added feature-request A feature should be added or improved. p1 and removed p2 labels Jan 31, 2025
@aws-cdk-automation
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 1451998
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@GavinZZ GavinZZ merged commit 6b9e47a into main Feb 1, 2025
19 of 20 checks passed
@GavinZZ GavinZZ deleted the yuanhaoz/feat/metadata-collection-final branch February 1, 2025 00:02
github-actions bot commented Feb 1, 2025

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 1, 2025
Labels
contribution/core This is a PR that came from AWS. feature-request A feature should be added or improved. p1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CDK Collecting Additional Metadata
4 participants