Aggregator by Default #2886
| GitGuardian id | Secret | Commit | Filename |
|---|---|---|---|
| 7414 | Google API Key | fb28d67 | cost-analyzer/templates/_helpers.tpl |
| 7414 | Google API Key | 3c2979e | cost-analyzer/templates/query-service-deployment-template.yaml |
| 7414 | Google API Key | c9e8622 | cost-analyzer/templates/_helpers.tpl |
| 7414 | Google API Key | c9e8622 | cost-analyzer/templates/aggregator-statefulset.yaml |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn the best practices here.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act: you might completely break other contributing developers' workflows, and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation
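As a sketch of the pre-commit suggestion above — the tool choice and pinned version here are assumptions, not something GitGuardian prescribes — a scanner such as gitleaks can be wired in via a `.pre-commit-config.yaml`:

```yaml
# .pre-commit-config.yaml — one possible setup; gitleaks is an assumed
# choice here (GitGuardian's own ggshield offers a similar hook).
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4  # pin to a released tag of your choosing
    hooks:
      - id: gitleaks  # scans staged changes for secrets before each commit
```

After `pre-commit install`, any commit containing a detected secret is rejected locally, before it ever reaches the remote.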
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Please also remember to check/update the files required for CI processes.
Force-pushed from a152057 to 22e1cd2.
I'm still working out some things in the backend code, which is causing chart tests to fail. While I'm working through testing, @chipzoller, are you willing to give this PR a review while it is in a draft state? I'd like some feedback from your Helm expertise on things in here, particularly my use of the helper templates.
These helper template changes seem extremely over-the-top. What problem are you looking to solve here that you think isn't appropriate for the templates?
There are two containers, Aggregator and Cloud Cost, which must be deployable either within the cost-analyzer Pod or under their own controllers (a StatefulSet and Deployment, respectively). The design is diagrammed out in this doc if you'd like to have a look. I'm using the templates to define each container's spec once and render it in both places.
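The pattern described above — a named helper template rendered into more than one controller — can be sketched roughly as follows. The template and values names here are hypothetical illustrations, not the chart's actual identifiers:

```yaml
{{/* _helpers.tpl — hypothetical shared container spec for Aggregator */}}
{{- define "cost-analyzer.aggregatorContainer" -}}
- name: aggregator
  image: {{ .Values.kubecostAggregator.image }}
  resources:
    {{- toYaml .Values.kubecostAggregator.resources | nindent 4 }}
{{- end -}}
```

```yaml
# aggregator-statefulset.yaml — the standalone form factor stamps out
# the same container spec; the cost-analyzer Deployment would include
# the identical helper at its own indentation level.
spec:
  template:
    spec:
      containers:
        {{- include "cost-analyzer.aggregatorContainer" . | nindent 8 }}
```

Because both controllers render the same `define`, the container specs cannot drift apart independently.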
Looking good so far! Fills me with joy to see such a high deletion-to-addition ratio in a PR. While you're in here performing this overhaul, what are your thoughts on more thorough automated testing of the Helm chart? I know we perform linting, but I feel it would be useful to have the ability to run automated validations on the chart with a wide array of values, in order to verify template correctness for a variety of install configurations.
@nik-kc, I have already set this up and it is being done on all PRs, including deployment and basic e2e smoke tests on real EKS and OpenShift clusters. See here. @michaelmdresser, ok, I understand the goals. I will give this a good look soon. I think the idea of creating a standard container template and stamping it out in the necessary Pod controller form factor is a good practice to reduce/eliminate drift. The challenge is ensuring that it stays universal enough that controller-specific configs don't creep in.
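One hedged way to address the "controller-specific configs creeping in" concern — sketched here with hypothetical names, not taken from this chart — is to have the helper accept an explicit parameter dict, so anything that differs per controller stays in the caller rather than being branched on inside the shared template:

```yaml
{{/* Hypothetical: the helper consumes an explicit dict context, so it
     contains nothing controller-specific itself. */}}
{{- define "aggregator.container" -}}
- name: aggregator
  image: {{ .image }}
  {{- with .extraVolumeMounts }}
  volumeMounts:
    {{- toYaml . | nindent 4 }}
  {{- end }}
{{- end -}}

{{/* Caller (e.g. the StatefulSet) supplies only what differs:
     {{ include "aggregator.container"
          (dict "image" .Values.aggregator.image
                "extraVolumeMounts" .Values.aggregator.persistentVolumeMounts)
        | nindent 8 }} */}}
```

The trade-off is slightly noisier call sites in exchange for a helper that is mechanically incapable of accumulating StatefulSet-only or Deployment-only settings.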
So much tech debt paid, bravo. That dead Postgres config search-and-destroy was 💯. I don't think the reliance on helpers.tpl is too onerous, especially considering it buys us something approaching 1k lines of avoided code duplication.
Force-pushed from c5b447b to 5cc809e.
So this LGTM at this point. Will leave it to @chipzoller to approve after his review, since he said he wanted to put eyes on it.
Michael, I've given this a good look through (yet have not verified the end results myself...will wait for final review), and you've done some really nice work here. I know how much time this took. Thank you.
I have added some relatively minor comments, requests, and questions throughout, but here are some of the more general thoughts/requests:
- Assuming you have not done so already, we'd appreciate it if you go through all the values files present in this repo and ensure they're in good shape based on these changes. I know you did that for the CI files (thanks!), but please check all the others if not done.
- It looks like you're doing it, but because I haven't done a thorough survey, please double-check that you've removed all defined templates corresponding to excised sections of the chart (e.g., query service and Postgres).
- We need to include annotations and labels defined under `global.podAnnotations` and `global.additionalLabels`. We're terribly inconsistent about that, but "global" means "everywhere," not selectively.
- For any new fields added to the values file, we (and, specifically, users) would appreciate a comment on every one explaining what it does. We will eventually use this to build docs from the values file (a later initiative).
- `etlBucketConfigSecret` is not defined in `values.yaml`, yet it is still referenced by templates in this PR, implying it's still necessary. Kindly document.
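As a sketch of the "comment every new value" request above — the key names here are hypothetical, chosen only to illustrate the convention — each new values entry would carry a comment that a doc generator (e.g. a tool like helm-docs) could later harvest:

```yaml
# kubecostAggregator -- settings for the Aggregator query-serving component
# (key names below are illustrative, not the chart's actual schema)
kubecostAggregator:
  # enabled -- if true, run Aggregator as its own StatefulSet rather than
  # as a container inside the cost-analyzer Pod
  enabled: false
  # replicas -- number of Aggregator replicas when running as a StatefulSet
  replicas: 1
```

Keeping the comment immediately above each key means the values file stays the single source of truth for both behavior and documentation.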
QSR was an optimization attempt made before Aggregator for high-scale environments. Aggregator's existence obsoletes QSR, so for this next release of Kubecost which changes the default install config we should remove QSR completely. Signed-off-by: Michael Dresser <[email protected]>
With the advent of Aggregator as the only query handler, there is no more need for the Federator component to combine federated bingen files. Signed-off-by: Michael Dresser <[email protected]>
Postgres as a code path has been obsolete for years. It is not supported. Signed-off-by: Michael Dresser <[email protected]>
Kubecost's Aggregator component is now the officially-supported method for highly-available query serving. It has the ability to be split into its own StatefulSet for managing multiple replica situations. The cost-analyzer Pod is now dedicated almost wholly to building metric data and thus no longer needs the highly-complex leader/follower configuration. Discussed with Bolt, Ajay, Alex Signed-off-by: Michael Dresser <[email protected]>
Because of file-based data sharing in the default free config, running Kubecost fully in-memory cannot be supported. Signed-off-by: Michael Dresser <[email protected]>
Moved Aggregator container config to helpers so it can be shared by the cost-analyzer Pod and the dedicated StatefulSet. Signed-off-by: Michael Dresser <[email protected]>
Force-pushed from 5c180ae to 6876caa.
@chipzoller Thank you for the thorough review, it's exactly what I was looking for. I've addressed as much as I could and left questions on the rest of the feedback. Please let me know if I've misinterpreted any of your ideas.
Chart build is failing across all instances.
@chipzoller The tests are failing because https://github.com/kubecost/kubecost-cost-model/pull/2002 is required for this Helm change to work. Without specifically setting values, the Helm chart uses a v1.108 backend image, which does not function in this configuration of the chart, causing all tests to fail. I believe it is still reviewable.
Michael, I think the best place to accept this PR is in the 2.0-helm-changes branch we have been working on since this is a 2.0 change. There have already been significant changes to the chart reflected in that branch. Can you rebase on top of it and merge there instead? |
Closing in favor of #2898, which targets …
What does this PR change?
Does this PR relate to any other PRs?
How does this PR impact users? (This is the kind of thing that goes in release notes!)
- The `federatedETL.primaryCluster` value has been removed. In an Enterprise Federated ETL configuration, all cost-model containers should behave as if they were secondaries. Aggregator now serves combined queries. If you are an Enterprise Federated ETL user, follow this guide TODO for necessary configuration changes.
- The Federator component (`federatedETL.federator.enabled`) has been removed. In an Enterprise Federated ETL configuration, all queries and data combination are handled by Aggregator. If you are an Enterprise Federated ETL user, follow this guide TODO for necessary configuration changes.
- The `kubecostModel.warmSavingsCache` value has been removed. If you need to disable the savings cache in Aggregator, set the env var `SAVINGS_ENABLED` to `false`.
Links to Issues or tickets this PR addresses or fixes
What risks are associated with merging this PR? What is required to fully test this PR?
How was this PR tested?
Have you made an update to documentation? If so, please provide the corresponding PR.
WIP