Created Metrics job #379

Closed

Conversation

samarpita-bhaumik
Contributor

Create openshift manifest for Metrics job or cronjob #269

Signed-off-by: Samarpita Bhaumik <[email protected]>
@Gregory-Pereira added the deployment (Issue to track production and QA deployment related work) and metrics (Related to telemetry) labels on Dec 2, 2024
@Gregory-Pereira
Collaborator

Can we convert this to a draft? The implementation should include the resources to build the image itself and to deploy it. Happy to sync on this if there is any confusion here.

@samarpita-bhaumik
Contributor Author

@Gregory-Pereira can you please elaborate on which implementation you are referring to? I will make the changes and update the PR.

@Gregory-Pereira
Collaborator

Apologies for the late response, I would love to help explain this! What I am referring to in particular is that this implementation is theoretical at this point, see code:

            image: your-image:tag
            command: ["/bin/sh", "-c", "your-metrics-command"]

We don't currently have an image for gathering the metrics, so we will need to create a container image that does this for us. In addition, it's typically considered best practice to include other manifests to properly scope the permissions of the job / cronjob in the form of a ServiceAccount, Role or ClusterRole, and RoleBinding or ClusterRoleBinding (however, if you want, we can always implement this in later PRs and run the job with the permissions of the default SA that runs all default workloads). So to recap, we need to:

  1. Create the container image, via a Containerfile, that we will run in the OpenShift cluster to send these metrics somewhere.
  2. Create the CI workflow that will publish this container image to our GHCR.io and Quay.io registries.
  3. Move the metrics cronjob that you implemented to the correct path. I believe this should be the QA manifests deployment path: since we deploy production ourselves and do not expect users to do this themselves, we don't need to collect deployment metrics there, and those metrics will be built into the TS code (see issue). However, a strong argument could also be made for moving these to the base manifests so they get applied in both prod and QA, even if it is a little overkill.
  4. (OPTIONAL) Create a ServiceAccount for the job, with accurately scoped RBAC. On OpenShift, there are some settings you have to check to confirm that the cluster has telemetry enabled; otherwise it's illegal to send those metrics. These checks plus RBAC have been configured in securesign's segment backup job, which is why I suggested going in this direction, but any implementation that checks for those values is fine by me.
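
For illustration only, steps 3 and 4 above could look roughly like the sketch below. Everything in it is an assumption rather than something taken from this PR: the `metrics-job` and `metrics-job-reader` names, the `metrics-example` namespace, the schedule, and the Role rules are hypothetical, and the image and command are the same placeholders as in the snippet above, since the image does not exist yet. The OpenShift telemetry check mentioned in step 4 is not shown.

```yaml
# Hypothetical sketch: all names, the namespace, and the schedule are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-job
  namespace: metrics-example
---
# Role granting only the read access the job is assumed to need.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: metrics-job-reader
  namespace: metrics-example
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list"]
---
# Bind the Role to the dedicated ServiceAccount instead of the default SA.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-job-reader
  namespace: metrics-example
subjects:
  - kind: ServiceAccount
    name: metrics-job
    namespace: metrics-example
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: metrics-job-reader
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: metrics-job
  namespace: metrics-example
spec:
  schedule: "0 3 * * *"        # once a day at 03:00; placeholder value
  concurrencyPolicy: Forbid    # do not start a run while the previous one is active
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: metrics-job
          restartPolicy: OnFailure
          containers:
            - name: metrics
              image: your-image:tag   # placeholder until the image from step 1 exists
              command: ["/bin/sh", "-c", "your-metrics-command"]
```

Using a namespaced Role and RoleBinding keeps the job's permissions limited to one namespace; a ClusterRole and ClusterRoleBinding would only be needed if the job had to read cluster-scoped resources.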

I know there is a lot here, but this is meant as background context to help. We can start piece by piece, and perhaps I should move this to an epic and break it down into more granular pieces. I will consult @vishnoianil and @nerdalert on scoping this accurately

@Gregory-Pereira
Collaborator

Actually, let's hold off on the segment implementation. I would like the metrics from deployment and the metrics from the UI code (#281) to go to the same place. I'll talk with Oleg and figure out which service we want to go with.

@Gregory-Pereira
Collaborator

I'm creating the base metrics deployment in #391, so we can pick this up as soon as that merges.

@Gregory-Pereira
Collaborator

Hi @samarpita-bhaumik, I am sorry to say that the community has decided not to track instances of self-deploying the stack, as it feels like that is more of a product-related concern than an upstream one (feel free to read my full description on the issue thread). As such, I don't think we're going to implement the k8s job manifests.

Instead, we're going to focus on just tracking metrics for our QA and Prod deployments. If you still want to be involved with the metrics, we are looking for help with #433, where we are going to first gather a list of all the metrics we want to track and then implement whatever doesn't come out of the box in the TypeScript codebase. Let me know if you are interested so I can start breaking it out into smaller feature issues.

@samarpita-bhaumik
Contributor Author

@Gregory-Pereira yes I am interested.

2 participants