Deploy Bootstrap and add Metrics #38

Merged
merged 6 commits into from
May 1, 2024

Conversation

Collaborator

@gregcusack gregcusack commented Apr 29, 2024

Summary of Changes

  1. Deploy the bootstrap validator in a Kubernetes pod
  2. Add services so other pods can contact the bootstrap validator
  3. Add metrics

Part of a series of PRs that will build out the monogon testing framework for deploying validator clusters on Kubernetes.

@gregcusack gregcusack force-pushed the deploy-bootstrap-rs-v2 branch from 9a55b94 to f6c17c9 Compare April 29, 2024 18:58
@gregcusack gregcusack requested review from yihau and joncinque April 30, 2024 16:06
@joncinque joncinque left a comment

Looks good! Just small things to address

```
cd scripts/
./init-metrics -c <database-name> <metrics-username>
# enter password when promted
```

nit typo

Suggested change
- # enter password when promted
+ # enter password when prompted

Definitely doesn't need to be done in this PR, but would it be easier for maintenance for this to be done through a separate little rust binary in this repo?

Collaborator Author

@gregcusack gregcusack May 1, 2024

ya i can do that. what is the benefit? Just less bash code? and since it doesn't change we can just use a binary?

I think I was wondering where we draw the line between RIIR and Keep It In Bash (KIIB??)... This could definitely stay as a bash script, which is why my comment was a question. Feel free to disregard!

Collaborator Author

ahh yes fair enough! i do like the idea of putting it in rust just to keep it somewhat uniform.

Comment on lines +507 to +513
let metrics_secret = match kub_controller.create_metrics_secret() {
    Ok(secret) => secret,
    Err(err) => {
        error!("Failed to create metrics secret! {err}");
        return;
    }
};

nit: if you have the main function call a function returning a Result<(), Box<dyn std::error::Error>> and print the error, most of these can be reduced with ?s, i.e.

Suggested change
- let metrics_secret = match kub_controller.create_metrics_secret() {
-     Ok(secret) => secret,
-     Err(err) => {
-         error!("Failed to create metrics secret! {err}");
-         return;
-     }
- };
+ let metrics_secret = kub_controller.create_metrics_secret()?;
+ info!("something good happened");
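
A minimal, std-only sketch of the pattern described in this comment: `main` delegates to a function that returns `Result<(), Box<dyn std::error::Error>>`, so each fallible step can use `?` and the error is printed in one place. The names `SecretError`, `create_metrics_secret` (as a free function), and `run` are hypothetical stand-ins and are not taken from the PR.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error type standing in for whatever the real call can fail with.
#[derive(Debug)]
struct SecretError(String);

impl fmt::Display for SecretError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to create metrics secret: {}", self.0)
    }
}

impl Error for SecretError {}

/// Hypothetical stand-in for `kub_controller.create_metrics_secret()`.
fn create_metrics_secret(password: &str) -> Result<String, SecretError> {
    if password.is_empty() {
        return Err(SecretError("empty password".to_string()));
    }
    Ok(format!("secret-for-{password}"))
}

/// All fallible steps live here, so each one can be chained with `?`.
/// `?` converts `SecretError` into `Box<dyn Error>` automatically.
fn run(password: &str) -> Result<(), Box<dyn Error>> {
    let metrics_secret = create_metrics_secret(password)?;
    println!("created metrics secret: {metrics_secret}");
    Ok(())
}

fn main() {
    // The single place where errors are reported.
    if let Err(err) = run("hunter2") {
        eprintln!("error: {err}");
        std::process::exit(1);
    }
}
```

With this shape, adding another fallible step to `run` costs one line instead of a seven-line `match`.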

"replica set: {} not ready...",
bootstrap_validator.replica_set_name()
);
std::thread::sleep(std::time::Duration::from_secs(1));

Not a huge deal here since you're not doing anything concurrently, but in async code, sleeping the whole thread means that thread can't pick up any other work. Prefer an async sleep with tokio::time::sleep instead: https://docs.rs/tokio/latest/tokio/time/fn.sleep.html

Comment on lines +621 to +629
while {
    match kub_controller
        .check_replica_set_ready(bootstrap_validator.replica_set_name().as_str())
        .await
    {
        Ok(ok) => !ok, // Continue the loop if replica set is not ready: Ok(false)
        Err(_) => panic!("Error occurred while checking replica set readiness"),
    }
} {

nit: As another point for making this function return a Result, with ? this can become much more readable:

Suggested change
- while {
-     match kub_controller
-         .check_replica_set_ready(bootstrap_validator.replica_set_name().as_str())
-         .await
-     {
-         Ok(ok) => !ok, // Continue the loop if replica set is not ready: Ok(false)
-         Err(_) => panic!("Error occurred while checking replica set readiness"),
-     }
- } {
+ while !kub_controller
+     .check_replica_set_ready(bootstrap_validator.replica_set_name().as_str())
+     .await? {
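
To illustrate the loop shape suggested here, the following is a synchronous, std-only sketch. `FakeCluster` and `wait_until_ready` are hypothetical names invented for the example; the real code is async and talks to Kubernetes, which this sketch does not attempt to reproduce.

```rust
use std::error::Error;
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for the controller: reports ready on the
/// `ready_after`-th poll, or fails every poll when `fail` is set.
struct FakeCluster {
    polls: u32,
    ready_after: u32,
    fail: bool,
}

impl FakeCluster {
    fn is_replica_set_ready(&mut self, _name: &str) -> Result<bool, Box<dyn Error>> {
        if self.fail {
            return Err("cluster unreachable".into());
        }
        self.polls += 1;
        Ok(self.polls >= self.ready_after)
    }
}

/// Polls until the replica set is ready. `?` propagates any error to the
/// caller instead of panicking inside the loop condition.
fn wait_until_ready(cluster: &mut FakeCluster, name: &str) -> Result<u32, Box<dyn Error>> {
    while !cluster.is_replica_set_ready(name)? {
        println!("replica set: {name} not ready...");
        sleep(Duration::from_millis(10));
    }
    Ok(cluster.polls)
}

fn main() -> Result<(), Box<dyn Error>> {
    let mut cluster = FakeCluster { polls: 0, ready_after: 3, fail: false };
    let polls = wait_until_ready(&mut cluster, "bootstrap-validator")?;
    println!("ready after {polls} polls");
    Ok(())
}
```

The loop condition reads as a plain boolean, and the error path is decided by the caller rather than by a `panic!` buried in the condition.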

Comment on lines +267 to +270
pub async fn check_replica_set_ready(
    &self,
    replica_set_name: &str,
) -> Result<bool, kube::Error> {

micro-nit: the word "check" doesn't make it clear what this returns. How about renaming it to is_replica_set_ready? Alternatively, it could return an enum with Ready and NotReady as the states, which forces callers to match on it.
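
A rough sketch of the enum alternative mentioned above. The variant names `Ready` and `NotReady` come from the comment; the type name `ReplicaSetState`, the function `replica_set_state`, and the rule used to decide readiness (available replicas compared with desired replicas) are assumptions made for illustration, not the PR's actual logic.

```rust
/// Hypothetical readiness state; variant names follow the review comment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReplicaSetState {
    Ready,
    NotReady,
}

/// Assumed readiness rule: at least one replica is desired and all desired
/// replicas are available. The real check may differ.
fn replica_set_state(available: u32, desired: u32) -> ReplicaSetState {
    if desired > 0 && available >= desired {
        ReplicaSetState::Ready
    } else {
        ReplicaSetState::NotReady
    }
}

fn main() {
    // Callers have to name both states, so the meaning of the result is
    // visible at the call site, unlike a bare `bool`.
    match replica_set_state(1, 1) {
        ReplicaSetState::Ready => println!("replica set is ready"),
        ReplicaSetState::NotReady => println!("replica set is not ready yet"),
    }
}
```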

@gregcusack gregcusack merged commit da24112 into anza-xyz:main May 1, 2024
1 check passed
gregcusack added a commit to gregcusack/validator-lab that referenced this pull request May 1, 2024
gregcusack added a commit that referenced this pull request May 6, 2024
* address jon nits from #38

* chido comment: refactor known validator from: #10

* rewrite init-metrics.sh in rust

* Create non-bootstrap, voting validator accounts

* build and push validator docker image

* create and deploy validator secret

* add validator selectors

* create validator replica sets. need shred_version

* add in get shred version from genesis

* deploy validator replica set

* deploy validator service

* refactor buildtype skip. will skip release channel pull/extract as well
@gregcusack gregcusack deleted the deploy-bootstrap-rs-v2 branch May 24, 2024 19:06