CP/DP Split: write configuration to agent #2999
base: change/control-data-plane-split
Conversation
This commit adds functionality to send nginx configuration to the agent. It also adds support for scaling the single nginx Deployment and sending configuration to all replicas. This requires tracking all Subscriptions for a particular deployment and receiving all responses from those replicas to determine the status to write to the Gateway. Right now we do not watch for Pod creation events for the nginx Deployment, so when a Pod first starts up, it will not receive any configuration. Only when the config changes will the nginx Pods get an update.
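For illustration only, here is a rough sketch of what per-deployment Subscription tracking and response aggregation could look like; `DeploymentStore`, `Subscription`, `Register`, and `AggregateStatus` are hypothetical names, not the actual types in this PR:

```go
package agent

import (
	"sync"

	"k8s.io/apimachinery/pkg/types"
)

// Subscription represents one connected nginx replica (agent).
type Subscription struct {
	PodName  string
	Response chan error // result of applying a config to this replica
}

// DeploymentStore maps an nginx Deployment to its connected Subscriptions.
type DeploymentStore struct {
	mu   sync.Mutex
	subs map[types.NamespacedName][]*Subscription
}

func NewDeploymentStore() *DeploymentStore {
	return &DeploymentStore{subs: make(map[types.NamespacedName][]*Subscription)}
}

// Register tracks a Subscription for the given nginx Deployment.
func (s *DeploymentStore) Register(dep types.NamespacedName, sub *Subscription) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.subs[dep] = append(s.subs[dep], sub)
}

// AggregateStatus collects a response from every replica of the Deployment;
// the Gateway is only considered Programmed if all replicas applied the
// config successfully.
func (s *DeploymentStore) AggregateStatus(dep types.NamespacedName) error {
	s.mu.Lock()
	subs := s.subs[dep]
	s.mu.Unlock()

	for _, sub := range subs {
		if err := <-sub.Response; err != nil {
			return err // surface the first failure in the Gateway status
		}
	}
	return nil
}
```

In this shape, the Gateway's Programmed condition would only become true once every connected replica reports a successful apply.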
return fmt.Errorf("failed to register nginx deployment %q", nsName.Name)
}

// TODO(sberman): wait to send config until Deployment pods have all connected.
Why do we need to wait for all pods to connect? Can't we just send the config to all the pods that are running?
Yeah that's true
If no pods are running/connected, but we have a deployment, then we should wait until they're connected (or timeout).
This feels backwards to me. An agent Subscribe/Connect (or combo, I can't remember) should trigger a config apply. We know exactly when an agent is ready to receive config, so we shouldn't have to wait for it. I think building in waits is going to lead to issues down the line.
What if building and storing the config were done separately from updating the config? When a new agent subscribes, we grab the latest config stored for that deployment and send it to the agent. When a new config is built, we send a message on a channel saying "Config updated" and the listeners grab the config and push it to the agents. Status can be handled completely async.
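A minimal sketch of that decoupled shape, assuming a hypothetical `ConfigStore` (none of these names exist in the PR); listeners pick up the latest stored config whenever the handler signals an update, and a new subscriber gets the current config immediately on connect:

```go
package agent

import "sync"

// ConfigStore decouples building a config from pushing it: the handler stores
// the latest config and signals every listener, and each connected agent's
// listener grabs the stored config and applies it.
type ConfigStore struct {
	mu        sync.Mutex
	latest    []byte
	listeners []chan struct{} // one signal channel per connected agent
}

// Subscribe registers a newly connected agent and returns its signal channel
// plus the current config, so the config can be applied immediately on connect.
func (c *ConfigStore) Subscribe() (<-chan struct{}, []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()

	ch := make(chan struct{}, 1)
	c.listeners = append(c.listeners, ch)
	return ch, c.latest
}

// SetLatest stores a newly built config and signals each listener without blocking.
func (c *ConfigStore) SetLatest(cfg []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()

	c.latest = cfg
	for _, ch := range c.listeners {
		select {
		case ch <- struct{}{}: // "config updated"
		default: // a signal is already pending; the listener will read the latest anyway
		}
	}
}

// Latest returns the most recently built config.
func (c *ConfigStore) Latest() []byte {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.latest
}
```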
I sort of had this built in originally, but if we get a connection and then grab the latest config, how does status get applied based on that outcome? The handler that processes a batch, builds, and sets status isn't involved in this application of the config.
I think I'm just hung up on this approach because it's the route I took when I was working with agent v2: 4c090c6#diff-d4bd8c3fac26361875c036e64559f3d3a96b95cd03729ccdd1cd6fe675277b81.
However, there are undoubtedly other approaches, and my approach might not be the best one.
I think as long as we can avoid the problem you stated, where we wait forever for a pod to come up, then it doesn't matter which approach we take.
The status updater requires the Graph, which only the handler has. Unless we're now storing the graph on the Deployment object as well?
Most of the status can be generated before we apply the configuration. So we could generate the status with the graph in the handler and then store it alongside the config. Once we apply the config, we can update the status to reflect the outcome of the config apply. Just an idea; there's a rough sketch of it below.
In my original implementation, the Subscriber got the current config from the DeploymentStore (no need to contact the broadcaster) and then once it applied it, it started its listen loop. I think this works, but there's still the pain point of the Subscriber having to update statuses. It would need the Graph and it would have to ensure it doesn't overwrite any previous error statuses that may have occurred.
I think we can solve the status problem with my suggestion above, but I haven't looked into it too deeply.
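A rough sketch of the "store the status alongside the config" idea; `StoredConfig`, `GatewayStatus`, and `Finalize` are made-up illustrations, not the project's real types:

```go
package agent

// GatewayStatus stands in for whatever conditions the status updater writes.
type GatewayStatus struct {
	Programmed bool
	Message    string
}

// StoredConfig is built by the handler: it derives as much status as it can
// from the Graph at build time and stores it next to the nginx config, so the
// Subscriptions never need the Graph themselves.
type StoredConfig struct {
	NginxConf []byte
	// Conditions derived from the Graph before the config is applied to any
	// replica (everything except the apply outcome).
	PendingStatus GatewayStatus
}

// Finalize merges the outcome of the config apply into the pre-computed status.
func (s StoredConfig) Finalize(applyErr error) GatewayStatus {
	status := s.PendingStatus
	if applyErr != nil {
		status.Programmed = false
		status.Message = applyErr.Error()
	}
	return status
}
```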
> Most of the status can be generated before we apply the configuration. So we could generate the status with the graph in the handler and then store it alongside the config. Once we apply the config, we can update the status to reflect the outcome of the config apply. Just an idea.
Yeah that's an interesting thought...
I think if the Subscriber writes status, it's going to fall into the same trap as the leader election problem, where multiple entities want to write status.
I like having the subscriber grab the current config when it first connects and applying it. But I can't nail down how to write that status... one idea is that the handler has a goroutine that listens for Subscribers to send it a status to write. It would need to keep track of each Subscription's status and get the Gateway's current status to ensure that it doesn't overwrite any errors that were written by other Subscriptions.
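One possible shape for that handler goroutine, sketched with hypothetical names (`applyResult`, `runStatusWriter`, `writeGatewayStatus`); the point is that a single goroutine owns the status write and merges per-replica results so an error from one Subscription isn't overwritten by a later success from another:

```go
package agent

// applyResult is a hypothetical message a Subscription sends after trying to
// apply a config to its nginx replica.
type applyResult struct {
	podName string
	err     error
}

// eventHandler stands in for the existing handler; only the status-writer
// loop is sketched here.
type eventHandler struct{}

// writeGatewayStatus is a placeholder for the single place that updates the
// Gateway's Programmed condition.
func (h *eventHandler) writeGatewayStatus(err error) {}

// runStatusWriter is the one goroutine that writes Gateway status. It keeps
// the latest result from each replica and only reports Programmed when every
// replica has applied the config successfully.
func (h *eventHandler) runStatusWriter(results <-chan applyResult) {
	perPod := make(map[string]error) // latest result from each replica

	for res := range results {
		perPod[res.podName] = res.err

		var firstErr error
		for _, err := range perPod {
			if err != nil {
				firstErr = err
				break
			}
		}
		h.writeGatewayStatus(firstErr)
	}
}
```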
> I think if the Subscriber writes status, it's going to fall into the same trap as the leader election problem, where multiple entities want to write status.
Agreed, you need one thread writing status that reads off a channel.
> It would need to keep track of each Subscription's status and get the Gateway's current status to ensure that it doesn't overwrite any errors that were written by other Subscriptions.
This is the tricky bit. Is the Gateway the only resource we need to worry about overwriting status on?
> This is the tricky bit. Is the Gateway the only resource we need to worry about overwriting status on?
Yes, I believe so. Just the Programmed condition, for both the Gateway and Listener conditions.
Testing: Created Gateway API resources and saw the configuration applied to the nginx deployment. Scaled nginx, made more changes, and saw those applied as well.
Closes #2842