Azure Load Testing in Staging #1702

Open
4 tasks
halprin opened this issue Jan 14, 2025 · 1 comment
Labels
devex/opex A development excellence or operational excellence backlog item.

Comments


halprin commented Jan 14, 2025

DevEx/OpEx

Implement Azure load testing in our Staging environment.

We have already implemented Azure load testing in our Internal environment. We want to duplicate this in Staging. You can read how to do this in the README.md. The CDC subscription has already opted into the Locust load test preview.

This builds upon #1122.

Tasks

Additional Context

These load tests seem to cause the service to generate 499 HTTP status codes. As best as I can tell, that status code means that when TI finished processing the HTTP request and was about to send back the HTTP response, it realized that Locust had already closed the connection. So, even if TI was going to send back a 200 HTTP status, TI couldn't send that HTTP response and therefore the HTTP status is registered as a 499 in Azure's system. Locust reports zero 4xx HTTP statuses on its end.

Is this a problem? I'm leaning towards no because Locust thinks everything is fine, and I don't think it's a problem that Locust decides to end an HTTP request early.

There's a task in this ticket to look into this some more.

halprin added the devex/opex label Jan 14, 2025

halprin commented Jan 14, 2025

Apparently we are blocked from creating load tests in the CDC domain (where the Staging environment is located). I tried using my -SU account. It looks like there's a policy deny on the entire subscription.

I tried load testing Staging from our Flexion domain, but I get a 403 HTTP status on everything, the same as when I try to interact with it via Postman.

We could put an allow in the "firewall" of our web app the same way we allow ourselves when we need to triage something. I looked up the IP CIDRs of Azure, searched for load testing, and I'm presented with 24 separate CIDR blocks. I don't think we really want to manually add (and then delete!) 24 CIDR blocks from the networking configuration of the web app.

At this point I see two options. There could be more! I just only see two at the moment.

  1. Script adding and removing the CIDR blocks as an allow in the networking configuration of our web app. This would be run right before and after a load test that is run from our Flexion domain.
  2. Talk to the powers that be about allowing load tests to be created in the CDC domain. I'm not hopeful that this option will work. I lost the fight last time a permission was randomly taken away, and our allies couldn't persuade the powers that be either.
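
For option 1, here is a rough sketch of what the script could look like, not a finished implementation. It builds `az webapp config access-restriction add` and `remove` commands, one per CIDR block. The resource group, web app name, rule-name prefix, base priority, and the example CIDRs are all placeholders I made up for illustration; the real 24 blocks would need to come from Azure's published IP ranges.

```python
import subprocess

# Hypothetical values: replace with the real resource group and web app.
RESOURCE_GROUP = "example-resource-group"
WEB_APP = "example-web-app"
# Assumed naming scheme so the rules are easy to find and delete later.
RULE_PREFIX = "load-test"


def rule_name(index):
    """Name of the access-restriction rule for the CIDR at this position."""
    return f"{RULE_PREFIX}-{index}"


def build_add_commands(cidrs, base_priority=300):
    """Return one `az` command (as an argument list) per CIDR block."""
    commands = []
    for index, cidr in enumerate(cidrs):
        commands.append([
            "az", "webapp", "config", "access-restriction", "add",
            "--resource-group", RESOURCE_GROUP,
            "--name", WEB_APP,
            "--rule-name", rule_name(index),
            "--action", "Allow",
            "--ip-address", cidr,
            # base_priority is an assumption; pick one that fits existing rules.
            "--priority", str(base_priority + index),
        ])
    return commands


def build_remove_commands(cidrs):
    """Return the commands that delete the rules created by build_add_commands."""
    return [
        [
            "az", "webapp", "config", "access-restriction", "remove",
            "--resource-group", RESOURCE_GROUP,
            "--name", WEB_APP,
            "--rule-name", rule_name(index),
        ]
        for index, _ in enumerate(cidrs)
    ]


def run(commands, dry_run=True):
    """Print the commands by default; only execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
```

The idea would be to call `run(build_add_commands(cidrs), dry_run=False)` right before the load test and `run(build_remove_commands(cidrs), dry_run=False)` right after it, with the same `cidrs` list both times so the rule names line up. Leaving `dry_run=True` just prints the commands for review.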
