Spring Boot v3.4.0 causes our staging & production environment to hang and time out #43332
Comments
I think the critical information here is the state of the JVM threads. This would let us know what's preventing the app from serving requests. Can you capture this information and let us know?
Hi @bclozel, Thank you for responding as quickly as you did. I created an endpoint that would return a thread dump like this:
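The snippet itself is not part of this page. As a rough illustration only, a helper along these lines could produce the text such an endpoint returns. It uses the JDK's `ThreadMXBean`; the class name `ThreadDumps` and the output layout are assumptions, not the code actually used in this issue, and the Spring controller that would expose it is omitted.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: renders all live JVM threads as plain text.
final class ThreadDumps {
    private ThreadDumps() {}

    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details when the JVM supports them,
        // otherwise dumpAllThreads throws UnsupportedOperationException.
        boolean monitors = bean.isObjectMonitorUsageSupported();
        boolean synchronizers = bean.isSynchronizerUsageSupported();

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                // The lock this thread is blocked or waiting on, if any.
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            // Print the full stack; ThreadInfo.toString() truncates it.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

A controller method could simply return `ThreadDumps.dump()` as `text/plain`. Note that such an endpoint only helps while the server can still answer requests.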
Immediately after the application started on Cloud Run, I was able to collect the first thread dump here:

I can hit the unauthenticated endpoint "/" an unlimited number of times without any issues. After I authenticate with Spring Authorization Server 1.4.0 and try to hit a secured endpoint, the server becomes completely unresponsive: I can no longer capture a thread dump, and Cloud Run times out the connection after 300 seconds. At this point, the endpoint "/" no longer responds, whether I hit it authenticated or not.

It looks like this applies to our other Spring Boot instances as well: the moment I am authenticated with SAS and then try to call any endpoint... 💥☠️ the server hangs.

There is nothing in any of the instance logs, including SAS, that shows any errors, looping, or anything else out of the ordinary. Here's a screenshot of what all the instance logs look like afterwards:

I don't see anything out of the ordinary in the SAS v1.4.0 release notes in comparison to the SAS version Spring Boot 3.3.6 depends on:

I also don't see anything that should concern us in the latest Spring Data MongoDB release:

The unresponsive app still logs MongoDB pings, as you can see in this screenshot [as if everything is hunky-dory]:

Let me know how else I can help here. It's not easy capturing a thread dump with Cloud Run when the app is in this, or any, state.
It looks like something is blocking threads, or there is a memory leak or infinite recursion. The first thread capture doesn’t show anything related to Spring. Maybe you are using Java agents or instrumentation libraries that are not compatible with the latest Spring version?
There is unfortunately no way to do a After your last remark, I am leaning towards Sentry being the culprit: spring-io/start.spring.io#1647. I will continue my investigation there. Close the issue at will. Cheers, Brian 🍻
I don't think we can track down the source of the problem without a snapshot of the Java threads taken while the app is having issues. This could come from any library on your classpath, any Java agent, or an incompatibility with a remote resource. I haven't seen anything so far pointing to Spring Boot causing issues; we can reopen this issue if we find new information. I'm not familiar enough with Google Cloud Run, but not being able to connect to the JVM in any way is quite limiting. Maybe there is a way to configure the instance to open a port and connect a profiler to the running JVM? Closing this issue for now.
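When neither an HTTP endpoint nor a remote connection to the JVM is available, one possible workaround is to have the application write its own thread snapshots to standard output on a schedule, so that the last dump before a hang ends up in the platform's logs. The following is a minimal sketch of that idea using only the JDK; `ThreadDumpWatchdog`, its method names, and the output format are hypothetical and were not proposed in this thread.

```java
import java.io.PrintStream;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical watchdog: periodically writes a thread dump to a stream.
final class ThreadDumpWatchdog {
    private ThreadDumpWatchdog() {}

    // Starts a daemon thread that writes a dump every periodMillis milliseconds.
    static ScheduledExecutorService start(PrintStream out, long periodMillis) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(task -> {
                Thread t = new Thread(task, "thread-dump-watchdog");
                t.setDaemon(true); // must not keep the JVM alive on shutdown
                return t;
            });
        scheduler.scheduleAtFixedRate(
            () -> writeDump(out), periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    // Writes one snapshot of all live threads and their stacks.
    static void writeDump(PrintStream out) {
        StringBuilder sb = new StringBuilder("=== thread dump ===\n");
        ThreadInfo[] threads =
            ManagementFactory.getThreadMXBean().dumpAllThreads(false, false);
        for (ThreadInfo info : threads) {
            sb.append('"').append(info.getThreadName()).append('"')
              .append(" state=").append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        out.print(sb);
        out.flush();
    }
}
```

Calling `ThreadDumpWatchdog.start(System.out, 30_000)` early in `main` would log a dump every 30 seconds. Because the watchdog runs on its own thread, it keeps reporting even if every request-handling thread is blocked, although it cannot help if the whole JVM is frozen.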
Our services work on all Spring Boot versions prior to v3.4.0 and have done so for years. v3.4.0 works in our dev environment, where we are unable to reproduce what is occurring in staging and production.
We have 4 Spring Boot apps running on GCP
Here is what happens once the deployment hits staging / production:
GCP metrics (response times go to 300 seconds and requests time out after the release)
MongoDB console (Activity decreases after release)
Cloud Run startup log that shows Spring Boot starting up error-free and then going directly to timing out
Maven dependency tree:
dependencies.txt.zip
The next step for us would be to turn up the logging level to see whether anything interesting shows up. I just spent a Saturday afternoon rolling back production and trying to figure out where the problem was coming from. At this moment, I am completely clueless.
Any help would be appreciated.
We deploy Spring Boot with
mvn spring-boot:build-image
and the plugin config looks like this:
Development environment:
macOS Sequoia
Atlas CLI
MongoDB 8
Java 23
Providers:
MongoDB Atlas
MongoDB 8
Google Cloud Platform / Cloud Run
Java 23