[Bug]: Upgrading to 1.4.1 out of memory issue #2902
Comments
Hey @styoo4001, I'm really puzzled as to what could be causing that. Would it be possible for you to enable logging and send the logs to us?
Specifically, we'd be interested in the logs from the timeframe of the memory spike.
Alternatively, running the https://github.com/DataDog/ddprof native profiler and sharing the memory profiles with us could also be very interesting!
@bwoebi
A new version has already come out, so I'm not sure if this log will help, but I've attached it.
The new version does not fix memory growth issues, though. Sadly, there's nothing suspicious in these logs. Would you be willing to run the native profiler? I'm pretty certain it will help us more. Please also tell us which service and at what time.
We are seeing the same since we updated from 1.3.x to 1.4.x. The latest 1.5.x does not solve it either. Our pods started to get OOM killed after a while, which was very clearly visible after the upgrade.
@styoo4001 are you maybe calling […]? I am not sure right now why we added it, but I remember there was another memory leak in the past where calling reset would fix it.
I'm currently a bit lost with this, as we haven't been seeing unbounded memory growth in our test applications and environments. @TheLevti I'm not sure whether I read your message right: if you remove the […]
We have many long-running processes/jobs, so we had to disable auto flush and automatic root span creation. In the past this caused a slow OOM death, as we initially could not figure out where the leak was coming from. After running some tracing, we discovered that the dd-trace library was leaking memory in those long-running jobs. For us the leak was fixed by calling […]. Now, with the new versions (1.4.1+), we started to get a new memory leak in all our long-running processes once we upgraded from 1.3.x. After some debugging we discovered that removing […]. Besides that, documentation is missing for the reset() method and what effects it has.
We might have messed something up with the disabling and enabling. I'll check whether I can reproduce it or spot something weird in 1.4. Thank you for following up.
Thank you! Just as a note, this might not be related to the author's issue, as I am not sure whether he also used ->reset(). Removing ->reset() resolved the leak for us, and right now we have no issue.
Bug report
Hello,
Our company has been utilizing DataDog effectively, and we are running services in a Kubernetes environment.
Recently, after deploying a specific API server feature, approximately 30 minutes later, CPU usage significantly spiked, leading to an increase in pod scaling and a sharp rise in memory consumption, causing some pods to crash (out of memory). Notably, the php-fpm process did not exhibit any abnormal behavior.
After rolling back to last week's ArgoCD deployment image, the system stabilized. Initially, we assumed the issue was related to the deployed code, and we thoroughly investigated various aspects. However, we couldn't find any evidence in APM or DataDog profiler. Crucially, the profiler showed that the application's memory usage was below 1GB, but under certain conditions, pod memory usage spiked from 1GB to 8GB within a minute.
Eventually, we discovered that the issue stemmed from the trace version being set to latest when building the deployment image. After changing it to last week's version (1.3.1) and redeploying, the system stabilized.
We confirmed that the issue was resolved following this change.
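The gap described above, where PHP-level memory stays low while pod memory climbs, can be watched from inside a worker by logging both figures side by side. The following is a minimal sketch, not part of the original report: it assumes a Linux container (it reads /proc/self/status), and the function names `parseRssBytes` and `memorySnapshot` are hypothetical. `memory_get_usage()` only counts memory obtained through PHP's own memory manager, so memory allocated natively by an extension could appear in the RSS figure without appearing in the PHP heap figure.

```php
<?php
declare(strict_types=1);

/**
 * Extract the resident set size (VmRSS) in bytes from the text of
 * /proc/<pid>/status. Returns null when the field is absent.
 * (Hypothetical helper, for illustration only.)
 */
function parseRssBytes(string $status): ?int
{
    // The kernel reports the value in kB, e.g. "VmRSS:\t    2048 kB".
    if (preg_match('/^VmRSS:\s+(\d+)\s+kB$/m', $status, $m) === 1) {
        return (int) $m[1] * 1024;
    }
    return null;
}

/**
 * Snapshot both views of memory for the current process.
 * Linux-only assumption: /proc/self/status does not exist elsewhere,
 * in which case rss_bytes is null.
 */
function memorySnapshot(): array
{
    $status = @file_get_contents('/proc/self/status');

    return [
        // Memory reserved from the system by PHP's memory manager.
        'php_heap_bytes' => memory_get_usage(true),
        // Memory the kernel attributes to the whole process.
        'rss_bytes'      => $status === false ? null : parseRssBytes($status),
    ];
}

// Hypothetical usage: call this periodically in a long-running job
// or at the end of a request, and compare the two numbers over time.
$snap = memorySnapshot();
error_log(sprintf(
    'php_heap=%.1f MiB rss=%s',
    $snap['php_heap_bytes'] / 1048576,
    $snap['rss_bytes'] === null
        ? 'n/a'
        : sprintf('%.1f MiB', $snap['rss_bytes'] / 1048576)
));
```

If the RSS figure keeps growing while the PHP heap figure stays flat, that would point toward allocations outside PHP's memory manager, which is where a native profiler is more useful than PHP-level tooling.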
[Image: left: usual condition; right: unusual condition]
My php settings are:
PHP version: 8.1.13
Tracer or profiler version: 1.4.1
Installed extensions: No response
Output of phpinfo(): No response
Upgrading from: 1.3.1