-
Notifications
You must be signed in to change notification settings - Fork 337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The last versions of Nginx Unit fails in the TechEmpower benchmark #794
Comments
Looking at the log it seems
Unit was started with some initial config.
After it was started it was reconfigured, then shutdown. All that happened within the space of a second. From your point of view, what was the actual sequence of events? |
The strange thing is that in local work. Finish here and start to verify the tests: When finish the tests and close the docker: But not in github actions or the bench server. Github actions log failing: |
It's failing from the last 2 versions of Nginx Unit. Here are the logs from an old run, using And an actual, using A recent run that half worked, using |
When I send the PR for Nginx Unit 1.28 was passing the github actions tests. TechEmpower/FrameworkBenchmarks#7691 Github actions log working: |
As discussed on the community slack https://nginxcommunity.slack.com/archives/C02SS9UB85C/p1670008983875989?thread_ts=1669810946.221959&cid=C02SS9UB85C This issue is related to #728 and friends. A manual patch of the Linking the external issue as well. TechEmpower/FrameworkBenchmarks#7757 (review) As said, we will ship a fix for the entrypoint script with the next Unit release 1.29. |
If work in the next run. I'll close the issue. |
I will close it once the fix was merged if that's okay for you |
Yes, please share the logs. The new images should have the new docker entrypoint script. |
It's using the patch in v1.28, and looks like the same issue. I'll update to v1.29 |
Failed in the last run, using v1.29. Sunit (scala) from @lolgab, also fail with v1.28 and the last commit with v1.29. The only framework using v1.28 and working is Fastapi (python): |
It's the same issue. It is not necessary to open a new one. |
Update, Fastapi (python) fail with v1.29 in the tests. |
This issue is different from the other once. It is not related to the control socket. Will have a look on it. |
This log has these lines in it:
Which seems like it's the same issue as outlined in #815 perhaps? |
Yes indeed! This is exactly this issue. We debugged the Python code last night. 1.29 is working with Python 3.10.8. anything newer than this will currently fail. We are working on a fix and I will let you know once available. Thanks for the links. |
Any progress on this issue? |
@joanhey Thanks. We will have a look into this. More details on Monday. |
My Scala Native benchmark using |
Work but with a lot of errors. Run logs: |
With unit v1.31.0 also fail in the GH Actions tests in the bench. As now also file in the test, it's easier to test it isolated to fix the problem. |
What is failing exactly? |
I mean the only errors I'm seeing for php-unit are a bunch of socket connection errors
What is |
The last one is only a GH actions test: I think the problem is that Unit needs too much time to initialize all the forks(processes|workers). |
Before Unit was working without problems in the benchmark. |
Are you saying that Unit itself is failing somehow due to trying to start X amount of processes? What is X in your case? Do you have a simple reproducer config? |
It's a benchmark that use 3 xeon servers, each with 24 cores. We test the same PHP code with different runtimes and servers. You can download and run the tests locally. Before pass the GH action tests and fail only in the bench. |
I'm just trying to understand exactly what it is that's failing... Let me try it another way.
|
I send it the link, but I pass it here: {
"listeners": {
"*:8080": {
"pass": "applications/benchmark"
}
},
"applications": {
"benchmark": {
"type": "php",
"processes": 84,
"user": "www-data",
"group": "www-data",
"root": "/php/",
"options": {
"file": "/php/deploy/conf/php.ini"
},
"limits": {
"requests": 10000000
}
}
}
} |
And when try to start: And it was working without problems, till Unit v1.29. |
OK, so the issue started with 1.29.0... If it was after that then I would maybe point to b0e2d9d ("Isolation: Switch to fork(2) & unshare(2) on Linux."), but it started failing before that landed. Looking at the changes between 1.28.0 and 1.29.0, there's nothing that really stands out here. There were a couple of PHP module updates, 48b6a7b ("PHP: Fixed php_module_startup() call for PHP 8.2.") and a032744 ("PHP: allowed to specify URLs without a trailing '/'.") The only obvious process related changes was the introduction of per application cgroup support, but you're not even using that so it should basically have no impact. The thing I find confusing about the logs is it always looks like unit is shutdown before the benchmark even runs. Does it work if you reduce the number of processes? As a quick test, I was able to start 84 PHP application processes in < 1 second on a 2 vCPU VM.
|
Today I have some free time, and I try to pinpoint this issue. The problem is that it needs too much time to start "Unit", and it isn't ready when start the verification test or any other CI command. Actual unit docker build process
The No
|
Yes, that matches my previous comment where you can see starting Unit with a load load of PHP processes took < 1 second... so I suspect the problem is not Unit itself but around the machinery to start it... I don't know what |
The The log for We can also run the tests in local, and fail because use too much time with ~+70 processes. |
See previous discussion: #1114 The entrypoint script works well for simple configurations/applications but does hurt container startup time. The reason for the For my own Unit-based Docker images I take a similar approach with COPY unitconf.json init.json
RUN nohup /bin/sh -c "unitd --no-daemon --pid init.pid --log /dev/stdout --control unix:init.unit.sock &" && \
curl -fsX PUT [email protected] --unix-socket init.unit.sock _/config && \
rm init.* I have a proposal for |
@joanhey A workaround you can try is to have "processes": {
"max": 84,
"spare": 1
} What do you think? |
Yes, I tried that without a solution. In local I moved the config to the entrypoint, and it's worst. I tried locally to update to PHP8.4, and it is NOT possible, fail completely. I tried to help, but I send issues for the last 2 years, without any solution or they don't want see the problem. |
Please anybody from Unit core developers, can update Unit to PHP/8.4 ?? PD: not with 10 processes !! |
Before Unit was faster than Nginx server. |
Yes, but fail with each try to update the benchmark. I tested 20 combinations, using the entrypoint, less processes, ... and never work. If anybody from the core want to try, here it's my last dockerfile try: FROM unit:php8.4
RUN docker-php-ext-install pdo_mysql opcache > /dev/null
WORKDIR /php
COPY --link . .
RUN if [ $(nproc) = 2 ]; then sed -i "s|\"spare\": 168,|\"spare\": 64,|g" /php/deploy/nginx-unit.json ; fi;
#RUN more /php/deploy/nginx-unit.json
COPY deploy/nginx-unit.json /docker-entrypoint.d/
#ENTRYPOINT [ ]
EXPOSE 8080
#CMD ["unitd", "--no-daemon", "--control", "unix:/var/run/control.unit.sock"] The last time with PHP8.3 I needed to configure first and bypass the entrypoint. |
Hi @joanhey
It's not that we don't want to see the problem... but the problem seems to be related to the whole cycle of starting unit/configuring it/stopping it/starting it again... That is certainly less than ideal, but can be easily avoided (by for example, and this is just one possibility, using a pre-populated state dir, yes, people will say that's not supported or whatever, but in reality it'll work most of the time). Looking back at some of the recent comments In #794 (comment) I showed that Unit has no problem starting many php application processes in < 1 second. Then in #794 (comment) you yourself say
So, don't use the entrypoint.sh thing which is known to have issues, seems to be the solution... |
Hi,
I added Nginx and Nginx Unit to the TechEmpower benchmark.
But in the last versions it's failing with PHP:
https://tfb-status.techempower.com/unzip/results.2022-11-25-12-16-16-174.zip/results/20221119184855/php-unit/run/php-unit.log
Old config without problems, and faster than Nginx:
https://github.com/TechEmpower/FrameworkBenchmarks/blob/R20/frameworks/PHP/php/php-unit.dockerfile
https://github.com/TechEmpower/FrameworkBenchmarks/blob/R20/frameworks/PHP/php/deploy/nginx-unit.json
Actual config with problems:
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/PHP/php/php-unit.dockerfile
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/PHP/php/deploy/nginx-unit.json
The last versions fail a lot.
Please, could anybody review and help with this problem.
Thank you.
PD: Welcome help with the Nginx config too, only with the plaintext test.
https://github.com/TechEmpower/FrameworkBenchmarks/blob/master/frameworks/C/nginx/nginx.conf
Be careful, as the bench changed a lot after the spectre/meldown patches.
TechEmpower/FrameworkBenchmarks#7321
If you need I'll send you the benchs results. And ask me for any doubt.
The text was updated successfully, but these errors were encountered: