Linux VM compiled from git commit 0d7eba4a or later fails on fetching updates from source.squeak.org #696
Comments
Hi Dave, I found and fixed a bad regression last week in eem.3471. What's the version you're using that fails? (output of squeak -version)
_,,,^..^,,,_ (phone)
On Nov 26, 2024, at 2:27 PM, David T Lewis ***@***.***> wrote:
Last good commit: 1af9a9b (HEAD) CogVM source as per VMMaker.oscog-eem.3424
First bad commit: 0d7eba4 CogVM source as per VMMaker.oscog-eem.3444
Steps to reproduce:
Compile Linux VM from 0d7eba4 or later
Run Squeak trunk (updated to latest with Monticello-dtl.813), fetch updates from any repository
Result: ConnectionClosed: Connection closed while waiting for data.
Note: There are no intervening commits between 1af9a9b and 0d7eba4, so the issue is presumed related to VMMaker changes rather than platform code changes.
Hi Eliot, my locally compiled VM from the latest git pull still has the issue. Version info is:
Virtual Machine: /usr/local/lib/squeak/5.0-202411252058-64bit/squeak
Image: /home/lewis/squeak/Squeak6.0/squeak.13.image
Here is a summary of additional test results:
Symptoms:
The SocketStream tests have many failures and timeouts.
The Socket tests have failures and also crashed the VM.
Opening a repository on source.squeak.org fails.
Updating Squeak from the update stream fails.
The issue is apparently related to both the compiler and Slang code generation. With a compiler that exposes the problem, the issue appears first in commit 0d7eba4 "CogVM source as per VMMaker.oscog-eem.3444," and the symptoms do not appear to change in any later commits. The last good commit prior to that was 1af9a9b (HEAD) "CogVM source as per VMMaker.oscog-eem.3424", and the differences between these appear to be primarily related to Slang code generation.
I retested this on a much older Linux computer (thankfully rescued just in time from the recycle bin), and the issue does NOT appear there. I also have confirmation from Bruce O'Neel that he has been doing opensmalltalk-vm builds on Linux and has not seen any of the issues reported here.
Finally, I tried changing the gcc optimization level from -O2 to -O0, and this makes the problem go away.
The system I am using has an AMD processor and the following version information:
$ cat /proc/version
$ gcc --version
$ spur64 -version
More to follow. With the above information I hope to be able to track something down in gcc.
I compiled the Cog HEAD revision (squeak.cog.spur) on a legacy macOS 12.7.6 and got the same behavior: it is impossible to connect through SSL. If the compiler optimization level makes a difference, then it is most probably a sign that the generated code invokes undefined behavior (UB).
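For readers unfamiliar with why optimization level and UB go together, here is a minimal sketch. It is not taken from the generated VM sources, and the function names are made up for illustration. The first function depends on signed overflow, which the C standard leaves undefined, so a compiler is free to treat the comparison as always true when optimizing; the second expresses the same intent without overflow.

```c
#include <limits.h>

/* Illustrative only (hypothetical name, not from the VM sources).
   x + 1 overflows when x == INT_MAX, which is undefined behavior, so an
   optimizing compiler may assume the comparison is always true, while an
   unoptimized build may actually wrap and return 0. */
int next_is_larger(int x)
{
    return x + 1 > x;
}

/* Well-defined alternative: compare against the limit instead of adding. */
int can_increment(int x)
{
    return x < INT_MAX;
}
```

Code of the first kind can appear to work at -O0 and change behavior at -O1 or -O2, which would match the pattern reported in this thread, although that alone does not prove this is the cause here.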
Recognizing that the issue is apparently related to C undefined behavior, and also associated with CCodeGenerator code generation changes, I used a VMMaker image to generate the code for VMMaker versions from VMMaker.oscog-eem.3424 through VMMaker.oscog-eem.3444. I can confirm that the issue is introduced in VMMaker.oscog-eem.3444. Source generated from VMMaker.oscog-eem.3443 (into ./src/spur64.cog/) does not exhibit the issue, and code generated from VMMaker.oscog-eem.3444 exhibits the issue (in both cases on my system with -O2 compiler optimization).
So the issue is introduced in VMMaker.oscog-eem.3444, 23-Aug-2024 "Rewrite the Slang transpiler's parse tree and inliner". This is a large VMMaker commit so we are still looking for a needle in a haystack, but I think the haystack may be a bit smaller now. I note that Eliot specifically asked for review and criticism in that commit, so please consider this as a much belated review :-)
@nicolas-cellier-aka-nice can you please say which compiler (and compiler version) you have on your legacy macOS 12.7.6? I am not familiar with the Mac environment, but a compiler bug is not out of the question. I have gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 on my system, and the bug is present when compiling with -O1 or higher with generated VM sources from VMMaker.oscog-eem.3444 and above. But other compilers (including those used for our GitHub actions builds) do not show any problem at all.
Additional information, working with a (possibly bad?) gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 compiler:
The issue is introduced in VMMaker.oscog-eem.3444.
Later fixes in VMMaker have no effect on the observed symptoms.
The problem goes away with gcc optimization turned off (-O0).
The problem is present in both the stack VM and the Cog VM.
The issue is only in the main VM module lib/squeak/5.0-202408232148-64bit/squeak, as opposed to the plugins and VM modules. I confirmed this by compiling with almost all plugins external, and copying individual compiled files into the last known good build in lib/squeak/5.0-202407312233-64bit/. No other files (including SocketPlugin.so) cause a problem; only the main VM module is at issue.
The VMMaker.oscog-eem.3444 generated sources (in commit 0d7eba4) produce over 90 additional compiler warnings in the ./vm build, mainly associated with function pointer assignments. After hand editing the generated source files to address the warnings, the problem still exists, so I see no evidence that these warnings are pointing to C undefined behavior issues.
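The actual warnings are not reproduced in this thread, so the following is only a sketch of the general class of function pointer assignment warning and one way such a warning can be addressed by hand. All names are hypothetical and nothing here is copied from the generated sources. Calling a function through a pointer of an incompatible type is undefined in C, which is why warnings of this kind are worth ruling out, as was done above.

```c
/* Hypothetical names; illustrates the general class of warning only. */
typedef long (*primitive_fn)(void);

static int answer(void) { return 42; }

/* Assigning `answer` directly would draw an incompatible-pointer-types
   diagnostic, because int (*)(void) is not long (*)(void):

       primitive_fn bad = answer;
*/

/* A wrapper with exactly the expected signature avoids the mismatch. */
static long answer_as_long(void) { return answer(); }

/* Returns a correctly typed pointer to the wrapper. */
primitive_fn lookup_answer(void) { return answer_as_long; }

/* Calls through the pointer; well defined because the types match. */
long call_primitive(primitive_fn fn) { return fn(); }
```

An explicit cast would also silence the compiler, but it leaves the mismatched call in place, so a wrapper or a corrected declaration is usually the safer edit.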
I used the native makefile for macOS, which I think relies on CC=clang as defined in ./building/macos64x64/common/Makefile.rules
For me, the problem disappeared with commit cfd1161, based on VMMaker.oscog-eem.3475. Since almost every operation on signed integers is potentially subject to undefined behavior, the C compiler will not warn you about it except in the most suspicious cases. One possibility is to instrument the generated code to detect UB at run time, at least with clang.
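As a rough sketch of what run-time instrumentation looks like, the example below uses a made-up tagging helper (the names and the 3-bit shift are assumptions for illustration, not taken from the VM sources). Built with clang's or gcc's -fsanitize=undefined option, the first function is reported at run time when it is given a negative argument, because left-shifting a negative signed value is undefined in ISO C; the second computes the same bit pattern on the unsigned type.

```c
#include <stdint.h>

/* Hypothetical tagging helper, not from the VM sources.
   For negative `value` the left shift is undefined behavior, and the
   sanitizer's shift check reports it when the function is executed. */
intptr_t tag_signed(intptr_t value)
{
    return (value << 3) | 1;
}

/* Same bit pattern computed on the unsigned type, where the shift is
   well defined. Converting back to intptr_t is implementation-defined,
   and gcc and clang both wrap modulo 2^N. */
intptr_t tag_unsigned(intptr_t value)
{
    return (intptr_t)(((uintptr_t)value << 3) | 1u);
}

/* One possible build line (file name is a placeholder):
     clang -O2 -g -fsanitize=undefined -fno-sanitize-recover=undefined file.c
   With -fno-sanitize-recover the program stops at the first report. */
```

The sanitizer only reports code paths that are actually executed, so the failing scenario (for example fetching updates from a repository) has to be run under the instrumented VM for a report to appear.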