-
Notifications
You must be signed in to change notification settings - Fork 261
[CI] LTP: Add MAP_PRIVATE patch to LTP sources and enable LTP in SGX CI #2187
base: master
Are you sure you want to change the base?
Conversation
f002c3c
to
95d34af
Compare
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
a discussion (no related file):
Really nice PR!
a discussion (no related file):
If you're interested, that's how this PR fails on Jenkins:
+ make SGX=1 ltp-sgx.xml
./runltp_xml.py -c ltp.cfg -c ltp-sgx.cfg -c ltp-bug-1075.cfg /home/jenkins/workspace/graphene-sgx-18.04-apps/LibOS/shim/test/ltp//install/runtest/syscalls -O ltp-sgx.xml
Cancelling nested steps due to timeout
Sending interrupt signal to process
Terminated
Makefile:68: recipe for target 'ltp-sgx.xml' failed
.ci/lib/stage-test-sgx.jenkinsfile, line 2 at r1 (raw file):
stage('test-sgx') { timeout(time: 15, unit: 'MINUTES') {
It looks like 15 minutes is not enough for our wimpy Jenkins machines (they are really-really slow 2-physical-cores SGX machines with 128MB of EPC).
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r1 (raw file):
cd LibOS/shim/test/ltp # To run ltp tests with GSGX and work around the mmap issue,
We don't use GSGX
in our repo, let's just change to SGX
LibOS/shim/test/ltp/ltp.cfg, line 2026 at r1 (raw file):
[sendfile06_64] skip = yes
Could you remove this redundant newline?
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 28 at r1 (raw file):
skip = yes [chmod03]
Why did you add this test, it seems to have worked before?
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 31 at r1 (raw file):
skip = yes [chmod04]
Why did you add this test, it seems to have worked before?
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 79 at r1 (raw file):
skip = yes [epoll_wait01]
Why did you add this test, it seems to have worked before?
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 82 at r1 (raw file):
skip = yes [epoll_wait03]
Why did you add this test, it seems to have worked before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
.ci/lib/stage-test-sgx.jenkinsfile, line 2 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
It looks like 15 minutes is not enough for our wimpy Jenkins machines (they are really-really slow 2-physical-cores SGX machines with 128MB of EPC).
I suggest to start with say 30 minutes, and then (if it succeeds), decrease in intervals of 5 minutes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @jinengandhi-intel)
.ci/lib/stage-test-sgx.jenkinsfile, line 2 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I suggest to start with say 30 minutes, and then (if it succeeds), decrease in intervals of 5 minutes.
At least on my machine it worked within 15mins, will increase it to 30 to be on the safe side.
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
We don't use
GSGX
in our repo, let's just change toSGX
Done
LibOS/shim/test/ltp/ltp.cfg, line 2026 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Could you remove this redundant newline?
Done
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 28 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you add this test, it seems to have worked before?
At least on our servers it has never worked with SGX:
SGX=1 ./pal_loader chmod03
WARNING: Using insecure argv source. Don't use this configuration in production!
Accessing file:etc/nsswitch.conf is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
Accessing file:/lib/x86_64-linux-gnu/libnss_files.so.2 is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
chmod03 1 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/chmod/chmod03.c:151: getpwnam failed: errno=EACCES(13): Permission denied
chmod03 2 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/chmod/chmod03.c:151: Remaining cases broken
But instead of adding to this file where we track the mmap failures, I have moved the skip to the ltp-sgx.cfg.
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 31 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you add this test, it seems to have worked before?
Same as above, the test has always failed for us.
SGX=1 ./pal_loader chmod04
WARNING: Using insecure argv source. Don't use this configuration in production!
Accessing file:etc/nsswitch.conf is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
Accessing file:/lib/x86_64-linux-gnu/libnss_files.so.2 is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
chmod04 1 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/tst_sig.c:234: unexpected signal SIGSEGV(11) received (pid = 1).
chmod04 2 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/tst_sig.c:234: Remaining cases broken
chmod04 0 TWARN : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/tst_tmpdir.c:319: tst_rmdir: TESTDIR was NULL; no removal attempted
Have moved the skip to ltp-sgx.cfg as this is not related to mmap issue.
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 79 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you add this test, it seems to have worked before?
This test originally failed with the mmap shared error:
SGX=1 ./pal_loader epoll_wait01
WARNING: Using insecure argv source. Don't use this configuration in production!
file_map does not currently support writable pass-through mappings on SGX. You may add the PAL_PROT_WRITECOPY (MAP_PRIVATE) flag to your file mapping to keep the writes inside the enclave but they won't be reflected outside of the enclave.
/home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/tst_test.c:108: TBROK: mmap((nil),4096,3,1,3,0) failed: EACCES (13)
After the fix, it fails with the below error which I need to analyze:
SGX=1 ./pal_loader epoll_wait01
WARNING: Using insecure argv source. Don't use this configuration in production!
/home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/tst_test.c:1250: TINFO: Timeout per run is 0h 05m 00s
/home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/lib/safe_macros.c:261: TBROK: /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/epoll_wait/epoll_wait01.c:43: read(3,0x3cbac80,45056) failed, returned 4096: EINVAL (22)
Again moving the skip statement to ltp-sgx.cfg.
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 82 at r1 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Why did you add this test, it seems to have worked before?
The test does work now but not all subtests pass:
SGX=1 ./pal_loader epoll_wait03
WARNING: Using insecure argv source. Don't use this configuration in production!
epoll_wait03 1 TPASS : epoll_wait() fails as expected: TEST_ERRNO=EBADF(9): Bad file descriptor
epoll_wait03 2 TPASS : epoll_wait() fails as expected: TEST_ERRNO=EINVAL(22): Invalid argument
epoll_wait03 3 TPASS : epoll_wait() fails as expected: TEST_ERRNO=EINVAL(22): Invalid argument
epoll_wait03 4 TPASS : epoll_wait() fails as expected: TEST_ERRNO=EINVAL(22): Invalid argument
epoll_wait03 5 TFAIL : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/epoll_wait/epoll_wait03.c:129: epoll_wait() succeed unexpectedly
I want to analyze the failing test first, post which I will create a must-pass struct for this.
We have more of such tests that we plan to cover as part of another PR. For now, I will let the skip statement be in the ltp-sgx.cfg
a9db9e4
to
1781a1a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
LibOS/shim/test/ltp/ltp-bug-1075.cfg, line 28 at r1 (raw file):
Previously, jinengandhi-intel wrote…
At least on our servers it has never worked with SGX:
SGX=1 ./pal_loader chmod03
WARNING: Using insecure argv source. Don't use this configuration in production!
Accessing file:etc/nsswitch.conf is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
Accessing file:/lib/x86_64-linux-gnu/libnss_files.so.2 is denied (Operation denied). This file is not trusted or allowed. Trusted files should be regular files (seekable).
chmod03 1 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/chmod/chmod03.c:151: getpwnam failed: errno=EACCES(13): Permission denied
chmod03 2 TBROK : /home/rasp/jinen_workpsace/graphene/LibOS/shim/test/ltp/src/testcases/kernel/syscalls/chmod/chmod03.c:151: Remaining cases brokenBut instead of adding to this file where we track the mmap failures, I have moved the skip to the ltp-sgx.cfg.
Ok, thanks, I agree with moving these tests to SGX-specific file.
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
If you're interested, that's how this PR fails on Jenkins:
+ make SGX=1 ltp-sgx.xml ./runltp_xml.py -c ltp.cfg -c ltp-sgx.cfg -c ltp-bug-1075.cfg /home/jenkins/workspace/graphene-sgx-18.04-apps/LibOS/shim/test/ltp//install/runtest/syscalls -O ltp-sgx.xml Cancelling nested steps due to timeout Sending interrupt signal to process Terminated Makefile:68: recipe for target 'ltp-sgx.xml' failed
Ok, it got better with a 30-minute timeout. The tests finish but one test fails:
./runltp_xml.py -c ltp.cfg -c ltp-sgx.cfg -c ltp-bug-1075.cfg /home/jenkins/workspace/graphene-sgx-apps/LibOS/shim/test/ltp//install/runtest/syscalls -O ltp-sgx.xml
2021-02-24 08:02:36,922 LTP.setrlimit01: -> ERROR (Timed out after 45.0 s.)
2021-02-24 08:06:20,870 LTP: LTP finished tests=1327 passed=308 failures=0 errors=1 skipped=1018 returncode=1
Is there anything special about setrlimit01
? Are you sure this test works under SGX?
Also, in another run kill11
failed under SGX, but I think this is a known intermittent issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Ok, it got better with a 30-minute timeout. The tests finish but one test fails:
./runltp_xml.py -c ltp.cfg -c ltp-sgx.cfg -c ltp-bug-1075.cfg /home/jenkins/workspace/graphene-sgx-apps/LibOS/shim/test/ltp//install/runtest/syscalls -O ltp-sgx.xml 2021-02-24 08:02:36,922 LTP.setrlimit01: -> ERROR (Timed out after 45.0 s.) 2021-02-24 08:06:20,870 LTP: LTP finished tests=1327 passed=308 failures=0 errors=1 skipped=1018 returncode=1
Is there anything special about
setrlimit01
? Are you sure this test works under SGX?Also, in another run
kill11
failed under SGX, but I think this is a known intermittent issue.
Thanks Dmitrii for doing the analysis! Not sure if it is one of those inconsistent tests, it had passed on our server for sure, will try to run the test in a loop and check the output.
1781a1a
to
b1b465d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 4 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
Thanks Dmitrii for doing the analysis! Not sure if it is one of those inconsistent tests, it had passed on our server for sure, will try to run the test in a loop and check the output.
For now have marked the tests as skipped. Can someone trigger a new build? Hopefully no more test failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
For now have marked the tests as skipped. Can someone trigger a new build? Hopefully no more test failures.
I will trigger Jenkins now.
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
I will trigger Jenkins now.
Looks like it's working but there is this error during Jenkins cleanup:
ERROR: Files modified by build, but not gitignored:
--------------------------------------------------------------------------------
M LibOS/shim/test/ltp/src
Entering 'LibOS/shim/test/ltp/src'
M lib/tst_test.c
Entering 'LibOS/shim/test/ltp/src/testcases/kernel/mce-test'
This is because you modify lib/tst_test.c
manually. You should add it to gitignore: https://github.com/oscarlab/graphene/blob/master/LibOS/shim/test/ltp/.gitignore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel)
a discussion (no related file):
Hm, even 30 minutes is not enough. Please note that 30-minute limit includes building SGX-enabled LTP tests as well as running them. Are you sure 15 minutes that you reported previously included both these steps?
Anyway, please increase the limit to say 45 minutes. That's getting sad though :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv)
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Looks like it's working but there is this error during Jenkins cleanup:
ERROR: Files modified by build, but not gitignored: -------------------------------------------------------------------------------- M LibOS/shim/test/ltp/src Entering 'LibOS/shim/test/ltp/src' M lib/tst_test.c Entering 'LibOS/shim/test/ltp/src/testcases/kernel/mce-test'
This is because you modify
lib/tst_test.c
manually. You should add it to gitignore: https://github.com/oscarlab/graphene/blob/master/LibOS/shim/test/ltp/.gitignore
Will add it.
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Hm, even 30 minutes is not enough. Please note that 30-minute limit includes building SGX-enabled LTP tests as well as running them. Are you sure 15 minutes that you reported previously included both these steps?
Anyway, please increase the limit to say 45 minutes. That's getting sad though :(
What's the difference between graphene-sgx-apps and graphene-sgx-18.04-apps jobs? One full run of the former job completed in 36 minutes (including the LTP tests passing with GSX), whereas here with the latter it timed out just running the LTP tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @jinengandhi-intel)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
What's the difference between graphene-sgx-apps and graphene-sgx-18.04-apps jobs? One full run of the former job completed in 36 minutes (including the LTP tests passing with GSX), whereas here with the latter it timed out just running the LTP tests.
Our Jenkins machines are quite different in terms of performance, from what I know (some have more CPUs, some have less; some have better SSD drives, some have worse HDD drives). So it could be that one of these jobs landed on a wimpy machine, whereas the other landed on a fast machine.
In terms of difference between graphene-sgx-apps
and graphene-sgx-18.04-apps
jobs -- there is no difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: all files reviewed, 6 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @jinengandhi-intel, and @woju)
a discussion (no related file):
@jinengandhi-intel Could you please rewrite your commit messages to the format/convention we use? See https://graphene.readthedocs.io/en/latest/devel/contributing.html and https://commit.style/.
Adding LTP test suite to run with GSGX
GSGX is the name of our deprecated driver (see https://github.com/oscarlab/graphene-sgx-driver/). Please take a look at the contributing guide and use a proper component tag instead :)
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r4 (raw file):
cd LibOS/shim/test/ltp # To run ltp tests with SGX and work around the mmap issue,
ltp
-> LTP
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r4 (raw file):
the mmap issue
I think there isn't enough context here for the reader to know what "mmap issue" you have in mind. Could you briefly describe that LTP uses shared memory, which is not supported on SGX, so we need this hack? (and that unintuitively, it actually seems that LTP works with it)
.ci/lib/stage-test-sgx.jenkinsfile, line 13 at r4 (raw file):
make ${MAKEOPTS} all sgx-tokens make SGX=1 ltp-sgx.xml
SGX
is already set in env, no need to hardcode it here (see e.g. .ci/linux-sgx-ubuntu18.04-gcc-release-apps.jenkinsfile
)
656db94
to
3edc1e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 6 files reviewed, 6 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @mkow, and @woju)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
Will add it.
Done.
a discussion (no related file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Our Jenkins machines are quite different in terms of performance, from what I know (some have more CPUs, some have less; some have better SSD drives, some have worse HDD drives). So it could be that one of these jobs landed on a wimpy machine, whereas the other landed on a fast machine.
In terms of difference between
graphene-sgx-apps
andgraphene-sgx-18.04-apps
jobs -- there is no difference.
Increased the timeout to 45mins
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
@jinengandhi-intel Could you please rewrite your commit messages to the format/convention we use? See https://graphene.readthedocs.io/en/latest/devel/contributing.html and https://commit.style/.
Adding LTP test suite to run with GSGX
GSGX is the name of our deprecated driver (see https://github.com/oscarlab/graphene-sgx-driver/). Please take a look at the contributing guide and use a proper component tag instead :)
Done.
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
ltp
->LTP
Done.
.ci/lib/stage-test-sgx.jenkinsfile, line 7 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
the mmap issue
I think there isn't enough context here for the reader to know what "mmap issue" you have in mind. Could you briefly describe that LTP uses shared memory, which is not supported on SGX, so we need this hack? (and that unintuitively, it actually seems that LTP works with it)
Done.
.ci/lib/stage-test-sgx.jenkinsfile, line 13 at r4 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
SGX
is already set in env, no need to hardcode it here (see e.g..ci/linux-sgx-ubuntu18.04-gcc-release-apps.jenkinsfile
)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r5.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel, @mkow, and @woju)
a discussion (no related file):
Previously, jinengandhi-intel wrote…
Increased the timeout to 45mins
Let's try now.
.gitignore, line 3 at r5 (raw file):
/build /install /LibOS/shim/test/ltp/src
This will probably be not needed if you modify the other gitignore to /src/lib/tst_test.c
.
.ci/lib/stage-test-sgx.jenkinsfile, line 10 at r5 (raw file):
# is basically that LTP uses shared memory that isn't # supported on SGX, as a workaround we changed the mmap flag # from MAP_SHARED to MAP_PRIVATE inside the LTP framework.
Slightly better wording:
# To run LTP tests with SGX (which doesn't support shared memory mappings),
# as a workaround we change the mmap flag from MAP_SHARED to MAP_PRIVATE
# (unintuitively, the LTP framework still works correctly with this change)
LibOS/shim/test/ltp/.gitignore, line 5 at r5 (raw file):
/ltp*.xml /pal_loader lib/tst_test.c
Shouldn't it be /src/lib/tst_test.c
?
Jenkins, test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @jinengandhi-intel, @mkow, and @woju)
a discussion (no related file):
This time graphene-debug-18.04 job failed with mkdir09 test (timedout)
./runltp_xml.py -c ltp.cfg /home/jenkins/workspace/graphene-debug-18.04/LibOS/shim/test/ltp//install/runtest/syscalls -O ltp.xml
2021-02-26 07:31:31,166 LTP.mkdir09: cannot extract stdio on python < 3.7
2021-02-26 07:31:31,166 LTP.mkdir09: -> ERROR (Timed out after 30.0 s.)
2021-02-26 07:31:41,025 LTP: LTP finished tests=1327 passed=359 failures=0 errors=1 skipped=967 returncode=1
Makefile:68: recipe for target 'ltp.xml' failed
2 other SGX jobs failed with gitignore errors. I had updated the files but will re-check what went wrong there.
3edc1e8
to
fd7fc65
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 5 files reviewed, 9 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @jinengandhi-intel, @mkow, and @woju)
LibOS/shim/test/ltp/.gitignore, line 5 at r5 (raw file):
Previously, dimakuv (Dmitrii Kuvaiskii) wrote…
Shouldn't it be
/src/lib/tst_test.c
?
Instead in the gitmodules file I added the ignore=dirty which will ignore the file changes in ltp/src.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r14.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv and @woju)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv and @woju)
LibOS/shim/test/ltp/ltp_sgx_mmap_fix.patch, line 3 at r11 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
+1, I don't think they'd accept such a patch. It should be rather done by reworking their IPC, not just patching it out, to be upstreamed.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 6 files at r10, 1 of 4 files at r11, 3 of 4 files at r13.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv and @jinengandhi-intel)
LibOS/shim/test/ltp/ltp_sgx_mmap_fix.patch, line 3 at r11 (raw file):
Previously, jinengandhi-intel wrote…
Done.
As I said, this patch is not meant for real upstreaming, it rather should be thrown at the mailing list to bait upstream response, i.e. to have them say "this is wrong, because $X", as we don't know LTP enought to know $X. For example it might cause fake TPASSes.
Anyway, I'm resolving this thread, because it looks OT.
LibOS/shim/test/ltp/Makefile, line 28 at r12 (raw file):
Previously, jinengandhi-intel wrote…
Done.
I don't think this might be delayed to another PR. Unless someone demonstrated that this patch actually doesn't break the tests also without SGX (i.e., even those that are skipped in SGX, but run without SGX).
If we leave this as is, someone is bound to execute the suite with the patch applied, but under non-sgx graphene.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv, @mkow, and @woju)
a discussion (no related file):
Previously, mkow (Michał Kowalczyk) wrote…
It should be
[CI] LTP: <commit message onliner here>
(LTP is a sub-component of LTP, or at least we treat it so)
Done from my side.
LibOS/shim/test/ltp/Makefile, line 24 at r12 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
+1 to woju, I'd remove this for simplicity (and because the patch checking is a bit hacky). If someone manually changes the files then it's expected that there will be problems.
Also, right now you just silently skip patching in case of conflicts, so if someone did an unrelated change which conflicts with the patch, then the patch will be silently skipped.
Done
LibOS/shim/test/ltp/Makefile, line 28 at r12 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
I don't think this might be delayed to another PR. Unless someone demonstrated that this patch actually doesn't break the tests also without SGX (i.e., even those that are skipped in SGX, but run without SGX).
If we leave this as is, someone is bound to execute the suite with the patch applied, but under non-sgx graphene.
I ran the LTP tests with Graphene non-SGX and see regression only for 2 tests, accept4_01 and select4, in both the cases all expected test pass but they each have a test that is skipped due to TCONF syscall not supported error. As I have already mentioned before, this changes does have some impact on the LTP parser and hence these tests are categorized as FAIL instead of PASS.
So it's not like this change breaks everything in non-SGX mode, also like I have said before we can make the community aware of such a limitation while we push the fix in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv and @jinengandhi-intel)
LibOS/shim/test/ltp/Makefile, line 28 at r12 (raw file):
Previously, jinengandhi-intel wrote…
I ran the LTP tests with Graphene non-SGX and see regression only for 2 tests, accept4_01 and select4, in both the cases all expected test pass but they each have a test that is skipped due to TCONF syscall not supported error. As I have already mentioned before, this changes does have some impact on the LTP parser and hence these tests are categorized as FAIL instead of PASS.
So it's not like this change breaks everything in non-SGX mode, also like I have said before we can make the community aware of such a limitation while we push the fix in a separate PR.
"Only" for 2 tests?!
So no, this MAP_PRIVATE patch is too hacky for me then. I haven't seen the LTP with this patch been properly verified and validated to actually accomplish its goals. This patch severely lowers confidence in the test suite without any attempt to make convincing argument that LTP still works as advertised (that is, it PASSes when it should, and also FAILs when it should) beyond "I ran this once and it only 2 tests failed".
So I insist that some measure is needed which would satisfy that this patch won't be applied to LTP that is run for non-SGX graphene. I.e., if user runs make SGX=1 regression && make regression
, the second make
should run unmodified LTP. I suggest that different builddir is created for LTP with this patch. If LTP build system doesn't support such a case, you can always copy the directory before applying the patch and compile in different directory depending on $(SGX)
variable.
Please do it in this PR, because I don't believe it would be ever done if postponed to a "separate PR".
aaafb91
to
7d62b94
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 9 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @mkow, and @woju)
LibOS/shim/test/ltp/Makefile, line 28 at r12 (raw file):
Previously, woju (Wojtek Porczyk) wrote…
"Only" for 2 tests?!
So no, this MAP_PRIVATE patch is too hacky for me then. I haven't seen the LTP with this patch been properly verified and validated to actually accomplish its goals. This patch severely lowers confidence in the test suite without any attempt to make convincing argument that LTP still works as advertised (that is, it PASSes when it should, and also FAILs when it should) beyond "I ran this once and it only 2 tests failed".
So I insist that some measure is needed which would satisfy that this patch won't be applied to LTP that is run for non-SGX graphene. I.e., if user runs
make SGX=1 regression && make regression
, the secondmake
should run unmodified LTP. I suggest that different builddir is created for LTP with this patch. If LTP build system doesn't support such a case, you can always copy the directory before applying the patch and compile in different directory depending on$(SGX)
variable.Please do it in this PR, because I don't believe it would be ever done if postponed to a "separate PR".
Based on whether SGX is set or not, I set the BUILDDIR and INSTALLDIR in the makevars.mk, which then flows in to the Makefile. In the Makefile the PATCH_APPLIED will be called or both SGX and non-SGX, simply because for Graphene non-SGX you would need to revert the patch. The solution works for both Graphene non-SGX and SGX where the patch is not applied for SGX and for non-SGX even if there are artifacts from the SGX build, it will be reversed.
LibOS/shim/test/ltp/Makefile, line 86 at r15 (raw file):
# LTPSCENARIO := $(INSTALLDIR)/$(LTPSCENARIO) ./contrib/conf_lint.py $(CFG) --scenario $(LTPSCENARIO) ./runltp_xml.py $(RUNLTPOPTS) $(foreach cfg,$(CFG),-c $(cfg)) $(LTPSCENARIO) -o ltproot=$(LTPROOT) -O $@
Additionally had to add this as runltp_xml.py by default uses the ltproot as install/testcases/bin and here for SGX I will pass the install-sgx/testcases/bin
7d62b94
to
e4afa19
Compare
Jenkins, retest Jenkins-SGX-apps please |
Jenkins, retest Jenkins-18.04 please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r15, 1 of 1 files at r16.
Reviewable status: all files reviewed, 5 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @jinengandhi-intel, and @woju)
LibOS/shim/test/ltp/Makefile, line 30 at r16 (raw file):
# (unintuitively, the LTP framework still works correctly with this change) # No such change is intended for LTP tests with non-SGX. $(eval export patch-status = $(shell patch -p1 -l -N -s --dry-run < ltp_sgx_mmap_fix.patch 1>&2 2> /dev/null; echo $$?))
Accepting this for now as discussed in private - @jinengandhi-intel will submit another PR later which will split source into 2 directories, one with patches and one without, so that we don't have the problem with patching/removing patch.
LibOS/shim/test/ltp/Makefile, line 84 at r16 (raw file):
%.xml: $(CFG) $(target) $(INSTALLDIR)/INSTALL_SUCCESS # LTPSCENARIO := $(INSTALLDIR)/$(LTPSCENARIO)
Why is this commented out?
e4afa19
to
19acfb6
Compare
Signed-off-by: Jinen Gandhi <[email protected]>
Jenkins, retest Jenkins-18.04 please |
Jenkins, retest Jenkins-Debug please |
Jenkins, retest Jenkins-SGX-18.04-apps please |
Jenkins, retest Jenkins-SGX-18.04 please |
Jenkins, retest Jenkins-SGX please |
Jenkins, retest Jenkins-18.04 please |
Jenkins, retest Jenkins-Debug-18.04 please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 8 of 9 files reviewed, 5 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv, @mkow, and @woju)
LibOS/shim/test/ltp/Makefile, line 84 at r16 (raw file):
Previously, mkow (Michał Kowalczyk) wrote…
Why is this commented out?
Done. Have removed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r17.
Reviewable status: all files reviewed, 4 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: ITL) (waiting on @dimakuv, @jinengandhi-intel, and @woju)
LibOS/shim/test/ltp/Makefile, line 86 at r15 (raw file):
Previously, jinengandhi-intel wrote…
Additionally had to add this as runltp_xml.py by default uses the ltproot as install/testcases/bin and here for SGX I will pass the install-sgx/testcases/bin
@jinengandhi-intel I guess this comment wasn't supposed to be blocking? (see the disposition status, you're blocking)
Jenkins, retest Jenkins please |
Jenkins, retest Jenkins-SGX-apps please |
2 similar comments
Jenkins, retest Jenkins-SGX-apps please |
Jenkins, retest Jenkins-SGX-apps please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 4 files at r13, 1 of 2 files at r15, 1 of 1 files at r16.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mkow and @woju)
Jenkins, retest Jenkins please |
Jenkins, retest Jenkins-SGX-apps please |
Description of the changes
To run LTP tests with SGX (which doesn't support shared memory mappings), as a workaround changed the mmap flag from MAP_SHARED to MAP_PRIVATE (unintuitively) the LTP test execution works correctly with this change.
How to test this PR?
This change is