+50% slow down #33

keryell · 2021-02-05T10:30:28Z

I have a slowdown on the recent version of the path_tracer on my CPU laptop.
If you notice the same, can you bisect the change?
Perhaps some objects we copy instead of using by ref (in lambda?).
Check also with the various sanitizers that there are no weird bugs.

The text was updated successfully, but these errors were encountered:

lforg37 · 2021-02-05T10:33:53Z

I had also an important slowdown related to copy inside lambda.
On my side #32 fixes the problem.
Can you check if it works for you ?

keryell · 2021-02-05T11:18:27Z

No. but I think I have found why...

lforg37 · 2021-02-05T15:35:08Z

I have a slowdown on the recent version of the path_tracer on my CPU laptop.
If you notice the same, can you bisect the change?

On my experimentation machine (8 virtual cores) I go from 8'50" total execution time (~2' wallclock time) to 14'30" between commits df12334 and fca7dcf .

Interestingly, I tried to remove as many pointer as possible in lforg37:avoid_pointers and the total execution time increases even more (16'31").

lforg37 · 2021-02-05T19:06:50Z

It seems that having one render_function using internal lambdas leads to better performance.
With the same settings on 3c3c152 I get 10'15" total execution time.
I have no clue yet where the extra 1'30" can be saved..

lforg37 · 2021-02-05T20:03:23Z

It seems there is an overhead when using sycl::access instead of raw pointers : with 7640afd I'm reaching 9'10" total execution time.

I'm not convince the resulting code is way cleaner than using object functions...

keryell · 2021-02-09T06:09:19Z

So do you mean that just using a triSYCL accessor in a kernel is the slow-down?
For example using a[i] is slower than a.get_pointer()[i]?
Perhaps a bug in accessor implementation...
For now I am using Boost.Multi_Array. At some point we need to switch to md_span and md_array.

keryell · 2021-02-09T06:11:10Z

On my laptop it runs in 1 min normally, 1min30 with the performance bug. Curious these 10 min.

lforg37 · 2021-02-09T14:05:05Z

On my laptop it runs in 1 min normally, 1min30 with the performance bug. Curious these 10 min.

Maybe a quiproquo on the reported time : I'm reporting the total execution time. In wallclock time I'm also having something which is around two minutes on a 8 core machine (Xeon CPU E3-1230 v6 @ 3.50GHz).

So do you mean that just using a triSYCL accessor in a kernel is the slow-down?
For example using a[i] is slower than a.get_pointer()[i]?

It's only part of the slow down. The important part of the slow down was having lambdas captured inside lambdas : hit_world() and get_color() were defined outside of render_pixel(). Moving there definition inside render_pixel() body was already an improvement (3c3c152).

However, replacing trisycl accessor with raw pointer leads also to an improvement (7640afd).

I don't see a conceptual difference between the render_pixel() lambda of 7640afd and the initial function object, but there remain a slight slow down (around 15" total execution time on my setting, so probably something like 2" wall clock time, persistent across multiple run) that I cannot explain.

keryell · 2021-02-09T15:08:34Z

Have you tried with triSYCL Clang++ and G++?
It would be interesting to compare with DPC++ and hipSYCL in CPU mode.

keryell assigned lforg37 Feb 5, 2021

keryell added the bug Something isn't working label Feb 5, 2021

keryell mentioned this issue Feb 5, 2021

Remove any pointer usage in the code #34

Open

lforg37 mentioned this issue Feb 6, 2021

Alternative work distribution #36

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

+50% slow down #33

+50% slow down #33

keryell commented Feb 5, 2021

lforg37 commented Feb 5, 2021 •

edited

Loading

keryell commented Feb 5, 2021

lforg37 commented Feb 5, 2021 •

edited

Loading

lforg37 commented Feb 5, 2021

lforg37 commented Feb 5, 2021

keryell commented Feb 9, 2021

keryell commented Feb 9, 2021

lforg37 commented Feb 9, 2021 •

edited

Loading

keryell commented Feb 9, 2021

+50% slow down #33

+50% slow down #33

Comments

keryell commented Feb 5, 2021

lforg37 commented Feb 5, 2021 • edited Loading

keryell commented Feb 5, 2021

lforg37 commented Feb 5, 2021 • edited Loading

lforg37 commented Feb 5, 2021

lforg37 commented Feb 5, 2021

keryell commented Feb 9, 2021

keryell commented Feb 9, 2021

lforg37 commented Feb 9, 2021 • edited Loading

keryell commented Feb 9, 2021

lforg37 commented Feb 5, 2021 •

edited

Loading

lforg37 commented Feb 5, 2021 •

edited

Loading

lforg37 commented Feb 9, 2021 •

edited

Loading