-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Research alternative to Bresenham's raytracing algorithm #167
Comments
Note benchmarks are included in #160. Integer arithmetic-based Bresenham algorithm benchmarkRun on (16 X 3200 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x8)
L1 Instruction 32 KiB (x8)
L2 Unified 512 KiB (x8)
L3 Unified 16384 KiB (x1)
Load Average: 0.52, 1.47, 2.08
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------
BM_Bresenham2i/Standard/128 92.1 ns 92.1 ns 7475814
BM_Bresenham2i/Standard/256 181 ns 181 ns 3802972
BM_Bresenham2i/Standard/512 360 ns 360 ns 1933997
BM_Bresenham2i/Standard/1024 704 ns 704 ns 974018
BM_Bresenham2i/Standard/2048 1411 ns 1410 ns 495547
BM_Bresenham2i/Standard/4096 2800 ns 2799 ns 250246
BM_Bresenham2i/Standard_BigO 0.69 N 0.68 N
BM_Bresenham2i/Standard_RMS 1 % 1 %
BM_Bresenham2i/Modified/128 390 ns 390 ns 1794217
BM_Bresenham2i/Modified/256 764 ns 764 ns 900910
BM_Bresenham2i/Modified/512 1527 ns 1526 ns 457221
BM_Bresenham2i/Modified/1024 3054 ns 3054 ns 228170
BM_Bresenham2i/Modified/2048 6136 ns 6135 ns 113027
BM_Bresenham2i/Modified/4096 12276 ns 12275 ns 56818
BM_Bresenham2i/Modified_BigO 3.00 N 3.00 N
BM_Bresenham2i/Modified_RMS 0 % 0 % Ray casting benchmarkRun on (16 X 3200 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x8)
L1 Instruction 32 KiB (x8)
L2 Unified 512 KiB (x8)
L3 Unified 16384 KiB (x1)
Load Average: 0.89, 1.00, 1.61
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
---------------------------------------------------------------
Benchmark Time CPU Iterations
---------------------------------------------------------------
BM_RayCasting2d/128 175 ns 175 ns 3972583
BM_RayCasting2d/256 298 ns 298 ns 2362109
BM_RayCasting2d/512 589 ns 589 ns 1169115
BM_RayCasting2d/1024 1168 ns 1168 ns 589929
BM_RayCasting2d_BigO 1.15 N 1.15 N
BM_RayCasting2d_RMS 3 % 3 % These numbers don't look terrible, but when ray tracing 100 times for 2000 different states in a ~400 x ~400 grid, it adds up quickly to ~50 ms. I did some profiling and while we could shave off some of it by eliminating Eigen types, it'll probably take a different algorithm (approximate, multi-resolution, whatever) to get it in control for denser grids. |
It's great that we have the benchmarks already. I think when deciding if it's worth to pursue further optimization we need to evaluate the execution time in the context of the total execution time per iteration. A 5x improvement in Bresenham will be barely be noticeable if bresenham only accounts for 15% of the total iteration time, but it may be crucial if it represents 80% of the iteration time. |
FWIW I don't think we can improve the computation itself much further. We need to reduce the number of computations. |
Related to #167. This patch extends existing microbenchmarks to use as reference to compare the performance of the raycast and occupancy grid implementation. Signed-off-by: Nahuel Espinosa <[email protected]>
Description
Definition of done
Either the current raytracing approach is not worth optimizing at this stage, or a decision is done that the alternative is worth implementing (in a separate issue).
Additional considerations
This ticket is not about implementing changes, it's about considering alternatives.
The text was updated successfully, but these errors were encountered: