Research alternative to Bresenham's raytracing algorithm #167

glpuga · 2023-05-01T15:39:09Z

Description

Evaluate the performance of Bresenham's algorithm for the Beam Sensor Model to determine whether it's worth optimizing
Research this paper as a potential alternative

Definition of done

Either the current raytracing approach is deemed not worth optimizing at this stage, or a decision is made that the alternative is worth implementing (in a separate issue).

Additional considerations

This ticket is not about implementing changes, it's about considering alternatives.

hidmic · 2023-05-03T16:20:07Z

Evaluate the performance of Bresenham's algorithm for the Beam Sensor Model to determine whether it's worth optimizing

Note that benchmarks are included in #160.

Integer arithmetic-based Bresenham algorithm benchmark

Run on (16 X 3200 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.52, 1.47, 2.08
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_Bresenham2i/Standard/128        92.1 ns         92.1 ns      7475814
BM_Bresenham2i/Standard/256         181 ns          181 ns      3802972
BM_Bresenham2i/Standard/512         360 ns          360 ns      1933997
BM_Bresenham2i/Standard/1024        704 ns          704 ns       974018
BM_Bresenham2i/Standard/2048       1411 ns         1410 ns       495547
BM_Bresenham2i/Standard/4096       2800 ns         2799 ns       250246
BM_Bresenham2i/Standard_BigO       0.69 N          0.68 N    
BM_Bresenham2i/Standard_RMS           1 %             1 %    
BM_Bresenham2i/Modified/128         390 ns          390 ns      1794217
BM_Bresenham2i/Modified/256         764 ns          764 ns       900910
BM_Bresenham2i/Modified/512        1527 ns         1526 ns       457221
BM_Bresenham2i/Modified/1024       3054 ns         3054 ns       228170
BM_Bresenham2i/Modified/2048       6136 ns         6135 ns       113027
BM_Bresenham2i/Modified/4096      12276 ns        12275 ns        56818
BM_Bresenham2i/Modified_BigO       3.00 N          3.00 N    
BM_Bresenham2i/Modified_RMS           0 %             0 %
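
For readers unfamiliar with the algorithm being measured, here is a minimal sketch of the textbook integer-arithmetic form of Bresenham's line algorithm. It is not the code benchmarked in #160 (which is not shown in this thread), and the function name `bresenham` is only illustrative. It does show why the cost grows linearly with line length, as the BigO rows above report: one cell is visited per step along the major axis.

```python
def bresenham(x0, y0, x1, y1):
    """Yield the integer grid cells on the line from (x0, y0) to (x1, y1).

    Textbook all-octant formulation using only integer additions and
    comparisons; an illustration, not the implementation from #160.
    """
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy  # running error term, kept as an integer
    x, y = x0, y0
    while True:
        yield (x, y)
        if x == x1 and y == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step along x
            err += dy
            x += sx
        if e2 <= dx:  # step along y
            err += dx
            y += sy


cells = list(bresenham(0, 0, 4, 1))
```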

Ray casting benchmark

Run on (16 X 3200 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.89, 1.00, 1.61
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
BM_RayCasting2d/128         175 ns          175 ns      3972583
BM_RayCasting2d/256         298 ns          298 ns      2362109
BM_RayCasting2d/512         589 ns          589 ns      1169115
BM_RayCasting2d/1024       1168 ns         1168 ns       589929
BM_RayCasting2d_BigO       1.15 N          1.15 N    
BM_RayCasting2d_RMS           3 %             3 %

These numbers don't look terrible, but when ray tracing 100 times for each of 2000 different states in a ~400 x ~400 grid, it quickly adds up to ~50 ms. I did some profiling, and while we could shave off some of that by eliminating Eigen types, it will probably take a different algorithm (approximate, multi-resolution, whatever) to get it under control for denser grids.
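
As a rough sanity check of the ~50 ms figure, the per-ray timings from the ray casting benchmark can be multiplied out. Which benchmark size best matches a typical ray in a ~400 x ~400 grid is an assumption here (the 128 and 256 rows are used as brackets), and `update_cost_ms` is just an illustrative helper.

```python
# Per-ray timings in nanoseconds, copied from the BM_RayCasting2d rows above.
RAY_CAST_NS = {128: 175, 256: 298, 512: 589, 1024: 1168}


def update_cost_ms(rays_per_state, states, ns_per_ray):
    """Total ray casting time in milliseconds for one sensor update."""
    return rays_per_state * states * ns_per_ray * 1e-6


# Assumption: typical rays fall somewhere between the 128 and 256 cases.
low = update_cost_ms(100, 2000, RAY_CAST_NS[128])
high = update_cost_ms(100, 2000, RAY_CAST_NS[256])
print(f"{low:.1f} ms to {high:.1f} ms per update")
```

With 200,000 ray casts per update, this brackets the cost between about 35 ms and 60 ms, which is consistent with the ~50 ms estimate.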

glpuga · 2023-05-04T13:11:21Z

It's great that we have the benchmarks already.

I think that when deciding whether it's worth pursuing further optimization, we need to evaluate the execution time in the context of the total execution time per iteration.

A 5x improvement in Bresenham will barely be noticeable if Bresenham only accounts for 15% of the total iteration time, but it may be crucial if it represents 80% of the iteration time.
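
The point can be made concrete with Amdahl's law. The sketch below plugs in the 15% and 80% shares from the comment; the function name `overall_speedup` is illustrative.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-iteration speedup when only `fraction` of the
    iteration time is accelerated by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


for share in (0.15, 0.80):
    print(f"{share:.0%} of iteration time: {overall_speedup(share, 5.0):.2f}x overall")
```

A 5x faster ray tracer yields roughly a 1.14x overall gain at a 15% share, but about 2.78x at an 80% share.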

hidmic · 2023-05-04T13:44:10Z

FWIW I don't think we can improve the computation itself much further. We need to reduce the number of computations.
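
One possible way to reduce the number of computations, offered purely as an illustration and not as something proposed in this thread, is to cache ray casting results per grid cell and discretized bearing so that states falling in the same cell reuse earlier results. Everything in the sketch below (`cast_ray`, `CachedRayCaster`, the 360 angle bins) is hypothetical, and the angle binning makes the result approximate.

```python
import math


def cast_ray(grid, x, y, theta, max_range):
    """Naive ray march (hypothetical helper): advance one cell length per
    step and return the step count at the first occupied cell."""
    dx, dy = math.cos(theta), math.sin(theta)
    for step in range(max_range + 1):
        cx = int(round(x + step * dx))
        cy = int(round(y + step * dy))
        if not (0 <= cy < len(grid) and 0 <= cx < len(grid[0])):
            return max_range  # left the map without hitting anything
        if grid[cy][cx]:
            return step
    return max_range


class CachedRayCaster:
    """Memoizes ranges by (cell x, cell y, angle bin); approximate by design."""

    def __init__(self, grid, max_range, angle_bins=360):
        self.grid = grid
        self.max_range = max_range
        self.angle_bins = angle_bins
        self.cache = {}
        self.misses = 0  # number of rays actually cast

    def range_at(self, x, y, theta):
        bin_index = int(round(theta / (2 * math.pi) * self.angle_bins)) % self.angle_bins
        key = (x, y, bin_index)
        if key not in self.cache:
            self.misses += 1
            bin_theta = bin_index * 2 * math.pi / self.angle_bins
            self.cache[key] = cast_ray(self.grid, x, y, bin_theta, self.max_range)
        return self.cache[key]


# Toy 10 x 10 map with a wall along column 5.
grid = [[1 if col == 5 else 0 for col in range(10)] for _ in range(10)]
caster = CachedRayCaster(grid, max_range=20)
first = caster.range_at(0, 0, 0.0)
second = caster.range_at(0, 0, 0.001)  # same angle bin, served from the cache
```

Whether such a cache pays off depends on how many states share cells and on the memory budget, neither of which is measured here.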

nahueespinosa · 2023-05-25T14:29:42Z

@glpuga FYI, here are the flamegraphs from likelihood field and beam models for comparison.
Downloading the files and re-opening them in Chrome should enable the zoom feature.

Main version: e72e148

Related to #167. This patch extends existing microbenchmarks to use as reference to compare the performance of the raycast and occupancy grid implementation. Signed-off-by: Nahuel Espinosa <[email protected]>

glpuga added the enhancement New feature or request label May 1, 2023

glpuga mentioned this issue May 1, 2023: Add beam sensor model #160 (merged, 6 tasks)

nahueespinosa mentioned this issue May 29, 2023: Extend raycast microbenchmarks #201 (merged, 7 tasks)
