Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Research alternative to Bresenham's raytracing algorithm #167

Open
2 tasks
glpuga opened this issue May 1, 2023 · 4 comments
Open
2 tasks

Research alternative to Bresenham's raytracing algorithm #167

glpuga opened this issue May 1, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@glpuga
Copy link
Collaborator

glpuga commented May 1, 2023

Description

  • Evaluate the performance of Bresenham's algorithm for the Bean Sensor Model to determine if it's worth optimizing
  • Research this paper as a potential alternative

Definition of done

Either the current raytracing approach is not worth optimizing at this stage, or a decision is done that the alternative is worth implementing (in a separate issue).

Additional considerations

This ticket is not about implementing changes, it's about considering alternatives.

@glpuga glpuga added the enhancement New feature or request label May 1, 2023
@glpuga glpuga mentioned this issue May 1, 2023
6 tasks
@hidmic
Copy link
Collaborator

hidmic commented May 3, 2023

Evaluate the performance of Bresenham's algorithm for the Bean Sensor Model to determine if it's worth optimizing

Note benchmarks are included in #160.

Integer arithmetic-based Bresenham algorithm benchmark
Run on (16 X 3200 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.52, 1.47, 2.08
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_Bresenham2i/Standard/128        92.1 ns         92.1 ns      7475814
BM_Bresenham2i/Standard/256         181 ns          181 ns      3802972
BM_Bresenham2i/Standard/512         360 ns          360 ns      1933997
BM_Bresenham2i/Standard/1024        704 ns          704 ns       974018
BM_Bresenham2i/Standard/2048       1411 ns         1410 ns       495547
BM_Bresenham2i/Standard/4096       2800 ns         2799 ns       250246
BM_Bresenham2i/Standard_BigO       0.69 N          0.68 N    
BM_Bresenham2i/Standard_RMS           1 %             1 %    
BM_Bresenham2i/Modified/128         390 ns          390 ns      1794217
BM_Bresenham2i/Modified/256         764 ns          764 ns       900910
BM_Bresenham2i/Modified/512        1527 ns         1526 ns       457221
BM_Bresenham2i/Modified/1024       3054 ns         3054 ns       228170
BM_Bresenham2i/Modified/2048       6136 ns         6135 ns       113027
BM_Bresenham2i/Modified/4096      12276 ns        12275 ns        56818
BM_Bresenham2i/Modified_BigO       3.00 N          3.00 N    
BM_Bresenham2i/Modified_RMS           0 %             0 %
Ray casting benchmark
Run on (16 X 3200 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 16384 KiB (x1)
Load Average: 0.89, 1.00, 1.61
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
---------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
---------------------------------------------------------------
BM_RayCasting2d/128         175 ns          175 ns      3972583
BM_RayCasting2d/256         298 ns          298 ns      2362109
BM_RayCasting2d/512         589 ns          589 ns      1169115
BM_RayCasting2d/1024       1168 ns         1168 ns       589929
BM_RayCasting2d_BigO       1.15 N          1.15 N    
BM_RayCasting2d_RMS           3 %             3 %

These numbers don't look terrible, but when ray tracing 100 times for 2000 different states in a ~400 x ~400 grid, it adds up quickly to ~50 ms. I did some profiling and while we could shave off some of it by eliminating Eigen types, it'll probably take a different algorithm (approximate, multi-resolution, whatever) to get it in control for denser grids.

@glpuga
Copy link
Collaborator Author

glpuga commented May 4, 2023

It's great that we have the benchmarks already.

I think when deciding if it's worth to pursue further optimization we need to evaluate the execution time in the context of the total execution time per iteration.

A 5x improvement in Bresenham will be barely be noticeable if bresenham only accounts for 15% of the total iteration time, but it may be crucial if it represents 80% of the iteration time.

@hidmic
Copy link
Collaborator

hidmic commented May 4, 2023

FWIW I don't think we can improve the computation itself much further. We need to reduce the number of computations.

@nahueespinosa
Copy link
Member

nahueespinosa commented May 25, 2023

@glpuga FYI, here are the flamegraphs from likelihood field and beam models for comparison.
Downloading the files and re-opening them in Chrome should enable the zoom feature.

Main version: e72e148

likelihood_flamegraph
beam_flamegraph

nahueespinosa added a commit that referenced this issue May 31, 2023
Related to #167.

This patch extends existing microbenchmarks to use as reference to
compare the performance of the raycast and occupancy grid
implementation.

Signed-off-by: Nahuel Espinosa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants