Performance benchmarking for NVIDIA-accelerated Isaac ROS packages.
Isaac ROS Benchmark builds on ros2_benchmark to provide configurations for benchmarking Isaac ROS graphs. Performance results covering throughput, latency, and utilization help robotics developers make informed decisions when designing real-time robotics applications. The Isaac ROS performance results can be independently verified, because the method, configuration, and input data used for benchmarking are provided.
A ros2_benchmark playback node plug-in is provided for NITROS. It supports type adaptation and negotiation, which reduce message transport costs through RCL in GPU-accelerated graphs of nodes.
The datasets for benchmarking are explicitly not downloaded by default. To pull down the standardized benchmark datasets, refer to the ros2_benchmark Dataset section.
Please visit the Isaac ROS Documentation to learn how to use this repository.
Update 2024-12-10: Added new benchmarks

Node | Input Size | AGX Orin | Orin NX | Orin Nano Super 8GB | x86_64 w/ RTX 4090 |
---|---|---|---|---|---|
AprilTag Node | 720p | 178 fps, 6.3 ms @ 30Hz | 116 fps, 9.4 ms @ 30Hz | 123 fps, 8.6 ms @ 30Hz | 596 fps, 0.86 ms @ 30Hz |
Freespace Segmentation Node | 576p | 3340 fps, 1.7 ms @ 30Hz | 2530 fps, 1.5 ms @ 30Hz | 2140 fps, 1.9 ms @ 30Hz | 3500 fps, 0.44 ms @ 30Hz |
Depth Segmentation Node | 576p | 41.4 fps, 80 ms @ 30Hz | 28.1 fps, 98 ms @ 30Hz | – | 105 fps, 25 ms @ 30Hz |
FoundationPose Pose Estimation Node | 720p | 1.54 fps, 780 ms @ 30Hz | – | – | 9.56 fps, 110 ms @ 30Hz |
DNN Stereo Disparity Node (Full) | 576p | 72.5 fps, 17 ms @ 30Hz | 42.1 fps, 26 ms @ 30Hz | – | 350 fps, 2.1 ms @ 30Hz |
DNN Stereo Disparity Node (Light) | 288p | 304 fps, 5.9 ms @ 30Hz | 143 fps, 9.6 ms @ 30Hz | – | 350 fps, 1.6 ms @ 30Hz |
Stereo Disparity Node | 1080p | 118 fps, 11 ms @ 30Hz | 78.1 fps, 14 ms @ 30Hz | 83.8 fps, 13 ms @ 30Hz | 943 fps, 1.6 ms @ 30Hz |
Rectify Node | 1080p | 800 fps, 2.8 ms @ 30Hz | 572 fps, 3.3 ms @ 30Hz | 595 fps, 3.8 ms @ 30Hz | 2500 fps, 0.57 ms @ 30Hz |
TensorRT Node (DOPE) | VGA | 30.8 fps, 37 ms @ 30Hz | 15.5 fps, 55 ms @ 30Hz | 20.8 fps, 51 ms @ 30Hz | 298 fps, 5.3 ms @ 30Hz |
Triton Node (DOPE) | VGA | 31.2 fps, 340 ms @ 30Hz | 15.5 fps, 55 ms @ 30Hz | 22.2 fps, 490 ms @ 30Hz | 277 fps, 4.7 ms @ 30Hz |
TensorRT Node (PeopleSemSegNet) | 544p | 489 fps, 4.6 ms @ 30Hz | 258 fps, 7.1 ms @ 30Hz | 269 fps, 6.2 ms @ 30Hz | 619 fps, 2.2 ms @ 30Hz |
Triton Node (PeopleSemSegNet) | 544p | 216 fps, 5.5 ms @ 30Hz | 143 fps, 8.2 ms @ 30Hz | – | 585 fps, 2.5 ms @ 30Hz |
DNN Image Encoder Node | VGA | 339 fps, 13 ms @ 30Hz | 375 fps, 12 ms @ 30Hz | – | 480 fps, 6.0 ms @ 30Hz |
Occupancy Grid Localizer Node | ~50 sq. m | 19.6 fps, 57 ms @ 30Hz | 8.36 fps, 130 ms @ 30Hz | 9.02 fps, 120 ms @ 30Hz | 50.1 fps, 8.5 ms @ 30Hz |
H.264 Decoder Node | 1080p | 188 fps, 7.3 ms @ 30Hz | – | – | 596 fps, 2.6 ms @ 30Hz |
H.264 Encoder Node (I-frame Support) | 1080p | 402 fps, 12 ms @ 30Hz | – | – | 412 fps, 3.2 ms @ 30Hz |
H.264 Encoder Node (P-frame Support) | 1080p | 465 fps, 11 ms @ 30Hz | – | – | 596 fps, 2.0 ms @ 30Hz |
Nvblox Node | – | 4.94 fps, 34.0 ms | 4.94 fps, 155 ms | 4.93 fps, 87.4 ms | 4.94 fps, 200 ms |

Graph | Input Size | AGX Orin | Orin NX | Orin Nano Super 8GB | x86_64 w/ RTX 4090 |
---|---|---|---|---|---|
AprilTag Graph | 720p | 178 fps, 9.1 ms @ 30Hz | 111 fps, 11 ms @ 30Hz | 120 fps, 11 ms @ 30Hz | 596 fps, 1.4 ms @ 30Hz |
Freespace Segmentation Graph | 576p | 40.3 fps, 79 ms @ 30Hz | 27.6 fps, 98 ms @ 30Hz | 31.8 fps, 55 ms @ 30Hz | 102 fps, 30 ms @ 30Hz |
Centerpose Pose Estimation Graph | VGA | 44.8 fps, 43 ms @ 30Hz | 29.0 fps, 51 ms @ 30Hz | 29.8 fps, 50 ms @ 30Hz | 50.2 fps, 14 ms @ 30Hz |
DOPE Pose Estimation Graph | VGA | 27.3 fps, 54 ms @ 30Hz | 15.2 fps, 73 ms @ 30Hz | – | 186 fps, 12 ms @ 30Hz |
DNN Stereo Disparity Graph (Full) | 576p | 89.8 fps, 19 ms @ 30Hz | 35.2 fps, 34 ms @ 30Hz | – | 350 fps, 5.8 ms @ 30Hz |
DNN Stereo Disparity Graph (Light) | 288p | 184 fps, 14 ms @ 30Hz | 128 fps, 14 ms @ 30Hz | – | 350 fps, 5.2 ms @ 30Hz |
Stereo Disparity Graph | 1080p | 111 fps, 15 ms @ 30Hz | 72.2 fps, 18 ms @ 30Hz | 77.4 fps, 18 ms @ 30Hz | 692 fps, 4.6 ms @ 30Hz |
DetectNet Object Detection Graph | 544p | 55.4 fps, 37 ms @ 30Hz | 25.7 fps, 45 ms @ 30Hz | 33.0 fps, 43 ms @ 30Hz | 262 fps, 11 ms @ 30Hz |
RT-DETR Object Detection Graph (SyntheticaDETR) | 720p | 56.5 fps, 29 ms @ 30Hz | 33.3 fps, 40 ms @ 30Hz | 37.3 fps, 37 ms @ 30Hz | 450 fps, 5.5 ms @ 30Hz |
TensorRT Graph (PeopleSemSegNet) | 544p | 436 fps, 10 ms @ 30Hz | 212 fps, 13 ms @ 30Hz | 224 fps, 13 ms @ 30Hz | 587 fps, 3.7 ms @ 30Hz |
SAM Image Segmentation Graph (Full SAM) | 720p | 2.22 fps, 390 ms @ 30Hz | – | – | 14.6 fps, 74 ms @ 30Hz |
SAM Image Segmentation Graph (Mobile SAM) | 720p | 8.40 fps, 120 ms @ 30Hz | 2.22 fps, 240 ms @ 30Hz | 2.22 fps, 230 ms @ 30Hz | 62.5 fps, 22 ms @ 30Hz |

Live Graph | Input Size | Nova Carter |
---|---|---|
Data Recorder Live Graph (4 Hawk Cameras) | 1200p | 22.4 fps (per stream avg), 0 dropped frames (avg) |
Multicam Visual SLAM Live Graph (4 Hawk Cameras) | 1200p | 30.1 fps |
DNN Stereo Disparity Live Graph (3 Hawk Cameras, 1x Full ESS and 2x Throttled Light ESS) | 1200p | Full: 30.2 fps, Light: 15.2 fps (avg) |
Perceptor Graph (3 Hawk Cameras) | 1200p | Nvblox ESDF: 9.45 fps, Nvblox Mesh: 2.63 fps, Visual Odometry: 30.0 fps |
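
Most cells in the node and graph tables pair a throughput figure with a latency figure, for example `178 fps 6.3 ms @ 30Hz`. The following is a minimal sketch, not part of Isaac ROS Benchmark or ros2_benchmark, of how such a cell could be parsed and compared against a target frame rate when sizing a pipeline. The names `BenchmarkCell`, `parse_cell`, `sustains_rate`, and `latency_in_frames` are hypothetical, and the sketch assumes that the `@ 30Hz` suffix denotes the input rate at which the latency was measured.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkCell:
    """One table cell: peak throughput plus latency at a fixed input rate."""
    throughput_fps: float
    latency_ms: float
    rate_hz: Optional[float]  # None when the cell has no "@ <rate>Hz" suffix


# Accepts "178 fps 6.3 ms @ 30Hz", "178 fps, 6.3 ms @ 30Hz", and "4.94 fps 34.0 ms".
_CELL = re.compile(
    r"([\d.]+)\s*fps,?\s*([\d.]+)\s*ms(?:\s*@\s*([\d.]+)\s*Hz)?"
)


def parse_cell(text: str) -> Optional[BenchmarkCell]:
    """Parse a cell string; return None for unsupported entries such as '–'."""
    match = _CELL.search(text)
    if match is None:
        return None
    fps, ms, hz = match.groups()
    return BenchmarkCell(float(fps), float(ms), float(hz) if hz else None)


def sustains_rate(cell: BenchmarkCell, target_hz: float) -> bool:
    """True if the measured throughput is at least the target frame rate."""
    return cell.throughput_fps >= target_hz


def latency_in_frames(cell: BenchmarkCell, target_hz: float) -> float:
    """Express latency as a multiple of the frame period at target_hz."""
    frame_period_ms = 1000.0 / target_hz
    return cell.latency_ms / frame_period_ms


if __name__ == "__main__":
    cell = parse_cell("178 fps 6.3 ms @ 30Hz")
    if cell is not None:
        print(sustains_rate(cell, 30.0), round(latency_in_frames(cell, 30.0), 2))
```

A latency above one frame period does not by itself mean a node cannot keep up, since frames may be processed in a pipelined fashion; the throughput figure is the one to compare against the required frame rate.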