Merge pull request #240 from dusty-nv/20241218-super
Super Nano updates
dusty-nv authored Dec 18, 2024
2 parents 7e866c3 + 0311bad commit a302e6f
Showing 3 changed files with 97 additions and 28 deletions.
104 changes: 82 additions & 22 deletions docs/benchmarks.md
@@ -5,44 +5,104 @@ hide:

# Benchmarks

## Large Language Models (LLM)
!!! admonition "WIP - Updating Results"

![](./svgs/LLM%20Text%20Generation%20Rate.svg)
Below is recent data from the Jetson Orin Nano Super benchmarks - see [this blog post](https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost/?ncid=so-othe-293081-vt48) for more info.

For running LLM benchmarks, see the [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.
We are currently collating these results across AGX Orin and Orin NX - for now, the previous results are archived [below](#large-language-models-llm).

## Small Language Models (SLM)
#### Jetson Orin Nano Super

![](./svgs/SLM%20Text%20Generation%20Rate.svg)
=== "LLM / SLM"

Small language models are generally defined as having fewer than 7B parameters *(Llama-7B shown for reference)*
For more data and info about running these models, see the [`SLM`](tutorial_slm.md) tutorial and [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/Figure-1.-LLM-performance-boost-on-Jetson-Orin-Nano-Super-Developer-Kit.png">

## Vision Language Models (VLM)
| Model | Jetson Orin Nano (original, tokens/sec) | Jetson Orin Nano Super (tokens/sec) | Perf Gain (X) |
|--------------|:---------------------------------------:|:-----------------------------------:|:-------------:|
| Llama 3.1 8B | 14 | 19.14 | 1.37 |
| Llama 3.2 3B | 27.7 | 43.07 | 1.55 |
| Qwen2.5 7B | 14.2 | 21.75 | 1.53 |
| Gemma 2 2B | 21.5 | 34.97 | 1.63 |
| Gemma 2 9B | 7.2 | 9.21 | 1.28 |
| Phi 3.5 3B | 24.7 | 38.1 | 1.54 |
| SmolLM2 | 41 | 64.5 | 1.57 |
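
The "Perf Gain (X)" column is simply the ratio of the Super throughput to the original Orin Nano throughput. As a quick sanity check, here is a minimal sketch using the Llama 3.2 3B row above (any shell with `awk` will do):

```bash
# Perf gain = throughput (Orin Nano Super) / throughput (Orin Nano original)
# Llama 3.2 3B row from the table above: 43.07 / 27.7 ≈ 1.55
awk 'BEGIN { printf "Perf gain: %.2fx\n", 43.07 / 27.7 }'
```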

![](./svgs/Multimodal%20Streaming%20Rate.svg)
For running these benchmarks, this [script](https://github.com/dusty-nv/jetson-containers/blob/master/packages/llm/mlc/benchmark.sh) will launch a series of containers that download/build/run the models with MLC and INT4 quantization.

This measures the end-to-end pipeline performance for continuous streaming like with [Live Llava](tutorial_live-llava.md).
For more data and info about running these models, see the [`NanoVLM`](tutorial_nano-vlm.md) tutorial.
```bash
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh
bash jetson-containers/packages/llm/mlc/benchmark.sh
```
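
For repeatable results, it also helps to put the board in its maximum power mode and lock the clocks before benchmarking - a hedged sketch, since the available `nvpmodel` mode IDs vary by device and JetPack version:

```bash
sudo nvpmodel -q     # query the current power mode (mode IDs and names differ across Jetson devices)
sudo nvpmodel -m 0   # select the maximum-performance profile - verify the correct mode ID for your board
sudo jetson_clocks   # lock clocks at their maximum for the active power mode
```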

## Vision Transformers (ViT)
=== "Vision / Language Models"

![](./svgs/Vision%20Transformers.svg)
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/vision-language-models.png">

ViT performance data from [[1]](https://github.com/mit-han-lab/efficientvit#imagenet) [[2]](https://github.com/NVIDIA-AI-IOT/nanoowl#performance) [[3]](https://github.com/NVIDIA-AI-IOT/nanosam#performance)
| Model | Jetson Orin Nano (original) | Jetson Orin Nano Super | Perf Gain (X) |
|----------------|:---------------------------:|:----------------------:|:-------------:|
| VILA 1.5 3B | 0.7 | 1.06 | 1.51 |
| VILA 1.5 8B | 0.574 | 0.83 | 1.45 |
| LLAVA 1.6 7B | 0.412 | 0.57 | 1.38 |
| Qwen2 VL 2B | 2.8 | 4.4 | 1.57 |
| InternVL2.5 4B | 2.5 | 5.1 | 2.04 |
| PaliGemma2 3B | 13.7 | 21.6 | 1.58 |
| SmolVLM 2B | 8.1 | 12.9 | 1.59 |

## Stable Diffusion
=== "Vision Transformers"

![](./svgs/Stable%20Diffusion.svg)
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/vision-transformers.png">

## Riva
| Model | Jetson Orin Nano (original) | Jetson Orin Nano Super | Perf Gain (X) |
|-----------------------|:---------------------------:|:----------------------:|:-------------:|
| clip-vit-base-patch32 | 196 | 314 | 1.60 |
| clip-vit-base-patch16 | 95 | 161 | 1.69 |
| DINOv2-base-patch14 | 75 | 126 | 1.68 |
| SAM2 base | 4.42 | 6.34 | 1.43 |
| Grounding DINO | 4.11 | 6.23 | 1.52 |
| vit-base-patch16-224 | 98 | 158 | 1.61 |
| vit-base-patch32-224 | 171 | 273 | 1.60 |

![](./svgs/Riva%20Streaming%20ASR_TTS.svg)
#### Jetson AGX Orin

For running Riva benchmarks, see [ASR Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-performance.html) and [TTS Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tts/tts-performance.html).
=== "Large Language Models (LLM)"

## Vector Database
![](./svgs/LLM%20Text%20Generation%20Rate.svg)

![](./svgs/Vector%20Database%20Retrieval.svg)
For running LLM benchmarks, see the [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.

For running vector database benchmarks, see the [`NanoDB`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/vectordb/nanodb) container documentation.
=== "Small Language Models (SLM)"

![](./svgs/SLM%20Text%20Generation%20Rate.svg)

Small language models are generally defined as having fewer than 7B parameters *(Llama-7B shown for reference)*
For more data and info about running these models, see the [`SLM`](tutorial_slm.md) tutorial and [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.

=== "Vision Language Models (VLM)"

![](./svgs/Multimodal%20Streaming%20Rate.svg)

This measures the end-to-end pipeline performance for continuous streaming like with [Live Llava](tutorial_live-llava.md).
For more data and info about running these models, see the [`NanoVLM`](tutorial_nano-vlm.md) tutorial.

=== "Vision Transformers (ViT)"

![](./svgs/Vision%20Transformers.svg)

ViT performance data from [[1]](https://github.com/mit-han-lab/efficientvit#imagenet) [[2]](https://github.com/NVIDIA-AI-IOT/nanoowl#performance) [[3]](https://github.com/NVIDIA-AI-IOT/nanosam#performance)

=== "Stable Diffusion"

![](./svgs/Stable%20Diffusion.svg)

=== "Riva"

![](./svgs/Riva%20Streaming%20ASR_TTS.svg)

For running Riva benchmarks, see [ASR Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-performance.html) and [TTS Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tts/tts-performance.html).

=== "Vector Database"

![](./svgs/Vector%20Database%20Retrieval.svg)

For running vector database benchmarks, see the [`NanoDB`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/vectordb/nanodb) container documentation.
5 changes: 2 additions & 3 deletions docs/overrides/main.html
@@ -2,20 +2,19 @@
{% extends "base.html" %}

<!-- Announcement bar -->
{#
{% block announce %}
<style>
.md-announce a { color: #76b900; text-decoration: underline;}
.md-announce a:focus { color: hsl(82, 100%, 72%); text-decoration: underline; }
.md-announce a:hover { color: hsl(82, 100%, 72%); text-decoration: underline;}
</style>
<div class="md-announce">Check out the new <a href="tutorial_jetson-copilot.html">Jetson Copilot</a> and <a href="agent_studio.html">Agent Studio</a> tools for building your own bots!</div>
<div class="md-announce">⛄❄️ <b>Jetson Orin Nano Super</b> now available for $249 &nbsp;<span style="opacity: 0.875">(up to 1.7X gains through <a href="https://developer.nvidia.com/embedded/jetpack">JetPack</a> update, see our <a href="https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost/?ncid=so-othe-293081-vt48" target="_blank">blog</a> for more info)</span></div>
<!--<div class="md-announce">The next research group meeting is on <a href="research.html#meeting-schedule">May 15th</a> at 9am PT! Catch up on the <a href="research.html#past-meetings">recordings</a> of the recent meetings.</div>-->
<!--<div class="md-announce">Congratulations to all the winners and participants of the <a href="https://blogs.nvidia.com/blog/glados-robot-hackster/" target="_blank">Hackster.io AI Innovation Challenge!</a></div>-->
<!--<div class="md-announce">Microsoft's open <a href="https://blogs.nvidia.com/blog/microsoft-open-phi-3-mini-language-models/" target="_blank">Phi-3 Mini</a> language models are out! Try them today on Jetson with <a href="tutorial_ollama.html">ollama</a>.</div>-->

{% endblock %}
#}

{% block scripts %}
<!-- OneTrust Cookies Consent Notice start for www.jetson-ai-lab.com -->
<script src="https://cdn.cookielaw.org/scripttemplates/otSDKStub.js" type="text/javascript" charset="UTF-8" data-domain-script="018e2d65-efdf-7071-b793-f15ccf25c234" ></script>
16 changes: 13 additions & 3 deletions docs/research.md
@@ -10,9 +10,11 @@ The Jetson AI Lab Research Group is a global collective for advancing open-sourc

There are virtual [meetings](#meeting-schedule) that anyone is welcome to join, offline discussion on the [Jetson Projects](https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-projects/78){:target="_blank"} forum, and guidelines for upstreaming open-source [contributions](#contribution-guidelines).

!!! abstract "Next Meeting - 12/10"
!!! abstract "Next Meeting - 1/7"
<!--The next team meeting is on Tuesday, June 11<sup>th</sup> at 9am PST. View the [recording](#past-meetings) from the last meeting below.-->
The next team meeting is on Tuesday, December 10<sup>th</sup> at 9am PST - see the [invite](#meeting-schedule) below or click [here](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"} to join the meeting in progress.
With holiday schedules soon taking effect, we will reconvene in 2025 - thank you everyone for an amazing year!<br/><br/>
In the meantime, enjoy the time with your families, and feel welcome to keep in touch through the forums, Discord, or LinkedIn.<br/><br/>
The next team meeting is on Tuesday, January 7<sup>th</sup> at 9am PST - see the [invite](#meeting-schedule) below or click [here](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"} to join the meeting in progress.

## Topics of Interest

@@ -75,7 +75,7 @@ Ongoing technical discussions are encouraged to occur on the forums or GitHub Is

We'll aim to meet monthly or bi-weekly as a team in virtual meetings that anyone is welcome to join and speak during. We'll discuss the latest updates and experiments that we want to explore. Please remain courteous to others during the calls. We'll stick around after for anyone who has questions or didn't get the chance to be heard.

!!! abstract "Tuesday December 10<sup>th</sup> at 9am PST (12/10/24)"
!!! abstract "Tuesday January 7<sup>th</sup> at 9am PST (1/7/24)"

- Microsoft Teams - [Meeting Link](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"}
- Meeting ID: `264 770 145 196`
@@ -100,6 +100,14 @@ The agenda will be listed here beforehand - post to the forum to add agenda item

## Past Meetings

<details open><summary>Recordings Archive</summary>

<div><br>Due to a backlog in editing and posting the previous meetings, here is a link to the raw footage:<br>

<ul><li><code><a href="https://drive.google.com/drive/folders/18BC7o32jorx_LzZXx5wW0Io_nf1ZwO6X?usp=sharing" target="_blank">https://drive.google.com/drive/folders/18BC7o32jorx_LzZXx5wW0Io_nf1ZwO6X?usp=sharing</a></code><br></li></ul>

</details>

<details open><summary>November 12, 2024</summary>

<div><iframe width="850" height="476" src="https://www.youtube.com/embed/3QxRUdgnbJw" style="margin-top: 1em;" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe></div>
