Merge pull request #240 from dusty-nv/20241218-super
Super Nano updates
dusty-nv authored Dec 18, 2024
2 parents 7e866c3 + 0311bad commit a302e6f
Showing 3 changed files with 97 additions and 28 deletions.
104 changes: 82 additions & 22 deletions docs/benchmarks.md
@@ -5,44 +5,104 @@ hide:

# Benchmarks

## Large Language Models (LLM)
!!! admonition "WIP - Updating Results"

![](./svgs/LLM%20Text%20Generation%20Rate.svg)
Below is recent data from the Jetson Orin Nano Super benchmarks - see [this blog post](https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost/?ncid=so-othe-293081-vt48) for more info.

For running LLM benchmarks, see the [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.
We are currently collating these results across AGX Orin and Orin NX - for now, the previous results are archived [below](#large-language-models-llm).

## Small Language Models (SLM)
#### Jetson Orin Nano Super

![](./svgs/SLM%20Text%20Generation%20Rate.svg)
=== "LLM / SLM"

Small language models are generally defined as having fewer than 7B parameters *(Llama-7B shown for reference)*
For more data and info about running these models, see the [`SLM`](tutorial_slm.md) tutorial and [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/Figure-1.-LLM-performance-boost-on-Jetson-Orin-Nano-Super-Developer-Kit.png">

## Vision Language Models (VLM)
| Model | Jetson Orin Nano (original, tokens/sec) | Jetson Orin Nano Super (tokens/sec) | Perf Gain (X) |
|--------------|:---------------------------------------:|:-----------------------------------:|:-------------:|
| Llama 3.1 8B | 14 | 19.14 | 1.37 |
| Llama 3.2 3B | 27.7 | 43.07 | 1.55 |
| Qwen2.5 7B | 14.2 | 21.75 | 1.53 |
| Gemma 2 2B | 21.5 | 34.97 | 1.63 |
| Gemma 2 9B | 7.2 | 9.21 | 1.28 |
| Phi 3.5 3B | 24.7 | 38.1 | 1.54 |
| SmolLM2 | 41 | 64.5 | 1.57 |
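
The "Perf Gain (X)" column is simply the ratio of the Super throughput to the original Orin Nano throughput. As a quick sanity check, here is a minimal sketch using the Llama 3.2 3B row above (any shell with `awk` will do):

```bash
# Perf gain = throughput (Orin Nano Super) / throughput (Orin Nano original)
# Llama 3.2 3B row from the table above: 43.07 / 27.7 ≈ 1.55
awk 'BEGIN { printf "Perf gain: %.2fx\n", 43.07 / 27.7 }'
```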

![](./svgs/Multimodal%20Streaming%20Rate.svg)
For running these benchmarks, this [script](https://github.com/dusty-nv/jetson-containers/blob/master/packages/llm/mlc/benchmark.sh) will launch a series of containers that download/build/run the models with MLC and INT4 quantization.

This measures the end-to-end pipeline performance for continuous streaming like with [Live Llava](tutorial_live-llava.md).
For more data and info about running these models, see the [`NanoVLM`](tutorial_nano-vlm.md) tutorial.
```bash
git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh
bash jetson-containers/packages/llm/mlc/benchmark.sh
```
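
For repeatable results, it also helps to put the board in its maximum power mode and lock the clocks before benchmarking - a hedged sketch, since the available `nvpmodel` mode IDs vary by device and JetPack version:

```bash
sudo nvpmodel -q     # query the current power mode (mode IDs and names differ across Jetson devices)
sudo nvpmodel -m 0   # select the maximum-performance profile - verify the correct mode ID for your board
sudo jetson_clocks   # lock clocks at their maximum for the active power mode
```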

## Vision Transformers (ViT)
=== "Vision / Language Models"

![](./svgs/Vision%20Transformers.svg)
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/vision-language-models.png">

ViT performance data from [[1]](https://github.com/mit-han-lab/efficientvit#imagenet) [[2]](https://github.com/NVIDIA-AI-IOT/nanoowl#performance) [[3]](https://github.com/NVIDIA-AI-IOT/nanosam#performance)
| Model | Jetson Orin Nano (original) | Jetson Orin Nano Super | Perf Gain (X) |
|----------------|:---------------------------:|:----------------------:|:-------------:|
| VILA 1.5 3B | 0.7 | 1.06 | 1.51 |
| VILA 1.5 8B | 0.574 | 0.83 | 1.45 |
| LLAVA 1.6 7B | 0.412 | 0.57 | 1.38 |
| Qwen2 VL 2B | 2.8 | 4.4 | 1.57 |
| InternVL2.5 4B | 2.5 | 5.1 | 2.04 |
| PaliGemma2 3B | 13.7 | 21.6 | 1.58 |
| SmolVLM 2B | 8.1 | 12.9 | 1.59 |

## Stable Diffusion
=== "Vision Transformers"

![](./svgs/Stable%20Diffusion.svg)
<img src="https://developer-blogs.nvidia.com/wp-content/uploads/2024/12/vision-transformers.png">

## Riva
| Model | Jetson Orin Nano (original) | Jetson Orin Nano Super | Perf Gain (X) |
|-----------------------|:---------------------------:|:----------------------:|:-------------:|
| clip-vit-base-patch32 | 196 | 314 | 1.60 |
| clip-vit-base-patch16 | 95 | 161 | 1.69 |
| DINOv2-base-patch14 | 75 | 126 | 1.68 |
| SAM2 base | 4.42 | 6.34 | 1.43 |
| Grounding DINO | 4.11 | 6.23 | 1.52 |
| vit-base-patch16-224 | 98 | 158 | 1.61 |
| vit-base-patch32-224 | 171 | 273 | 1.60 |

![](./svgs/Riva%20Streaming%20ASR_TTS.svg)
#### Jetson AGX Orin

For running Riva benchmarks, see [ASR Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-performance.html) and [TTS Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tts/tts-performance.html).
=== "Large Language Models (LLM)"

## Vector Database
![](./svgs/LLM%20Text%20Generation%20Rate.svg)

![](./svgs/Vector%20Database%20Retrieval.svg)
For running LLM benchmarks, see the [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.

For running vector database benchmarks, see the [`NanoDB`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/vectordb/nanodb) container documentation.
=== "Small Language Models (SLM)"

![](./svgs/SLM%20Text%20Generation%20Rate.svg)

Small language models are generally defined as having fewer than 7B parameters *(Llama-7B shown for reference)*
For more data and info about running these models, see the [`SLM`](tutorial_slm.md) tutorial and [`MLC`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/llm/mlc) container documentation.

=== "Vision Language Models (VLM)"

![](./svgs/Multimodal%20Streaming%20Rate.svg)

This measures the end-to-end pipeline performance for continuous streaming like with [Live Llava](tutorial_live-llava.md).
For more data and info about running these models, see the [`NanoVLM`](tutorial_nano-vlm.md) tutorial.

=== "Vision Transformers (ViT)"

![](./svgs/Vision%20Transformers.svg)

ViT performance data from [[1]](https://github.com/mit-han-lab/efficientvit#imagenet) [[2]](https://github.com/NVIDIA-AI-IOT/nanoowl#performance) [[3]](https://github.com/NVIDIA-AI-IOT/nanosam#performance)

=== "Stable Diffusion"

![](./svgs/Stable%20Diffusion.svg)

=== "Riva"

![](./svgs/Riva%20Streaming%20ASR_TTS.svg)

For running Riva benchmarks, see [ASR Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/asr/asr-performance.html) and [TTS Performance](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/tts/tts-performance.html).

=== "Vector Database"

![](./svgs/Vector%20Database%20Retrieval.svg)

For running vector database benchmarks, see the [`NanoDB`](https://github.com/dusty-nv/jetson-containers/tree/master/packages/vectordb/nanodb) container documentation.
5 changes: 2 additions & 3 deletions docs/overrides/main.html
@@ -2,20 +2,19 @@
{% extends "base.html" %}

<!-- Announcement bar -->
{#
{% block announce %}
<style>
.md-announce a { color: #76b900; text-decoration: underline;}
.md-announce a:focus { color: hsl(82, 100%, 72%); text-decoration: underline; }
.md-announce a:hover { color: hsl(82, 100%, 72%); text-decoration: underline;}
</style>
<div class="md-announce">Check out the new <a href="tutorial_jetson-copilot.html">Jetson Copilot</a> and <a href="agent_studio.html">Agent Studio</a> tools for building your own bots!</div>
<div class="md-announce">⛄❄️ <b>Jetson Orin Nano Super</b> now available for $249 &nbsp;<span style="opacity: 0.875">(up to 1.7X gains through <a href="https://developer.nvidia.com/embedded/jetpack">JetPack</a> update, see our <a href="https://developer.nvidia.com/blog/nvidia-jetson-orin-nano-developer-kit-gets-a-super-boost/?ncid=so-othe-293081-vt48" target="_blank">blog</a> for more info)</span></div>
<!--<div class="md-announce">The next research group meeting is on <a href="research.html#meeting-schedule">May 15th</a> at 9am PT! Catch up on the <a href="research.html#past-meetings">recordings</a> of the recent meetings.</div>-->
<!--<div class="md-announce">Congratulations to all the winners and participants of the <a href="https://blogs.nvidia.com/blog/glados-robot-hackster/" target="_blank">Hackster.io AI Innovation Challenge!</a></div>-->
<!--<div class="md-announce">Microsoft's open <a href="https://blogs.nvidia.com/blog/microsoft-open-phi-3-mini-language-models/" target="_blank">Phi-3 Mini</a> language models are out! Try them today on Jetson with <a href="tutorial_ollama.html">ollama</a>.</div>-->

{% endblock %}
#}

{% block scripts %}
<!-- OneTrust Cookies Consent Notice start for www.jetson-ai-lab.com -->
<script src="https://cdn.cookielaw.org/scripttemplates/otSDKStub.js" type="text/javascript" charset="UTF-8" data-domain-script="018e2d65-efdf-7071-b793-f15ccf25c234" ></script>
16 changes: 13 additions & 3 deletions docs/research.md
@@ -10,9 +10,11 @@ The Jetson AI Lab Research Group is a global collective for advancing open-sourc

There are virtual [meetings](#meeting-schedule) that anyone is welcome to join, offline discussion on the [Jetson Projects](https://forums.developer.nvidia.com/c/agx-autonomous-machines/jetson-embedded-systems/jetson-projects/78){:target="_blank"} forum, and guidelines for upstreaming open-source [contributions](#contribution-guidelines).

!!! abstract "Next Meeting - 12/10"
!!! abstract "Next Meeting - 1/7"
<!--The next team meeting is on Tuesday, June 11<sup>th</sup> at 9am PST. View the [recording](#past-meetings) from the last meeting below.-->
The next team meeting is on Tuesday, December 10<sup>th</sup> at 9am PST - see the [invite](#meeting-schedule) below or click [here](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"} to join the meeting in progress.
With holiday schedules soon taking effect, we will reconvene in 2025 - thank you everyone for an amazing year!<br/><br/>
In the meantime, enjoy the time with your families, and feel welcome to keep in touch through the forums, Discord, or LinkedIn.<br/><br/>
The next team meeting is on Tuesday, January 7<sup>th</sup> at 9am PST - see the [invite](#meeting-schedule) below or click [here](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"} to join the meeting in progress.

## Topics of Interest

@@ -75,7 +75,7 @@ Ongoing technical discussions are encouraged to occur on the forums or GitHub Is

We'll aim to meet monthly or bi-weekly as a team in virtual meetings that anyone is welcome to join and speak during. We'll discuss the latest updates and experiments that we want to explore. Please remain courteous to others during the calls. We'll stick around after for anyone who has questions or didn't get the chance to be heard.

!!! abstract "Tuesday December 10<sup>th</sup> at 9am PST (12/10/24)"
!!! abstract "Tuesday January 7<sup>th</sup> at 9am PST (1/7/24)"

- Microsoft Teams - [Meeting Link](https://teams.microsoft.com/l/meetup-join/19%3ameeting_NTA4ZmE4MDAtYWUwMS00ZTczLWE0YWEtNTE5Y2JkNTFmOWM1%40thread.v2/0?context=%7b%22Tid%22%3a%2243083d15-7273-40c1-b7db-39efd9ccc17a%22%2c%22Oid%22%3a%221f165bb6-326c-4610-b292-af9159272b08%22%7d){:target="_blank"}
- Meeting ID: `264 770 145 196`
@@ -100,6 +100,14 @@ The agenda will be listed here beforehand - post to the forum to add agenda item

## Past Meetings

<details open><summary>Recordings Archive</summary>

<div><br>Due to a backlog in editing and posting the previous meetings, here is a link to the raw footage:<br>

<ul><li><code><a href="https://drive.google.com/drive/folders/18BC7o32jorx_LzZXx5wW0Io_nf1ZwO6X?usp=sharing" target="_blank">https://drive.google.com/drive/folders/18BC7o32jorx_LzZXx5wW0Io_nf1ZwO6X?usp=sharing</a></code><br></li></ul>

</details>

<details open><summary>November 12, 2024</summary>

<div><iframe width="850" height="476" src="https://www.youtube.com/embed/3QxRUdgnbJw" style="margin-top: 1em;" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe></div>
