Fixed headers level and added diagram with white background #267

Merged 4 commits on Jan 7, 2025
32 changes: 16 additions & 16 deletions notebooks/en/fine_tuning_smol_vlm_sft_trl.ipynb

Large diffs are not rendered by default.

24 changes: 12 additions & 12 deletions notebooks/en/fine_tuning_vlm_dpo_smolvlm_instruct.ipynb
@@ -41,7 +41,7 @@
"id": "R-7khk_xFuZZ"
},
"source": [
-"# 1. Install Dependencies\n",
+"## 1. Install Dependencies\n",
"\n",
"Let’s start by installing the essential libraries we’ll need for fine-tuning! 🚀"
]
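For readers following along outside the rendered notebook, the dependency install for a TRL-based DPO fine-tune is typically along these lines (the package list here is an assumption for illustration, not the notebook's verbatim cell):

```shell
# Assumed dependency set for DPO fine-tuning with TRL; check the notebook cell for the exact pins
pip install -q -U transformers datasets trl peft bitsandbytes accelerate
```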
@@ -232,7 +232,7 @@
"id": "t-zGbB9OGTo6"
},
"source": [
-"# 3. Fine-Tune the Model using TRL\n",
+"## 3. Fine-Tune the Model using TRL\n",
"\n"
]
},
@@ -242,7 +242,7 @@
"id": "irI99bhxzpVM"
},
"source": [
-"## 3.1 Load the Quantized Model for Training ⚙️\n",
+"### 3.1 Load the Quantized Model for Training ⚙️\n",
"\n",
"\n",
"Let's first load a quantized version of the SmolVLM-Instruct model using bitsandbytes, and let's also load the processor. We'll use [SmolVLM-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM-Instruct)."
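For context on what this cell does, a 4-bit quantized load with bitsandbytes typically looks like the following sketch; the exact arguments (quantization type, compute dtype) are assumptions and may differ from the notebook's cell:

```python
import torch
from transformers import AutoModelForVision2Seq, AutoProcessor, BitsAndBytesConfig

model_id = "HuggingFaceTB/SmolVLM-Instruct"

# 4-bit NF4 quantization keeps the weight memory small enough for a single GPU
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```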
@@ -297,7 +297,7 @@
"id": "AwDDBxIqGjDV"
},
"source": [
-"## 3.2 Set Up QLoRA and DPOConfig 🚀\n",
+"### 3.2 Set Up QLoRA and DPOConfig 🚀\n",
"\n",
"In this step, we’ll configure [QLoRA](https://github.com/artidoro/qlora) for our training setup. **QLoRA** is a powerful fine-tuning technique designed to reduce the memory footprint, making it possible to fine-tune large models efficiently, even on limited hardware.\n",
"\n",
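As a rough sketch of the configuration this section sets up, a QLoRA adapter plus DPO training arguments are typically declared like this; every hyperparameter value and target-module name below is an assumption for illustration, not the notebook's exact configuration:

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA adapter config (rank, alpha, and target modules are illustrative assumptions)
peft_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# DPO training arguments (values are illustrative assumptions)
training_args = DPOConfig(
    output_dir="smolvlm-instruct-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    bf16=True,
)
```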
@@ -463,7 +463,7 @@
"id": "n2eD3ZwHzl-U"
},
"source": [
-"# 4. Testing the Fine-Tuned Model 🔍\n",
+"## 4. Testing the Fine-Tuned Model 🔍\n",
"\n",
"With our Vision Language Model (VLM) fine-tuned, it’s time to evaluate its performance! In this section, we’ll test the model using examples from the [HuggingFaceH4/rlaif-v_formatted](https://huggingface.co/datasets/HuggingFaceH4/rlaif-v_formatted) dataset. Let’s dive into the results and assess how well the model aligns with the preferred responses! 🚀\n",
"\n",
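For orientation, the evaluation step described here usually boils down to a short inference loop along these lines (a sketch assuming `model` and `processor` from the earlier cells; `"example.jpg"` is a hypothetical stand-in for a dataset image):

```python
from PIL import Image

# Stand-in inputs: `model` and `processor` come from the earlier loading steps,
# and "example.jpg" is a hypothetical local image standing in for a dataset sample.
image = Image.open("example.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]

prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```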
@@ -770,11 +770,7 @@
},
"outputs": [
{
-"output_type": "execute_result",
"data": {
-"text/plain": [
-"<IPython.lib.display.IFrame at 0x7926fa3b35e0>"
-],
"text/html": [
"\n",
" <iframe\n",
@@ -786,10 +782,14 @@
" \n",
" ></iframe>\n",
" "
+],
+"text/plain": [
+"<IPython.lib.display.IFrame at 0x7926fa3b35e0>"
+]
},
+"execution_count": 1,
"metadata": {},
-"execution_count": 1
+"output_type": "execute_result"
}
],
"source": [
@@ -804,7 +804,7 @@
"id": "Znti4_dk39av"
},
"source": [
-"# 6. Continuing the Learning Journey 🧑‍🎓️\n",
+"## 5. Continuing the Learning Journey 🧑‍🎓️\n",
"\n",
"Expand your knowledge of Vision Language Models and related tools with these resources:\n",
"\n",
@@ -838,4 +838,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
6,022 changes: 32 additions & 5,990 deletions notebooks/en/multiagent_rag_system.ipynb

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions notebooks/en/search_and_learn.ipynb
@@ -36,7 +36,7 @@
"id": "twKCzVIg71Xa"
},
"source": [
-"# 1. Install Dependencies\n",
+"## 1. Install Dependencies\n",
"\n",
"Let’s start by installing the [search-and-learn](https://github.com/huggingface/search-and-learn) repository! 🚀 \n",
"This repo is designed to replicate the experimental results and is not a Python pip package. However, we can still use it to generate our system. To do so, we’ll need to install it from source with the following steps:"
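The install-from-source steps referred to here are typically of this shape (a sketch; consult the repository README for the exact commands):

```shell
# Assumed install-from-source flow for a repo that is not published on PyPI
git clone https://github.com/huggingface/search-and-learn
cd search-and-learn
pip install -e .
```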
@@ -130,7 +130,7 @@
"id": "wX07zCTA8MWL"
},
"source": [
-"# 2. Setup the Large Language Model (LLM) and the Process Reward Model (PRM) 💬\n",
+"## 2. Setup the Large Language Model (LLM) and the Process Reward Model (PRM) 💬\n",
"\n",
"As illustrated in the diagram, the system consists of an LLM that generates intermediate answers based on user input, a [PRM model](https://huggingface.co/papers/2211.14275) that evaluates and scores these answers, and a search strategy that uses the PRM feedback to guide the subsequent steps in the search process until reaching the final answer.\n",
"\n",
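The generate/score/select loop described in this cell can be sketched with toy stand-in functions (all names below are hypothetical placeholders for the real LLM and PRM calls):

```python
from typing import List

def generate_candidates(question: str, n: int) -> List[str]:
    # Stand-in for the LLM: propose n intermediate candidate answers
    return [f"{question} -> candidate {i}" for i in range(n)]

def prm_score(question: str, answer: str) -> float:
    # Stand-in for the PRM: assign each candidate a scalar reward
    # (here a toy heuristic so the example is self-contained)
    return float(sum(map(ord, answer)))

def search(question: str, n: int = 4) -> str:
    # The search strategy uses PRM feedback to pick the best candidate
    candidates = generate_candidates(question, n)
    scores = [prm_score(question, a) for a in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index]
```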
@@ -395,7 +395,7 @@
"id": "xYtPn0_V_YRx"
},
"source": [
-"## 2.1 Instantiate the Question, Search Strategy, and Call the Pipeline\n",
+"### 2.1 Instantiate the Question, Search Strategy, and Call the Pipeline\n",
"\n",
"Now that we've set up the LLM and PRM, let's proceed by defining the question, selecting a search strategy to retrieve relevant information, and calling the pipeline to process the question through the models.\n",
"\n",
@@ -470,7 +470,7 @@
"id": "lsLHD_6C_15p"
},
"source": [
-"## 2.2 Display the Final Result\n",
+"### 2.2 Display the Final Result\n",
"\n",
"Once the pipeline has processed the question through the LLM and PRM, we can display the final result. This result will be the model's output after considering the intermediate answers and scoring them using the PRM.\n",
"\n",
@@ -606,7 +606,7 @@
"id": "4uCpYzAw_4o9"
},
"source": [
-"# 3. Assembling It All! 🧑‍🏭️\n",
+"## 3. Assembling It All! 🧑‍🏭️\n",
"\n",
"Now, let's create a method that encapsulates the entire pipeline. This will allow us to easily reuse the process in future applications, making it efficient and modular.\n",
"\n",
@@ -673,7 +673,7 @@
"id": "RWbOqkiKPVd2"
},
"source": [
-"## ⏳ 3.1 Comparing Thinking Time for Each Strategy\n",
+"### ⏳ 3.1 Comparing Thinking Time for Each Strategy\n",
"\n",
"Let’s compare the **thinking time** of three methods: `best_of_n`, `beam_search`, and `dvts`. Each method is evaluated using the same number of answers during the search process, measuring the time spent thinking in seconds and the number of generated tokens.\n",
"\n",
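Measuring the thinking time described here amounts to wrapping each strategy call in a wall-clock timer; a minimal stdlib sketch (with a hypothetical `dummy_strategy` standing in for a real search call) is:

```python
import time

def timed(fn, *args, **kwargs):
    # Measure the wall-clock "thinking time" of one search-strategy call
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Toy stand-in for a real search strategy
def dummy_strategy(question):
    return question.upper()

answer, seconds = timed(dummy_strategy, "what is 2+2?")
```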
@@ -694,7 +694,7 @@
"id": "2ROJwROGX8q-"
},
"source": [
-"### 1. **Best of n**\n",
+"#### 1. **Best of n**\n",
"\n",
"We’ll begin by using the `best_of_n` strategy. Here’s how to track the thinking time for this method:"
]
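Conceptually, best-of-n samples several independent answers and keeps the highest-scoring one; a self-contained toy sketch (the `gen` and `reward` callables are stand-ins for the LLM and the PRM) looks like:

```python
import random

def best_of_n(question, generate, score, n=8, seed=0):
    # Sample n independent answers and keep the one with the highest reward
    rng = random.Random(seed)
    candidates = [generate(question, rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins for the generator and the reward model
gen = lambda q, rng: rng.randint(0, 100)
reward = lambda answer: -abs(answer - 42)  # prefer answers close to 42

best = best_of_n("pick a number near 42", gen, reward, n=16)
```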
@@ -779,7 +779,7 @@
"id": "7S9AwP5lQvUN"
},
"source": [
-"### 2. **Beam Search**\n",
+"#### 2. **Beam Search**\n",
"\n",
"Now, let's try using the `beam_search` strategy."
]
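Unlike best-of-n, beam search keeps only the top-scoring partial answers at each step and extends them; a tiny self-contained sketch (the `expand` and `score` callables are toy stand-ins for token proposals and PRM scoring) is:

```python
def beam_search(start, expand, score, beam_width=2, depth=3):
    # Keep only the best `beam_width` partial sequences at each step
    beam = [start]
    for _ in range(depth):
        candidates = [seq + [tok] for seq in beam for tok in expand(seq)]
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beam[0]

# Toy stand-ins: "tokens" are digits, the score prefers larger sums
expand = lambda seq: [0, 1, 2]
score = lambda seq: sum(seq)

best = beam_search([], expand, score)
```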
@@ -886,7 +886,7 @@
"id": "GxBBUd7HQzhd"
},
"source": [
-"### 3. **Diverse Verifier Tree Search (DVTS)**\n",
+"#### 3. **Diverse Verifier Tree Search (DVTS)**\n",
"\n",
"Finally, let's try the `dvts` strategy."
]
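The idea behind DVTS is to split the search budget across several independent subtrees (for diversity) and expand each one greedily by verifier score; a toy self-contained sketch (not the repo's implementation, and `expand`/`score` are stand-ins) is:

```python
def dvts(start, expand, score, n_subtrees=4, depth=3):
    # Toy sketch of Diverse Verifier Tree Search: spend the budget on several
    # independent subtrees (diversity), expand each greedily by verifier
    # score, then return the best leaf found across all subtrees.
    roots = expand(start)[:n_subtrees]
    leaves = []
    for root in roots:
        seq = start + [root]
        for _ in range(depth - 1):
            seq = max((seq + [tok] for tok in expand(seq)), key=score)
        leaves.append(seq)
    return max(leaves, key=score)

# Toy stand-ins: "tokens" are digits, the verifier score is the running sum
best = dvts([], lambda seq: [0, 1, 2], score=sum)
```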
@@ -988,7 +988,7 @@
"id": "5PM9HHwBSYWk"
},
"source": [
-"## 🙋 3.2 Testing the System with a Simple Question\n",
+"### 🙋 3.2 Testing the System with a Simple Question\n",
"\n",
"In this final example, we’ll test the system using a straightforward question to observe how it performs in simpler cases. This allows us to verify that the system works as expected even for basic queries.\n",
"\n",
@@ -1073,7 +1073,7 @@
"id": "92znAyJ0AOPY"
},
"source": [
-"# 4. Continuing the Journey and Resources 🧑‍🎓️\n",
+"## 4. Continuing the Journey and Resources 🧑‍🎓️\n",
"\n",
"If you're eager to continue exploring, be sure to check out the original experimental [blog](https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute) and all the references mentioned within it. These resources will deepen your understanding of test-time compute, its benefits, and its applications in LLMs.\n",
"\n",