Commit

update docs
Your Name committed Jan 9, 2025
1 parent 3da21d2 commit b5be3bb
Showing 4 changed files with 13 additions and 10 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -176,8 +176,8 @@ Visit our [documentation](https://verl.readthedocs.io/en/latest/index.html) to l
- Advance Usage and Extension
- [Ray API Design Tutorial](https://verl.readthedocs.io/en/latest/advance/placement.html)
- [Extend to other RL(HF) algorithms](https://verl.readthedocs.io/en/latest/advance/dpo_extension.html)
- - [Add models to FSDP backend](https://verl.readthedocs.io/en/latest/advance/fsdp_extension.html)
- - [Add models to Megatron-LM backend](https://verl.readthedocs.io/en/latest/advance/megatron_extension.html)
+ - [Add models with the FSDP backend](https://verl.readthedocs.io/en/latest/advance/fsdp_extension.html)
+ - [Add models with the Megatron-LM backend](https://verl.readthedocs.io/en/latest/advance/megatron_extension.html)


## Citation
@@ -201,3 +201,4 @@ Visit our [documentation](https://verl.readthedocs.io/en/latest/index.html) to l
## Publications Using veRL
- [Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization](https://arxiv.org/abs/2410.09302)
- [Flaming-hot Initiation with Regular Execution Sampling for Large Language Models](https://arxiv.org/abs/2410.21236)
+ - [Process Reinforcement Through Implicit Rewards](https://github.com/PRIME-RL/PRIME/)
4 changes: 2 additions & 2 deletions docs/advance/fsdp_extension.rst
@@ -1,6 +1,6 @@

- Add models to FSDP backend
- ===========================
+ Add models with the FSDP backend
+ ==================================

Model
--------------------------
11 changes: 6 additions & 5 deletions docs/advance/megatron_extension.rst
@@ -1,13 +1,13 @@
- Add models to Megatron-LM backend
- ===================================
+ Add models with the Megatron-LM backend
+ =========================================

Model
-----------

- The most challenging aspect to use Megatron-LM backend is implementing
+ The most challenging aspect to use the Megatron-LM backend is implementing
the models for training. Currently, we implement Llama model that
support data parallelism, tensor parallelism, pipeline parallelism (also
- vPP) and sequence parallelism. We also implement remove padding on Llama
+ vPP) and sequence parallelism. We also implement remove padding (sequence packing) on Llama
model, which can be found in `modeling_llama_megatron.py <https://github.com/volcengine/verl/blob/main/verl/models/llama/megatron/modeling_llama_megatron.py>`_.

To support other model, users are required to implement:
@@ -22,4 +22,5 @@ To support other model, users are required to implement:
(vLLM) model. Note that both the actor model and rollout model are
partitioned during runtime. So, it's advisable to map the model name
in actor model implementation. Otherwise, you may need an additional
- name mapping and even weight transformation.
+ name mapping and even weight transformation. The weight loader implementation
+ is in `megatron_weight_loaders.py <https://github.com/volcengine/verl/blob/main/verl/third_party/vllm/vllm_v_0_6_3/megatron_weight_loaders.py>`_.
3 changes: 2 additions & 1 deletion requirements.txt
@@ -9,4 +9,5 @@ pybind11
ray
tensordict<0.6
transformers
- vllm<=0.6.3
+ vllm<=0.6.3
+ wandb
