Using Reinforcement Learning to Guide Chains of Thought

Running sets of TRL and/or SFT jobs:

Launch a job set with:

./job_sets/launch_sets.py <job_set_name>

Check the status with:

./job_sets/check_status.py

The reinforcement learning (TRL) code is located in with_trl and the SFT code in approach_sft. Launch a single experiment with one of:

./with_trl/launch.py <experiment_name>

./approach_sft/launch.py <experiment_name>
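
To run several experiments one after another, the two launchers above can be wrapped in a small helper. This is only a sketch: it assumes each launcher takes the experiment name as its single positional argument, as shown above. The labels "trl" and "sft", the function names, and the experiment names in the usage note are hypothetical and not part of this repository.

```python
import subprocess

# Launcher paths as given in this README; the keys are hypothetical labels.
LAUNCHERS = {
    "trl": "./with_trl/launch.py",
    "sft": "./approach_sft/launch.py",
}


def build_launch_command(approach, experiment_name):
    """Return the argv list for one experiment (assumed: one positional argument)."""
    if approach not in LAUNCHERS:
        raise ValueError(f"unknown approach: {approach!r}")
    return [LAUNCHERS[approach], experiment_name]


def launch_all(approach, experiment_names, dry_run=True):
    """Print each command (dry run) or run the experiments sequentially."""
    commands = [build_launch_command(approach, name) for name in experiment_names]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first launcher that exits with an error.
            subprocess.run(command, check=True)
    return commands
```

For example, launch_all("trl", ["exp_a", "exp_b"]) prints the two commands without running them; pass dry_run=False from the repository root to execute them.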
