Port to LeRobot Dataset v2.0? #40

Open
ivelin opened this issue Dec 6, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@ivelin

ivelin commented Dec 6, 2024

Hi folks. Congrats on a great SOTA model!

Is there any interest in porting RDT-1B to the LeRobot API?

I see there is a Hugging Face model upload, but the dataset is not in the LeRobot Dataset v2.0 format.
https://huggingface.co/spaces/lerobot/visualize_dataset

I am looking at porting the data ingestion pipeline, but wanted to check if someone here is already doing that.

Thank you! 🙏🏼

@csuastt
Collaborator

csuastt commented Dec 10, 2024

We will consider making it a TODO:)

@ethan-iai ethan-iai added the enhancement New feature or request label Dec 17, 2024
@villekuosmanen

villekuosmanen commented Dec 30, 2024

Hey @csuastt and @ivelin.

I'm not sure exactly what you mean by "port to LeRobot", but I am planning to implement a mechanism to fine-tune the pre-trained RDT model on a dataset in the LeRobot Dataset v2.0 format. In practice this would mean implementing a LeRobot dataset loader similar to data/hdf5_vla_dataset.py.

I'm not sure exactly how much time I will have for this, but I expect to raise a PR within the next few weeks :) Feel free to assign this issue to me.
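
To make the loader idea above a little more concrete, here is a minimal, library-free sketch of the core sampling step such a loader might perform: pick a timestep in an episode and return the current state together with a fixed-length chunk of future actions. Everything in it is an assumption for illustration. The input keys `observation.state` and `action` are assumed frame fields, the output keys (`state`, `actions`, `action_mask`) and the function name `sample_chunk` are hypothetical, and none of this is taken from data/hdf5_vla_dataset.py or from the LeRobot API. Images and language instructions are left out.

```python
import random
from typing import Dict, List, Sequence

# One frame of an episode: feature name -> vector of floats (assumed layout).
Frame = Dict[str, List[float]]


def sample_chunk(
    episode: Sequence[Frame], chunk_size: int, rng: random.Random
) -> Dict[str, object]:
    """Pick a random timestep; return its state and the next `chunk_size` actions."""
    t = rng.randrange(len(episode))
    actions = [list(frame["action"]) for frame in episode[t : t + chunk_size]]
    valid = len(actions)  # fewer than chunk_size near the end of the episode
    # Pad short chunks by repeating the last real action.
    actions += [list(actions[-1]) for _ in range(chunk_size - valid)]
    return {
        "state": list(episode[t]["observation.state"]),
        "actions": actions,
        # 1 marks a real action, 0 marks padding.
        "action_mask": [1] * valid + [0] * (chunk_size - valid),
    }


# Toy episode with 5 frames and 1-dimensional state and action vectors.
episode = [
    {"observation.state": [float(i)], "action": [float(i) + 0.5]} for i in range(5)
]
sample = sample_chunk(episode, chunk_size=4, rng=random.Random(0))
print(sample)
```

A real loader would also need to decode camera frames and attach the task instruction, and would have to match whatever sample format the training code actually expects.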

@csuastt
Collaborator

csuastt commented Jan 1, 2025

Hi @villekuosmanen,

Thank you so much for taking the initiative to work on this! 🎉 We really appreciate your enthusiasm and willingness to contribute to the project. Your idea sounds fantastic, and we're excited to check your implementation.

No rush at all—take your time, and feel free to reach out if you have any questions or need any assistance along the way. We are eagerly looking forward to your PR! 😊

@villekuosmanen

I have a hacky data integration for my flavour of LeRobot Dataset v2 working now (it's on my fork if you are interested). I will make it more robust at some point, test it better, and raise a PR, but until then here is a super early checkpoint to demo it learning to (sort of) control the arms: https://x.com/VilleKuosmanen/status/1876697826169647412

How many fine-tuning steps would you expect the model to need before it can complete tasks at ~20% accuracy? In the paper you mention this:

> The model is pre-trained on 48 H100 80GB GPUs for a month, giving a total of 1M training iteration steps. It takes three days to fine-tune this model using the same GPUs for 130K steps.

Is this accurate? In my experiments so far it took around 12h on an RTX 4090 (yep I am compute poor) to reach 60k optimisation steps - is your definition of step different to an optimisation step, or do you use a very large batch size?

Also thanks for open sourcing the model and work!
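
For reference, the step rates implied by the two figures above can be worked out as follows. This is a back-of-the-envelope sketch: it assumes "three days" means 72 wall-clock hours and "12h" means exactly 12 hours, and the function name is purely illustrative.

```python
def steps_per_second(steps: int, hours: float) -> float:
    """Average number of optimisation steps per wall-clock second."""
    return steps / (hours * 3600)


# Fine-tuning figure quoted from the paper: 130K steps in three days on 48 H100s.
paper_rate = steps_per_second(130_000, 3 * 24)

# Figure reported in the comment: 60k steps in about 12 hours on an RTX 4090.
rtx4090_rate = steps_per_second(60_000, 12)

print(f"paper fine-tuning: {paper_rate:.2f} steps/s")
print(f"RTX 4090 run:      {rtx4090_rate:.2f} steps/s")
```

Under these assumptions the paper's run averages about 0.50 steps/s and the RTX 4090 run about 1.39 steps/s. A single consumer GPU stepping faster than 48 H100s suggests that one step is not the same amount of work in the two setups, for example because of a different batch size.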

@LBG21
Contributor

LBG21 commented Jan 8, 2025

Hi @villekuosmanen, we used a batch size of $32$ per H100 GPU, giving an effective total batch size of $32 \times 48 = 1536$ across 48 H100 GPUs. With smaller batch sizes, it takes longer to reach good fine-tuning performance; however, we have not yet evaluated RDT fine-tuned on an RTX 4090 (I think the batch size there would be 2).
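
To put those batch sizes side by side, the sketch below compares how many training samples each setup processes. It is only a rough illustration: the 130K and 60k step counts come from earlier in the thread, the batch size of 2 for the RTX 4090 is the guess made above rather than a confirmed setting, and the helper name is hypothetical. Samples seen is also only a crude proxy for fine-tuning quality.

```python
PER_GPU_BATCH = 32
NUM_GPUS = 48
effective_batch = PER_GPU_BATCH * NUM_GPUS  # 32 * 48 = 1536


def samples_seen(steps: int, batch_size: int) -> int:
    """Total training samples processed after `steps` optimisation steps."""
    return steps * batch_size


# 130K fine-tuning steps at the effective batch size of 1536.
paper_samples = samples_seen(130_000, effective_batch)

# 60k steps at an ASSUMED batch size of 2 on a single RTX 4090.
rtx4090_samples = samples_seen(60_000, 2)

print(f"effective batch size: {effective_batch}")
print(f"paper fine-tuning:    {paper_samples:,} samples")
print(f"RTX 4090 run:         {rtx4090_samples:,} samples")
print(f"ratio:                {paper_samples / rtx4090_samples:.0f}x")
```

With these numbers the 48-GPU fine-tuning run sees 199,680,000 samples against 120,000 for the RTX 4090 run, a factor of 1664, so equal step counts are far from equal amounts of training.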
