Use Linemod dataset #14
Comments
These are object NOCS images; you can find more information in hughw19/NOCS_CVPR2019#62, or you can generate them with BlenderProc. Note, however, that you don't need these NOCS maps to train our repo, i.e. shapo, if you already have the GT 6D poses and sizes for the rendered images (which I think Linemod already provides). Please see my answer here: #11 (comment). To train shapo, you would have to save the relevant pose, image and depth data as datapoints here. This information can be obtained in any form, i.e. the 6D pose or size estimated from GT NOCS, or just the GT 6D poses if available!
Thank you, I will have a look. I thought they were needed because you process all the images in the camera train folder.
In Linemod I have the GT 6D poses. Will I still need to generate the files in the folder sdf_rgb_pretrained?
Yes, you would still have to train the SDF and RGB MLPs as well as the respective latent codes per object (if your categories differ from the categories we train on, i.e. bottle, bowl, camera, mug and laptop), since our network requires them as a strong prior, which we regress and later optimize from a single-view observation. Please see thread #13 on how you can train these for your own dataset.
@zubair-irshad Hi, thanks for your work. Can you tell me what "the sizes of rendered images" means? I see the datapoints use *_norm.txt.
Also for the YCB dataset.
@peng25zhang The transformations are defined by R, T, s, where R is a 3×3 rotation matrix, T is a 3×1 translation vector and s is a one-dimensional scale value. The scale value is the scaling factor from the canonical shape (i.e. the canonical point cloud) to the observed instance, while the size is the 3-dimensional extent of the canonical point cloud. At inference time we are only given a single RGB-D observation, so we get the size information from our predicted shape, i.e. the extracted (canonical) point cloud, and we regress the R, T, s values with an MLP; hence we can transform the canonical point cloud into a camera-frame point cloud. Hope it helps!
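For readers following along, here is a minimal NumPy sketch of the R, T, s description above. It is not taken from the shapo code: the function names are hypothetical, and it assumes the common NOCS-style convention X_cam = s * R * X_can + T (scale, then rotate, then translate), which may differ from the ordering the repo actually uses.

```python
import numpy as np


def canonical_size(points_canonical):
    """Axis-aligned 3D extent (x, y, z) of a canonical point cloud of shape (N, 3)."""
    return points_canonical.max(axis=0) - points_canonical.min(axis=0)


def canonical_to_camera(points_canonical, R, T, s):
    """Map canonical points (N, 3) to the camera frame.

    Assumed convention (not confirmed by the thread): X_cam = s * R @ X_can + T.
    """
    # Right-multiplying by R.T applies R to every row vector at once.
    return s * points_canonical @ R.T + T.reshape(1, 3)


# Toy example: two opposite corners of a canonical box.
points = np.array([[-0.5, -0.25, -0.1],
                   [0.5, 0.25, 0.1]])

# 90-degree rotation about the z axis, 1 unit of translation along z, scale 2.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([0.0, 0.0, 1.0])
s = 2.0

size = canonical_size(points)                       # extent of the canonical shape
points_cam = canonical_to_camera(points, R, T, s)   # points in the camera frame
```

With GT 6D poses (as in Linemod), R and T would come from the annotations, while the size would be computed once per object from its canonical model.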
Hi,
Do you think it would be possible to run shapo with the Linemod dataset if I follow the tips from the issues about using custom datasets?
Thank you!