Hi, thank you for sharing your work and providing the implementation!
While reviewing the code, I noticed a few differences between the implementation and the description in the paper. The paper states that, during the 2D-to-3D construction process, a 3D model is used to extract features for each point. In the code, however, CLIP features appear to be used instead.
Additionally, the paper describes computing features from the top views (as outlined in OpenMask3D), but the code appears to compute CLIP features for the entire frame instead.
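To make the second point concrete, here is a minimal sketch of how I understand top-view processing: for each 3D mask, keep only the frames where the mask is most visible, then crop around its projected pixels before encoding, rather than encoding the whole frame. The function names `select_top_views` and `crop_box`, the `k` and `pad` parameters, and the input layout are my own assumptions for illustration; they are not taken from this repository or from OpenMask3D.

```python
import numpy as np

def select_top_views(visible_counts, k=5):
    """Return indices of the k frames where the mask has the most visible points.

    visible_counts[i] is the (assumed precomputed) number of mask points
    visible in frame i. Frames with zero visible points are not filtered out.
    """
    visible_counts = np.asarray(visible_counts)
    k = min(k, len(visible_counts))
    # Stable sort on negated counts gives descending order with ties by frame index.
    order = np.argsort(-visible_counts, kind="stable")
    return order[:k]

def crop_box(pixels_uv, image_hw, pad=0):
    """Padded bounding box (x0, y0, x1, y1) around projected mask pixels.

    pixels_uv is an (N, 2) array of (u, v) pixel coordinates; the box is
    clipped to the image and uses exclusive upper bounds, so it can be used
    directly as image[y0:y1, x0:x1].
    """
    uv = np.asarray(pixels_uv)
    h, w = image_hw
    x0 = max(int(uv[:, 0].min()) - pad, 0)
    y0 = max(int(uv[:, 1].min()) - pad, 0)
    x1 = min(int(uv[:, 0].max()) + pad + 1, w)
    y1 = min(int(uv[:, 1].max()) + pad + 1, h)
    return x0, y0, x1, y1
```

Under this reading, the image encoder would only see `image[y0:y1, x0:x1]` for each selected frame, whereas the code as I read it passes the full frame.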
Could you clarify if I might be misunderstanding something here? Thank you!