
CLIPGraphs

This repository contains the code for reproducing the results of CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities for Scene Rearrangement.

To reproduce the results for the CLIP ViT-H/14 model, run:

python get_results.py --split test
python get_results.py --split val

This will print the model's scores on the evaluation metrics and generate a file named GCN_model_output.txt containing the predicted object-room mappings.

To get the predicted object-room mappings for different language baselines, run the script:

python llm_baseline.py --lang_model glove

Here, lang_model can be any of the following language models: glove, roberta, clip_convnext_base, clip_RN50, or clip_ViT-H/14.
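The CLIP-based baselines score object-room affinity from the similarity of object and room embeddings. The repo's scripts handle the actual models and data; the following is only an illustrative sketch of the idea, using made-up 2-D embeddings and a hypothetical `room_affinities` helper:

```python
import numpy as np

def room_affinities(object_emb, room_embs):
    """Rank rooms for one object by cosine similarity of embeddings."""
    obj = object_emb / np.linalg.norm(object_emb)
    rooms = room_embs / np.linalg.norm(room_embs, axis=1, keepdims=True)
    return rooms @ obj  # one affinity score per room

# Toy example: the object's embedding points the same way as room 0,
# so room 0 should receive the highest affinity score.
scores = room_affinities(np.array([1.0, 0.0]),
                         np.array([[2.0, 0.0], [0.0, 3.0]]))
```

In the actual pipeline these embeddings would come from the chosen language or vision-language model; the ranking step itself is just this similarity comparison.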

This produces two files:

  • lang_model_mAP.txt contains the evaluation metric scores (mAP),
  • lang_model_output.txt contains the object-room mappings generated by the chosen language model.
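The mAP reported here is mean average precision over the per-object room rankings. The repo's own metric code is authoritative; this is just a minimal sketch of how average precision for one object is commonly computed (the function name and toy scores are illustrative):

```python
import numpy as np

def average_precision(scores, labels):
    """AP for one object: rank rooms by predicted affinity (descending),
    then average the precision at each correct room's rank."""
    order = np.argsort(-np.asarray(scores))
    ranked_labels = np.asarray(labels)[order]
    hits = 0
    precisions = []
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return float(np.mean(precisions)) if precisions else 0.0

# Toy example: affinity scores of one object against four rooms,
# where only the second room is a correct placement.
ap = average_precision([0.1, 0.7, 0.4, 0.2], [0, 1, 0, 0])
```

mAP is then the mean of these per-object AP values across all objects in the split.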

If you find CLIPGraphs useful for your work please cite:

@INPROCEEDINGS{agrawal2023clipgraphs,
            title={CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities for Scene Rearrangement},
            author={Ayush Agrawal and Raghav Arora and Ahana Datta and Snehasis Banerjee and Brojeshwar Bhowmick and Krishna Murthy Jatavallabhula and Mohan Sridharan and Madhava Krishna},
            booktitle={2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN)},
            pages={2604-2609},
            year={2023},
            eprint={2306.01540},
            archivePrefix={arXiv},
            primaryClass={cs.RO},
            doi={10.1109/RO-MAN57019.2023.10309325}}
