Since this code is based on ScanRefer, you can use the same 3D features. Please also refer to the ScanRefer data preparation.
Download the ScanQA dataset under
. Dataset format:
"scene_id": [ScanNet scene id, e.g. "scene0000_00"],
"object_id": [ScanNet object ids (corresponds to "objectId" in the ScanNet aggregation file), e.g. "[8]"],
"object_names": [ScanNet object names (corresponds to "label" in the ScanNet aggregation file), e.g. ["cabinet"]],
"question_id": [...],
"question": [...],
"answers": [...],
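As a rough illustration of the format above, the following sketch checks that an entry carries all six listed fields. The helper `check_entry` and the sample values for `question_id`, `question`, and `answers` are hypothetical and only serve the example; they are not taken from the dataset.

```python
import json

# Field names come from the format description above.
REQUIRED_FIELDS = ("scene_id", "object_id", "object_names",
                   "question_id", "question", "answers")


def check_entry(entry):
    """Return the required fields that are missing from one entry (hypothetical helper)."""
    return [field for field in REQUIRED_FIELDS if field not in entry]


# Illustrative entry only: the question, answer, and id values are made up.
sample = json.loads("""
{
  "scene_id": "scene0000_00",
  "object_id": [8],
  "object_names": ["cabinet"],
  "question_id": "example-question-id",
  "question": "Where is the cabinet?",
  "answers": ["next to the door"]
}
""")

missing = check_entry(sample)
print(missing)  # an empty list means every listed field is present
```

A check like this can be run over a downloaded file before training to catch truncated or mismatched downloads early.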
Download the preprocessed GloVe embeddings and put them under
.
Download the ScanNetV2 dataset and put (or link)
under (or to) data/scannet/scans/
(please follow the ScanNet instructions for downloading the ScanNet dataset).
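If the ScanNet scans already live elsewhere on disk, linking can be done with a symbolic link instead of copying. The sketch below is one way to do that; `link_scans` is a hypothetical helper, and the download location used in the demo is a temporary directory standing in for wherever the scans were actually stored.

```python
import os
import tempfile


def link_scans(source_dir, target_link):
    """Create target_link as a symlink to source_dir unless it already exists."""
    os.makedirs(os.path.dirname(target_link), exist_ok=True)
    if not os.path.lexists(target_link):
        os.symlink(os.path.abspath(source_dir), target_link,
                   target_is_directory=True)
    return os.path.realpath(target_link)


# Demo inside a temporary directory so nothing in a real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "downloads", "scans")  # hypothetical download location
    os.makedirs(source)
    target = os.path.join(tmp, "data", "scannet", "scans")
    resolved = link_scans(source, target)
    print(os.path.islink(target), resolved == os.path.realpath(source))
```

Calling the helper a second time is a no-op, so it is safe to rerun as part of a setup script. On Windows, creating symlinks may require extra privileges.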
Pre-process ScanNet data. A folder named
will be generated under data/scannet/
after running the following commands:
cd data/scannet/
python
Hint: To pre-process the ScanNet test data, change lines 16 and 17 to:
SCANNET_DIR = 'scans_test'
SCAN_NAMES = sorted([line.rstrip() for line in open('meta_data/scannetv2_test.txt')])
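The second of those two lines reads one scan name per line from a text file and sorts the result. A minimal sketch of the same logic, written with a context manager so the file is closed explicitly, might look as follows. The helper name `load_scan_names` and the scene names written to the temporary file are assumptions made for the demo, not part of the repository.

```python
import os
import tempfile


def load_scan_names(list_path):
    """Read one scan name per line, strip trailing whitespace, return them sorted."""
    with open(list_path) as f:
        return sorted(line.rstrip() for line in f)


# Demo with a throwaway list file; the real list lives under meta_data/.
with tempfile.TemporaryDirectory() as tmp:
    list_path = os.path.join(tmp, "scannetv2_test.txt")
    with open(list_path, "w") as f:
        f.write("scene0708_00\nscene0707_00\n")  # illustrative names only

    SCANNET_DIR = "scans_test"
    SCAN_NAMES = load_scan_names(list_path)
    print(SCANNET_DIR, SCAN_NAMES)
```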
Change the data paths in data/ marked with TODO accordingly.
(Optional) Pre-process the multiview features from ENet.
a. Download the ENet pretrained weights and put them under
b. Download and unzip the extracted ScanNet frames under
c. Extract the ENet features:
python scripts/
d. Project ENet features from ScanNet frames to point clouds:
python scripts/ --maxpool
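The `--maxpool` flag suggests that features from several frames are merged per point with a maximum. The sketch below shows one plausible reading of that idea, an element-wise max over the views in which a point is visible. It is an assumption about the general technique, not the repository's projection script, and `maxpool_views` and its inputs are hypothetical.

```python
def maxpool_views(view_features, visible):
    """Element-wise max of per-view features for each point.

    view_features[v][p] is the feature vector of point p as seen in view v.
    visible[v][p] is True when point p is actually seen in view v.
    Points seen in no view get an all-zero feature vector.
    """
    num_views = len(visible)
    num_points = len(visible[0])
    dim = len(view_features[0][0])
    pooled = []
    for p in range(num_points):
        seen = [view_features[v][p] for v in range(num_views) if visible[v][p]]
        if seen:
            # zip(*seen) groups the values of each channel across views.
            pooled.append([max(channel) for channel in zip(*seen)])
        else:
            pooled.append([0.0] * dim)
    return pooled


# Two views, three points, two feature channels (toy numbers).
view_features = [
    [[1.0, 5.0], [2.0, 0.0], [9.0, 9.0]],
    [[3.0, 4.0], [7.0, 7.0], [8.0, 8.0]],
]
visible = [
    [True, True, False],
    [True, False, False],
]
print(maxpool_views(view_features, visible))
# → [[3.0, 5.0], [2.0, 0.0], [0.0, 0.0]]
```

Max pooling keeps the strongest response per channel regardless of how many frames observe a point, which avoids the dilution that averaging over many weak views would cause.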