Clone this repository and install packages:
```bash
git clone https://github.com/LaVi-Lab/TG-Vid.git
cd TG-Vid
conda create --name tg python=3.10
conda activate tg
pip install -r requirement.txt
```
Annotation JSON files for training and testing are provided on Huggingface.
Please download the corresponding videos from their official websites.
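For example, if you use the Huggingface CLI, a command along the following lines can fetch the annotation files; the repo ID below is only a placeholder, so substitute the actual annotation repo linked above:

```bash
# Hypothetical repo ID: replace <annotation_repo> with the Huggingface repo linked above.
# Assumes the annotations are hosted as a dataset-type repo.
huggingface-cli download <annotation_repo> --repo-type dataset --local-dir anno/
```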
Download the pretrained model weights:
| Pretrained Model Weight | Download Link |
|---|---|
| lmsys/vicuna-7b-v1.1 | Huggingface |
| EVA ViT-g | Link |
| QFormer | Link |
Note: you have to modify the paths to the data and pretrained model weights in the scripts, code, and configs. The easiest way is to search for `/path/to/`.
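For example, a recursive search from the repo root lists every placeholder that still needs a real path:

```bash
# Show every file and line that still contains the /path/to/ placeholder.
grep -rn "/path/to/" script/ config/ stllm/
```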
For quick usage, we have provided the checkpoints of TG-Vid-197K and TG-Vid-220K as follows:
| Model | Download Link |
|---|---|
| TG-Vid-197K | Huggingface |
| TG-Vid-220K | Huggingface |
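The inference scripts load the checkpoint from `output/${model}/pytorch_model.bin` (see the script excerpt below), so one option is to download a released checkpoint straight into that location. The repo ID here is a placeholder for the Huggingface link in the table, and the file name is assumed to match what the scripts expect:

```bash
# Hypothetical repo ID: replace <tg_vid_197k_repo> with the Huggingface repo from the table.
# Assumes the released checkpoint file is named pytorch_model.bin, as the inference scripts expect.
mkdir -p output/TG-Vid-197K
huggingface-cli download <tg_vid_197k_repo> pytorch_model.bin --local-dir output/TG-Vid-197K
```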
If you want to reproduce the training of TG-Vid, you can run the following scripts:
Note: we use an AWS-like platform (via mmengine.fileio) to store the training videos (see `stllm/datasets/datasets/image_video_itdatasets.py`, where `has_client = True`). If you store the training videos locally, please refer to ST-LLM.
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash script/train/train.sh TG-Vid-197K
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash script/train/train.sh TG-Vid-220K
```
Check `script/inference/*/test_*.sh` for more details:
```bash
model=$1
gpu=$2
output="test_output/mvbench/${model}/"
mkdir -p $output
...
    --cfg-path config/$model.yaml \
    --ckpt-path output/${model}/pytorch_model.bin \
...
```
Note: you have to modify the paths to the annotation files and videos in the scripts and code. The easiest way is to search for `/path/to/` (e.g., with the grep command shown above).
Take `model=TG-Vid-197K` as an example:
```bash
bash script/inference/mvbench/test_mvbench.sh TG-Vid-197K 0
bash script/inference/tempcompass/test_tempcompass.sh TG-Vid-197K 0
bash script/inference/nextqa/test_nextqa.sh TG-Vid-197K 0
bash script/inference/nextqa_atp_hard/test_nextqa_atp_hard.sh TG-Vid-197K 0
```
If you find this repo useful, please consider citing our paper:
```bibtex
@article{hu2024tgvid,
  title={Enhancing Temporal Modeling of Video LLMs via Time Gating},
  author={Zi-Yuan Hu and Yiwu Zhong and Shijia Huang and Michael R. Lyu and Liwei Wang},
  journal={arXiv preprint arXiv:2410.05714},
  year={2024}
}
```