Official implementation of the paper:
RefCap: Zero-shot Video Corpus Moment Retrieval Based on Refined Dense Video Captioning
```bash
conda create -n refcap python=3.10
conda activate refcap
apt-get update
apt-get install ffmpeg
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```
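Optionally, a quick sanity check (not part of the repo) that the spaCy model installed correctly:

```python
import spacy

# Fails with an OSError if the "python -m spacy download" step was skipped.
nlp = spacy.load("en_core_web_sm")
print(nlp("a man is playing guitar in the park")[0].pos_)
```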
Follow GloVe to prepare the GloVe weights, and place them in the `meta` folder:

```
meta
|--glove.6B
|  |--glove.6B.300d.txt
```
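For reference, a minimal sketch of loading these vectors into memory; the path follows the layout above, and the parsing is the standard GloVe plain-text format (one word followed by its values per line):

```python
import numpy as np

def load_glove(path="meta/glove.6B/glove.6B.300d.txt"):
    """Load GloVe word vectors from a plain-text file into a dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors

glove = load_glove()
print(glove["video"].shape)  # (300,)
```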
```bash
# Add project root to PYTHONPATH (note that you need to do this each time you start a new session)
source setup.sh
# set ${video_root} and ${collection} correctly in scripts/construct.sh
# set ${caption_generator} to 'blip' in scripts/construct.sh
bash scripts/construct.sh
```

- Note: for the ActivityNet dataset, please merge all videos from splits `v1-2` and `v1-3` and place them in one folder (a sketch follows below).
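A minimal sketch of the merge step in the note above, with hypothetical download paths (`downloads/v1-2`, `downloads/v1-3`) standing in for wherever you fetched the two splits:

```python
import shutil
from pathlib import Path

video_root = Path("videos/activitynet")  # your ${video_root}
video_root.mkdir(parents=True, exist_ok=True)

# Copy every video file from both splits into the single merged folder.
for split_dir in ("downloads/v1-2", "downloads/v1-3"):  # hypothetical paths
    for video in Path(split_dir).iterdir():
        if video.is_file():
            shutil.copy2(video, video_root / video.name)
```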
Generated intermediate results will be placed in the `meta` folder (generated only once for a specified `collection` and `caption_generator`):

```
meta
|--captions                 // generated frame captions
|  |--${collection}_${caption_generator}.jsonl
|--framefeatures            // extracted BLIP frame features
|  |--${collection}.pt
|--scores                   // caption-frame similarity scores before denoising
|  |--${collection}_${caption_generator}.pt
|--glove.6B
|  |--glove.6B.300d.txt
```
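Each `.jsonl` file holds one JSON object per line; a minimal sketch for inspecting the first record (the `${collection}`/`${caption_generator}` values in the path are placeholders for your run):

```python
import json

path = "meta/captions/activitynet_blip.jsonl"  # hypothetical collection/generator
with open(path, encoding="utf-8") as f:
    first = json.loads(next(f))  # JSONL: one JSON object per line
print(first)
```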
Our generated intermediate results are provided via Baidu Cloud Disk and Google Drive.
Generated construction results will be placed in `results/construct/${collection}/${construct_name}` (one folder per construction):

```
results
|--construct
|  |--${collection}
|  |  |--${construct_name}
|  |  |  |--settings.json
|  |  |  |--denoised_capframe_scores.pt
|  |  |  |--denoised_captions.jsonl
|  |  |  |--prop_sims.pt
|  |  |  |--proposals.json
|  |  |  |--tree.json
```
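A minimal sketch for peeking at these outputs, assuming the `.pt` files are standard PyTorch serializations (the repo does not state this explicitly) and using hypothetical `${collection}`/`${construct_name}` values:

```python
import json
import torch

root = "results/construct/activitynet/my_construct"  # hypothetical names
with open(f"{root}/proposals.json", encoding="utf-8") as f:
    proposals = json.load(f)
scores = torch.load(f"{root}/prop_sims.pt", map_location="cpu")
print(type(proposals), type(scores))
```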
For MiniGPT, please follow the official guidance to download MiniGPT-4 and prepare its environment and weights. Then place our provided script `utils/genCaptions_minigpt.py` in the root directory of MiniGPT-4, and run it as:

```bash
python genCaptions_minigpt.py \
    --collection $collection \
    --save_dir $save_dir \
    --video_root $video_root \
    --split_json_path $split_json_path \
    --temperature $temperature
```
Generated captions will be saved in `${save_dir}/${collection}_minigpt.jsonl`.

Intermediate results are also provided via Baidu Cloud Disk and Google Drive.
Other operations are the same as for BLIP:

```bash
source setup.sh
# set ${video_root} and ${collection} correctly in scripts/construct.sh
# set ${caption_generator} to 'minigpt' in scripts/construct.sh
bash scripts/construct.sh
```
```bash
source setup.sh
# set ${collection}, ${construct_name}, ${retrieve_name} correctly in scripts/retrieve.sh
# construction results in results/construct/${collection}/${construct_name} will be used for retrieval
bash scripts/retrieve.sh
```
Retrieval results will be placed in `results/retrieve/${collection}/${retrieve_name}`:

```
results
|--construct                // construction results
|--retrieve
|  |--${collection}
|  |  |--${retrieve_name}
|  |  |  |--build_settings.json
|  |  |  |--eval_settings.json
|  |  |  |--metrics.json
|  |  |  |--vcmr_preds.json
```
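A minimal sketch for reading the evaluation output; the path uses hypothetical `${collection}`/`${retrieve_name}` values, and the key layout of `metrics.json` is whatever the repo wrote, so inspect it rather than relying on specific field names:

```python
import json

path = "results/retrieve/activitynet/my_retrieve/metrics.json"  # hypothetical names
with open(path, encoding="utf-8") as f:
    metrics = json.load(f)
print(json.dumps(metrics, indent=2))
```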
We provide construction and retrieval results for the ActivityNet/Charades datasets and the BLIP/MiniGPT VLLMs via Baidu Cloud Disk and Google Drive.