Lucene impact index of the MS MARCO V2 segmented document corpus for uniCOIL (noexp) with title prepended.
This index was generated on 2022/08/08 at Anserini commit fbe35e on damiano with the following command:
nohup target/appassembler/bin/IndexCollection \
-collection JsonVectorCollection \
-input /scratch2/collections/msmarco/msmarco_v2_doc_segmented_unicoil_noexp_0shot_v2 \
-index indexes/lucene-index.msmarco-v2-doc-segmented-unicoil-noexp-0shot.20220808.4d6d2a/ \
-generator DefaultLuceneDocumentGenerator \
-threads 18 -impact -pretokenized -optimize \
>& logs/log.msmarco-v2-doc-segmented-unicoil-noexp-0shot.20220808.4d6d2a.txt &
In May 2024, the index was repackaged to adopt a more consistent naming scheme.
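
For readers unfamiliar with impact indexes, the following is a minimal conceptual sketch in Python of what the -impact and -pretokenized options imply at search time. It is not Anserini's code. It assumes each input document carries a "vector" field mapping pre-tokenized terms to integer weights, and that a query's term weights are expressed as token repetitions. The document ID, terms, and weights below are made up for illustration.

```python
from collections import Counter

def impact_score(query_weights, doc_vector):
    """Sum of query weight times stored document weight over shared terms."""
    return sum(w * doc_vector.get(term, 0) for term, w in query_weights.items())

# Hypothetical document in the assumed input layout (illustrative values only).
doc = {
    "id": "hypothetical_doc#0",
    "vector": {"lucene": 3, "index": 2, "impact": 5},
}

# Hypothetical pretokenized query; repeating a token raises its weight.
query = Counter(["impact", "index", "index"])

score = impact_score(query, doc["vector"])
print(score)  # 1*5 + 2*2 = 9
```

Under these assumptions the score is a plain dot product of integer weights, with no length normalization, which is why the terms must already be tokenized and weighted before indexing.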