Dataset preparation

Download the BDD100K dataset and create images from the videos

${BDD100k-Path} is a placeholder for the directory where you choose to store the dataset. Replace it with your actual path in the commands below.

Requirements

  • parallel
  • aria2c or wget
  • ffmpeg
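Before starting, it can help to confirm that the tools listed above are on your `PATH`. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and the only command it relies on is the POSIX `command -v` builtin.

```shell
# Hypothetical helper (not part of this repository).
# Prints each command that is not found on PATH, one per line,
# and returns non-zero if at least one is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage (uncomment to run):
#   missing_tools parallel ffmpeg
#   # Only one of the two downloaders is needed:
#   command -v aria2c >/dev/null 2>&1 || command -v wget >/dev/null 2>&1 \
#       || echo "install aria2c or wget"
```

If `missing_tools` prints nothing, all of the commands passed to it were found.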

Create dataset using multiprocessing

If you want to create the dataset quickly, split the work across multiple processes.

  1. download dataset

    bash get_data/download_videos.sh ${BDD100k-Path}
  2. unzip dataset

    bash get_data/unzip_videos.sh ${BDD100k-Path}
  3. create the directories for the images used in training

    bash get_data/mkdir_train_val_img.sh ${BDD100k-Path}
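  4. finally, create the images using multiple processes

    Example: 37 processes (one per machine), each handling 1900 items. The second argument is the start index and the third is the number of items to process.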

    1st node

    bash get_data/create_img.sh ${BDD100k-Path} 1 1900

    2nd node

    bash get_data/create_img.sh ${BDD100k-Path} 1901 1900

    :
    n-th node

    bash get_data/create_img.sh ${BDD100k-Path} (n-1)*1900+1 1900

    :
    37th node

    bash get_data/create_img.sh ${BDD100k-Path} 68401 1900
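The start index for each node follows the `(n-1)*1900+1` pattern shown above. As a rough sketch, the helper below computes that index and prints the command line for every node so you can review the commands before launching them. `start_index`, `print_node_commands`, and the path `/data/bdd` are hypothetical names used only for illustration; nothing here is executed against the dataset.

```shell
# Hypothetical helpers (not part of this repository).

# start_index NODE PER_NODE
# Prints the 1-based start index for the given node number.
start_index() {
    echo $(( ($1 - 1) * $2 + 1 ))
}

# print_node_commands DATASET_PATH NODES PER_NODE
# Prints (does not run) one create_img.sh command line per node.
print_node_commands() {
    n=1
    while [ "$n" -le "$2" ]; do
        echo "bash get_data/create_img.sh $1 $(start_index "$n" "$3") $3"
        n=$((n + 1))
    done
}

# Example usage (uncomment to run):
#   print_node_commands /data/bdd 37 1900
```

Each printed line can then be run on its own node, or handed to whatever job launcher you already use.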

Create dataset using a single process

If you do not mind how long dataset creation takes, run the following command:

bash process_bdd.sh ${BDD100k-Path}

Final data structure

Directory structure after completing the above instructions:

${BDD100k-Path}
 |-- bdd100k
 |   |-- videos # 1.5TB
 |   |   |-- train # 1.3TB
 |   |   |-- val # 184GB
 |   |-- images # 3.5TB
 |   |   |-- train # 3.1TB
 |   |   |-- val # 443GB
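
Going by the sizes listed above, the videos and images together need roughly 5 TB of disk space. Once the scripts have finished, a quick check such as the sketch below can confirm that the expected directories exist. `check_layout` is a hypothetical helper, not part of the repository, and it only tests for the directories shown in the tree.

```shell
# Hypothetical helper (not part of this repository).
# check_layout DATASET_PATH
# Prints "missing: <dir>" for each expected directory that does not exist
# and returns non-zero if any is missing.
check_layout() {
    rc=0
    for sub in videos/train videos/val images/train images/val; do
        if [ ! -d "$1/bdd100k/$sub" ]; then
            echo "missing: $1/bdd100k/$sub"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage (uncomment to run, replacing the path with your own):
#   check_layout /data/bdd && echo "layout looks complete"
```

This checks only that the directories are present, not that every video was converted.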