`${BDD100k-Path}` is the path to the dataset directory that you choose.
Required tools:

- parallel
- aria2c or wget
- ffmpeg
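These are standard command-line tools. A minimal install sketch for Debian/Ubuntu (an assumption; the package manager and package names may differ on your system):

```bash
# parallel: run jobs concurrently; aria2/wget: download the video archives;
# ffmpeg: extract image frames from the videos (presumably what the
# get_data scripts rely on). Package names below are for Debian/Ubuntu.
sudo apt-get update
sudo apt-get install -y parallel aria2 wget ffmpeg
```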
If you want to create the dataset quickly, split the work across multiple processes as follows:
- Download the dataset:
  `bash get_data/download_videos.sh ${BDD100k-Path}`
- Unzip the dataset:
  `bash get_data/unzip_videos.sh ${BDD100k-Path}`
- Create the directories for the images used in training:
  `bash get_data/mkdir_train_val_img.sh ${BDD100k-Path}`
- Finally, create the images using multiprocessing.
  Example: process 1900 videos per node across 37 processes (machines); the sketch after this list shows how the per-node commands are generated.
  - 1st node: `bash get_data/create_img.sh ${BDD100k-Path} 1 1900`
  - 2nd node: `bash get_data/create_img.sh ${BDD100k-Path} 1901 1900`
  - ...
  - n-th node: `bash get_data/create_img.sh ${BDD100k-Path} (n-1)*1900+1 1900`
  - ...
  - 37th node: `bash get_data/create_img.sh ${BDD100k-Path} 68401 1900`
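Each node takes a start index and a count: node n starts at video `(n-1)*1900+1` and processes 1900 videos. The following is a minimal sketch (not a script from this repository) that prints the command for every node so each one can be pasted onto the corresponding machine; the node count and chunk size are the assumed values from the example above:

```bash
#!/usr/bin/env bash
# Print one create_img.sh command per node, using the chunking from the
# example above: 37 nodes, 1900 videos each.
BDD100K_PATH=${1:?usage: $0 <BDD100k-Path>}
NUM_NODES=37   # number of machines/processes
CHUNK=1900     # videos handled per node

for n in $(seq 1 "${NUM_NODES}"); do
  start=$(( (n - 1) * CHUNK + 1 ))
  echo "node ${n}: bash get_data/create_img.sh ${BDD100K_PATH} ${start} ${CHUNK}"
done
```

For node 37 this yields a start index of 68401, matching the last line of the example.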
If you don't mind how long dataset creation takes, run the following command instead:
`bash process_bdd.sh ${BDD100k-Path}`
Data structure after completing the above instructions:

```
${BDD100k-Path}
|-- bdd100k
|   |-- videos   # 1.5TB
|   |   |-- train  # 1.3TB
|   |   |-- val    # 184GB
|   |-- images   # 3.5TB
|   |   |-- train  # 3.1TB
|   |   |-- val    # 443GB
```
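Once processing finishes, the layout and sizes can be sanity-checked with standard tools. A minimal sketch, assuming `BDD100K_PATH` holds the directory you used for `${BDD100k-Path}`:

```bash
# Substitute the directory you passed as ${BDD100k-Path}.
BDD100K_PATH=/path/to/your/dataset

# On-disk size of each split (should roughly match the figures above).
du -sh "${BDD100K_PATH}"/bdd100k/videos/train "${BDD100K_PATH}"/bdd100k/videos/val \
       "${BDD100K_PATH}"/bdd100k/images/train "${BDD100K_PATH}"/bdd100k/images/val

# Rough count of extracted frames per split (may take a while at this size).
find "${BDD100K_PATH}/bdd100k/images/train" -type f | wc -l
find "${BDD100K_PATH}/bdd100k/images/val" -type f | wc -l
```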