You can install all Python requirements by running pip install -r requirements.txt --user
- Python (This is developed using 3.6, so 3.6+ is recommended and may not work with 2.7+)
- SciPy for math functions (probably already installed)
- Keras for image feature extraction
- Scikit-learn for feature reduction (e.g. PCA)
- Multicore-TSNE for converting features to 2 dimensions via TSNE
- RasterFairy for transforming 2D points to grid
- Pillow for image tile generation
You can install a local server by running npm install
then npm start
- Node.js if you'd like to run the interface locally
You can run the full workflow with a single script which will execute each sub-script in the workflow (outlined in the next section)
python run.py \
-id "my_collection" \
-imagedir "path/to/images/*.jpg" \
-tile "128x128" \
-metdata "path/to/metadata.csv"
For the metadata.csv, by default, the script looks for columns "title", "url", and "filename" (the name of the image file with extension). You can pass in custom column names for filename, title, and URL which supports custom formatting
...
-fn "Filename" \
-title "{Name} ({Year})" \
-url "http://www.website.com/{Id}/"
First, given a directory of images, we will extract 4096 features using Keras and the VGG16 model with weights pre-trained on ImageNet, then reduce those to 256 features using PCA, then save those features to a compressed file:
python images_to_features.py \
-in "images/photographic_thumbnails/*.jpg" \
-pca 256 \
-out "output/photographic_features.p.bz2"
Then we will reduce those features even further to just two dimensions using TSNE and output the result to a csv file. You can speed this up by indicated the number of parallel jobs to run, e.g. -jobs 4
python features_to_tsne.py \
-in "output/photographic_features.p.bz2" \
-jobs 4 \
-out "data/photographic_tsne.csv"
Then we will convert those 2D points to a grid assignment using RasterFairy
python tsne_to_grid.py \
-in "data/photographic_tsne.csv" \
-out "output/photographic_grid.p"
Then generate a giant image matrix from the images and the grid data using a 128x128 thumbnail size:
python grid_to_image.py \
-in "output/photographic_grid.p" \
-tile "128x128" \
-out "output/photographic_matrix.jpg"
Finally, we will convert the giant image to tiles (in .dzi format):
python image_to_tiles.py \
-in "output/photographic_matrix.jpg" \
-tsize 128
Then convert metadata .csv to .json for it to be used by the interface. You can pass in custom column names for filename, title, and URL which supports custom formatting
python csv_to_json.py \
-in "data/photographic_images.csv" \
-im "images/photographic_thumbnails/*.jpg" \
-grid "output/photographic_grid.p" \
-fn "Filename" \
-title "{Name} ({Year})" \
-url "http://www.website.com/{Id}/"
You can view the result on a local server by running:
npm install
npm start