Hopefully you’re using virtualenvwrapper, make a new project and clone this repo into the venv project directory.
mkproject datascibowl -p /usr/bin/python2.7
This has only been tested with python 2.7, using it as the python in the venv is therefore recommended.
The contents of this project are something like this:
ls -l
total 252 drwxr-xr-x 4 jotham users 4096 Feb 24 07:01 data -rw-r--r-- 1 jotham users 117 Feb 22 17:10 data_download.txt drwxr-xr-x 3 jotham users 4096 Mar 4 20:09 ipynb_tests drwxr-xr-x 3 jotham users 4096 Mar 3 22:01 ipynb_work -rw-r--r-- 1 jotham users 185519 Mar 3 22:10 notes.org drwxr-xr-x 8 jotham users 4096 Feb 22 17:33 pylearn2_fork -rw-r--r-- 1 jotham users 566 Feb 22 17:32 requirements.txt -rw-r--r-- 1 jotham users 613 Feb 22 17:58 theano_gpu_test.py -rw-r--r-- 1 jotham users 1009 Feb 22 20:35 theano_gpu_test.pyc -rw-r--r-- 1 jotham users 31011 Feb 23 20:59 theano_test.txt
ipynb_tests
has some gross hackery and early model tests.
ipynb_work
has scripts that should stand on their own for
fitting multilayer perceptron and convolutional neural network.
data
directory should contain data you want to use with
pylearn. E.g.
find ./data -maxdepth 2 -type d -ls
524382 4 drwxr-xr-x 4 jotham users 4096 Feb 24 07:01 ./data 1183533 4 drwxr-xr-x 4 jotham users 4096 Feb 24 07:02 ./data/datascibowl 1451882 4 drwxr-xr-x 123 jotham users 4096 Feb 22 17:25 ./data/datascibowl/train 1315239 3864 drwxr-xr-x 2 jotham users 3956736 Feb 22 17:23 ./data/datascibowl/test 1050702 4 drwxr-xr-x 2 jotham users 4096 Feb 22 17:12 ./data/mnist
Put the data in there as shown above. The class for the
datascibowl data requires the PYLEARN2_DATA_PATH
environment var. Enforce it however you like. I opt to put
it in ipython notebooks using
import os
os.environ["PYLEARN2_DATA_PATH"] = "/home/jotham/projects/2014-12-20_datascibowl/data"
But you could make a venv hook or something.
Required packages are in requirements.txt
.
cat requirements.txt
Cython==0.22 Jinja2==2.7.3 MarkupSafe==0.23 Pillow==2.7.0 PyYAML==3.11 Theano==0.6.0 argparse==1.3.0 backports.ssl-match-hostname==3.4.0.2 certifi==14.05.14 ipython==2.3.1 matplotlib==1.4.2 mock==1.0.1 nose==1.3.4 numexpr==2.4 numpy==1.9.1 pandas==0.15.2 pyaml==14.12.10 -e [email protected]:jo-tham/pylearn2.git@6a7d018b4c388617df57244e7df0b825839a1329#egg=pylearn2-origin/datascibowl pyparsing==2.0.3 python-dateutil==2.4.0 pytz==2014.10 pyzmq==14.4.1 scikit-image==0.10.1 scikit-learn==0.15.2 scipy==0.14.1 six==1.9.0 tables==3.1.1 tornado==4.0.2 wsgiref==0.1.2
If you want to contribute to the pylearn fork, remove it from requirements.txt and install using following instructions. Install everything else into the venv.
pip install -r requirements.txt
Clone the pylearn fork at jo-tham/pylearn2. Checkout the datascibowl branch. Install it into the active venv with.
cd pylearn2
python setup.py develop
Try running the contents of the ipython notebooks in
ipynb_work
.