movie-predictor

Use regression to predict potential success of low-budget films based on film characteristics. What factors drive success and how are these factors different for success as defined by revenues vs. success as defined by recognition (Oscar nominations?).

For more information, see my blog post.

in this repo

movie-scraper.py scrapes data for all movies (16,100+) on boxofficemojo.com
oscar-scraper.py scrapes Oscars data from Newsday
movie-predictor-analysis.ipynb iPython notebook with regression analysis
data/ contains data (pkl files and CPI csv) used in iPython notebook analysis; also the default subdir where pkl files will be saved if scraping scripts are run
presentation/ contains pdf presentation of findings & recommendations

installation

clone this repo

$ git clone https://github.com/dianalam/movie-predictor.git

dependencies

Scripts were written in Python 2.7. You'll need the following modules:

matplotlib >= 1.5.1  
numpy >= 1.10.1  
pandas >= 0.17.1  
python-dateutil >= 2.4.2
scipy >= 0.16.0
seaborn >= 0.6.0
sklearn >= 0.17
statsmodels >= 0.6.1

To install modules, run:

$ pip install <module>

running

Scrape boxofficemojo. Note that script comes with the option to get a text alert via Twilio once script is done running. To use, you'll need to pass your Twilio Account SID, Auth Token, and phone number (with +1 at the beginning) as environment variables.

$ TWILIO_SID = <my-account-sid> TWILIO_TOKEN = <my-auth-token> PHONE_NUM = <my-phone-number> python movie-scraper.py

Scrape Newsday

$ python oscar-scraper.py

View/run regression analysis (cells already executed in file)

ipython notebook movie-predictor-analysis.ipynb

data sources

Thanks to:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

movie-predictor

in this repo

installation

clone this repo

dependencies

running

data sources

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
data		data
presentation		presentation
.gitignore		.gitignore
README.md		README.md
movie-predictor-analysis.ipynb		movie-predictor-analysis.ipynb
movie-scraper.py		movie-scraper.py
oscar-scraper.py		oscar-scraper.py

dianalam/movie-predictor

Folders and files

Latest commit

History

Repository files navigation

movie-predictor

in this repo

installation

clone this repo

dependencies

running

data sources

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages