This project was built as a requirement for the course CSE 573 Semantic Web Mining at Arizona State University. The aim is to build a directional stock prediction system that predicts stock market price movement using online news. It is built using BERT and a Bidirectional LSTM (B-LSTM).
The model is trained on a news corpus of over 80,000 articles about Amazon and Apple, collected between January 2018 and February 2019. The dataset also contains stock price charts for the AMZN and AAPL stocks. We build a labelled dataset using the change in stock price over different time intervals.
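
The labelling idea described above (price change over different time intervals) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual labelling code: the helper names `price_at_or_after` and `label_article`, the toy prices, and the rule "label 1 if the price is higher after the interval, else 0" are all hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def price_at_or_after(times, prices, when):
    """Return the first recorded price at or after `when`.

    `times` must be sorted ascending. Returns None if the chart ends earlier.
    (Hypothetical helper, not taken from the project.)
    """
    i = bisect_left(times, when)
    return prices[i] if i < len(times) else None


def label_article(times, prices, published, horizon):
    """Label an article 1 (price went up) or 0 (flat or down) after `horizon`.

    Assumed rule: compare the first price at or after publication with the
    first price at or after publication + horizon. Returns None when the
    chart does not cover the interval.
    """
    start = price_at_or_after(times, prices, published)
    end = price_at_or_after(times, prices, published + horizon)
    if start is None or end is None:
        return None
    return 1 if end > start else 0


# Toy price chart: one quote every 30 minutes (made-up numbers).
times = [datetime(2018, 1, 2, 9, 30) + timedelta(minutes=30 * k) for k in range(4)]
prices = [100.0, 101.0, 102.5, 99.0]

# One article, labelled at several intervals (in minutes).
published = datetime(2018, 1, 2, 9, 40)
labels = {
    h: label_article(times, prices, published, timedelta(minutes=h))
    for h in (30, 60, 120)
}
print(labels)
```

The same article can receive different labels at different intervals, which is why the interval is treated as a parameter here rather than fixed.
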
- Clone the project. Make sure you have Python 3.7 installed on your system.
- Open a terminal and change directory to the CODE folder.
- Run the "pip install -r requirements.txt" command. This should install all the Python dependencies required to run the project. After installing the dependencies, open a Python terminal, import nltk, and run nltk.download('punkt').
- Download the dataset from here. You need to request access from us.
- ExtractedNewsSentenceWise.csv contains sentence-wise news for the Amazon and Apple stocks. The CHARTS folder contains the stock price charts. Download the CHARTS folder and all the CSVs.
- Install Jupyter by running the command "pip install jupyter".
- For training BERT, run BertTrain.
- Now you can train either simple models (RandomForest, Decision Trees, Logistic Regression) or Deep Learning Models (B-LSTM).
- For training simple models, run ArticleExtraction.
- For training deep learning models, run DeepLearningModel.
- We used accuracy, F-score, and ROC to evaluate our models. We attained an accuracy of 76% using BERT and B-LSTM. The plots are included in the EVALUATIONS folder.
- TODO: We want to evaluate BERT and LSTM on the sentence-wise dataset. We also want to curate the dataset using co-reference resolution.
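
For readers unfamiliar with the evaluation metrics named in the list above, here is a small pure-Python sketch of how accuracy, F-score, and the area under the ROC curve can be computed for binary up/down predictions. It is an illustration only: the function names and toy values are hypothetical, and the project itself may well compute these metrics with a library instead.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_score(y_true, y_pred, positive=1):
    """F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a random positive example is scored higher
    than a random negative one (ties count as half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = price up, 0 = price down.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]       # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

print(accuracy(y_true, y_pred), f_score(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and F-score depend on the chosen threshold (0.5 here), while the ROC-based measure uses the raw scores and is threshold-independent.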