Sentiment Analysis


Dataset :

> Dataset download link : http://ai.stanford.edu/~amaas/data/sentiment/
> Large Movie Review Dataset
> The dataset contains 25,000 highly polar movie reviews for training and 25,000 for testing.
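As a rough sketch of how the reviews could be read, the extracted archive stores one review per `.txt` file under `aclImdb/<split>/pos` and `aclImdb/<split>/neg`. The function name `read_split` and the 0/1 label encoding are assumptions made for illustration; the notebook may load the data differently.

```python
import os

def read_split(root, split):
    """Return (texts, labels) for split 'train' or 'test'.

    Label encoding is an assumption: 0 = negative, 1 = positive.
    """
    texts, labels = [], []
    for label_name, label in (("neg", 0), ("pos", 1)):
        folder = os.path.join(root, split, label_name)
        # Sort for a reproducible order across runs.
        for name in sorted(os.listdir(folder)):
            if name.endswith(".txt"):
                with open(os.path.join(folder, name), encoding="utf-8") as f:
                    texts.append(f.read())
                labels.append(label)
    return texts, labels
```

After the archive is extracted, `read_split("aclImdb", "train")` would return the training reviews with their labels.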

Pre-Processing :

> Converted to lower case
> Removed punctuation
> Removed extra spaces
> Limited the length of each sentence to 30 - 200 words
> Mapped words to GloVe vectors
> Used PorterStemmer when a word is not found in the GloVe file
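The steps above could be sketched roughly as follows. This is not the notebook's code: the function names are hypothetical, punctuation is replaced by a space (an assumption), texts outside the 30 - 200 word range are dropped rather than truncated (also an assumption), and the stemmer is passed in as a plain callable so the sketch stays dependency-free.

```python
import re
import string

MIN_WORDS, MAX_WORDS = 30, 200  # length bounds from the list above

# Map every punctuation character to a space (assumption: replace, not delete).
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))

def clean_text(text):
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = text.lower().translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()

def lookup_vector(word, glove, stem):
    """Return the GloVe vector for word, falling back to its stem."""
    if word in glove:
        return glove[word]
    return glove.get(stem(word))  # None if the stem is missing too

def preprocess(text, glove, stem):
    """Return a list of word vectors, or None if the length is out of range."""
    tokens = clean_text(text).split()
    if not MIN_WORDS <= len(tokens) <= MAX_WORDS:
        return None  # assumption: out-of-range texts are dropped
    vectors = (lookup_vector(t, glove, stem) for t in tokens)
    return [v for v in vectors if v is not None]
```

With nltk installed, `stem` could be `PorterStemmer().stem` from `nltk.stem`, and `glove` a dict mapping each word to its 50-dimensional vector.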

Model :

> Simple 2-layer LSTM model implemented in TensorFlow 1.x
> Dynamic RNN with an LSTM cell of dimension 50 and a final dense layer with 2 units
> Total Num of Parameters :
> OPS per Word :
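The parameter count is left blank above, but it can be estimated from the stated dimensions. The sketch below assumes standard LSTM cells (four gates, no peepholes or projection), a 50-dimensional GloVe input that is not trained, and a dense layer fed by the last 50-dimensional hidden state. Because "2 layer" could mean two stacked LSTM layers or one LSTM layer plus the dense layer, both readings are shown; neither figure is confirmed by the notebook.

```python
def lstm_layer_params(input_dim, hidden_dim):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((input_dim + hidden_dim) * hidden_dim + hidden_dim)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

def total_params(embed_dim=50, hidden_dim=50, num_lstm_layers=2, num_classes=2):
    total = 0
    in_dim = embed_dim
    for _ in range(num_lstm_layers):
        total += lstm_layer_params(in_dim, hidden_dim)
        in_dim = hidden_dim  # later layers consume the previous hidden state
    return total + dense_params(hidden_dim, num_classes)

print(total_params(num_lstm_layers=2))  # 40502
print(total_params(num_lstm_layers=1))  # 20302
```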

Training :

> Optimizer : Adam
> Learning rate : default (0.001 for TensorFlow's Adam optimizer)
> Loss function : softmax_cross_entropy
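To make the loss concrete, here is a minimal pure-Python sketch of softmax cross-entropy for a single example with two output units. It illustrates the formula only; the notebook relies on TensorFlow's built-in op, and the function names here are hypothetical.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax using the log-sum-exp trick."""
    m = max(logits)
    log_total = math.log(sum(math.exp(x - m) for x in logits))
    return [x - m - log_total for x in logits]

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between a one-hot label and softmax(logits)."""
    return -sum(y * lp for y, lp in zip(one_hot_label, log_softmax(logits)))
```

For example, equal logits `[0.0, 0.0]` give a loss of `log(2)` for either class, the value expected from an untrained two-class model.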

Setup :

1. Download the dataset and GloVe matrix :

wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz -O aclImdb_v1.tar.gz
tar -xvzf aclImdb_v1.tar.gz
wget http://nlp.uoregon.edu/download/embeddings/glove.6B.50d.txt -O glove.6B.50d.txt

2. Install TensorFlow 1.15.0 and the other Python helper modules :

python3 -m pip install --upgrade pip
pip3 install tensorflow==1.15.0
pip3 install nltk
pip3 install pandas

Run the Training Notebook : Text_Sentiment_Analysis.ipynb
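Once step 1 has downloaded `glove.6B.50d.txt`, the vectors can be read into a dictionary. Each line of the file holds a word followed by its space-separated float components. The helper name `load_glove` is hypothetical, and the notebook may read the file another way.

```python
def load_glove(lines):
    """Parse GloVe text lines into a dict of word -> list of floats."""
    glove = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        glove[parts[0]] = [float(x) for x in parts[1:]]
    return glove
```

A typical call would be `with open("glove.6B.50d.txt", encoding="utf-8") as f: glove = load_glove(f)`, after which every entry holds a 50-element vector.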


Quality Metrics :

> Loss :
> Accuracy :
> Epochs :

For Inference, look into the Notebook : Text_Sentiment_Inference.ipynb