Added notebook for sentiment analysis #35

Open
wants to merge 1 commit into
base: main
from

Conversation

vinaybagade

No description provided.

@review-notebook-app

Check out this pull request on ReviewNB

@vinaybagade
Author

cc @VibhuJawa

@review-notebook-app

View / edit / reply to this conversation on ReviewNB

VibhuJawa commented on 2021-09-01T23:16:11Z
----------------------------------------------------------------

Line #1.    !pip install hvplot

It might be worth commenting this out so the notebook stays clean when it is run.

It is also worth adding s3fs.

Line #3. !!pip install s3fs
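
One possible alternative to commenting the installs out, offered only as a rough sketch: check which packages are missing before installing anything, so a re-run produces no pip output when everything is already present. The helper name `missing_packages` is hypothetical and not part of the notebook; it relies only on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical first cell: report only the packages that are actually absent.
to_install = missing_packages(["hvplot", "s3fs"])
if to_install:
    print("Run once: pip install " + " ".join(to_install))
```

When both packages are already installed, the cell prints nothing, which keeps the rendered notebook free of install logs.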



VibhuJawa commented on 2021-09-01T23:16:11Z
----------------------------------------------------------------

This code adds complexity.

I suggest switching to the snippet below. It is much faster, uses only cudf, and takes only 30 seconds.

%%time
input_bucket = 's3://amazon-reviews-pds'
input_path = '/parquet/product_category=Office_Products/*.parquet'

df = cudf.read_parquet(input_bucket+input_path,
                       storage_options={'anon': True},
                       columns = ['star_rating','review_body'])



VibhuJawa commented on 2021-09-01T23:16:12Z
----------------------------------------------------------------

Add a markdown section explaining that you are merging reviews here.


VibhuJawa commented on 2021-09-01T23:16:13Z
----------------------------------------------------------------

Add a section here explaining that you are going to tokenize the text.


VibhuJawa commented on 2021-09-01T23:16:13Z
----------------------------------------------------------------

Add a section to say that you are setting up a Data Loader.


VibhuJawa commented on 2021-09-01T23:16:14Z
----------------------------------------------------------------

Line #2.    from transformers import BertModel

Add a markdown section explaining that you are creating a model here.


VibhuJawa commented on 2021-09-01T23:16:14Z
----------------------------------------------------------------

Line #3.    def train_model(model, data_loader, loss_fn, optimizer, scheduler, n_examples):

Add a markdown section explaining the training loop.


VibhuJawa commented on 2021-09-01T23:16:15Z
----------------------------------------------------------------

Line #1.    ## Use some custom examples

Add a markdown section with custom examples.


@VibhuJawa VibhuJawa self-requested a review September 1, 2021 23:16
Member

@VibhuJawa VibhuJawa left a comment

Hey @vinaybagade. Thanks a lot for working on this. The example is really useful across the board.

I have requested some changes which you can view here.

2 participants