Data needs more cleaning #1

Open
snowch opened this issue Jan 8, 2018 · 0 comments

snowch commented Jan 8, 2018

Judging by the description, this stock appears to have been discarded. I think the InvoiceNo doesn't start with 'C' because these rows were never part of a customer order. This is only my best guess, though :)

| InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|
| 556690 | 23005 | printing smudges/thrown away | -9600 | 14/06/2011 10:37 | 0 | | United Kingdom |
| 556691 | 23005 | printing smudges/thrown away | -9600 | 14/06/2011 10:37 | 0 | | United Kingdom |

Now that I've had a better look through the data, I think I should ignore rows with a negative Quantity where the InvoiceNo doesn't start with 'C'. I should probably also ignore all records that have a UnitPrice of 0.
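As a rough sketch of those two rules (not a final implementation), something like the following could work. It assumes each row is a plain dict keyed by the column names above; the `is_clean` helper is a hypothetical name, and the sample rows are taken from the tables in this issue.

```python
def is_clean(row):
    """Return True if the row survives the two proposed rules."""
    is_c_invoice = str(row["InvoiceNo"]).startswith("C")
    # Rule 1: negative Quantity on an invoice that does not start with 'C'
    # (e.g. discarded stock that was never part of a customer order).
    if row["Quantity"] < 0 and not is_c_invoice:
        return False
    # Rule 2: UnitPrice of 0.
    if row["UnitPrice"] == 0:
        return False
    return True


# Sample rows copied from the tables in this issue (columns trimmed).
rows = [
    {"InvoiceNo": "556690", "StockCode": "23005", "Quantity": -9600, "UnitPrice": 0.0},
    {"InvoiceNo": "C581484", "StockCode": "23843", "Quantity": -80995, "UnitPrice": 2.08},
    {"InvoiceNo": "581483", "StockCode": "23843", "Quantity": 80995, "UnitPrice": 2.08},
]

kept = [r for r in rows if is_clean(r)]
print([r["InvoiceNo"] for r in kept])
```

Note that this only drops the discarded-stock row. 'C' records and the orders they reverse are left in place and need separate handling.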

Also, where there is a 'C' (cancellation) record, I think I should ignore the corresponding original record for the cancelled invoice, as here:

| InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|
| C581484 | 23843 | PAPER CRAFT , LITTLE BIRDIE | -80995 | 09/12/2011 09:27 | 2.08 | 16446 | United Kingdom |
| 581483 | 23843 | PAPER CRAFT , LITTLE BIRDIE | 80995 | 09/12/2011 09:15 | 2.08 | 16446 | United Kingdom |

I have a feeling that the above order may have been cancelled because the quantity entered was a mistake. It looks like the same may have been the case for:

| InvoiceNo | StockCode | Description | Quantity | InvoiceDate | UnitPrice | CustomerID | Country |
|---|---|---|---|---|---|---|---|
| C541433 | 23166 | MEDIUM CERAMIC TOP STORAGE JAR | -74215 | 18/01/2011 10:17 | 1.04 | 12346 | United Kingdom |
| 541431 | 23166 | MEDIUM CERAMIC TOP STORAGE JAR | 74215 | 18/01/2011 10:01 | 1.04 | 12346 | United Kingdom |
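
One possible way to pair each 'C' record with the order it reverses is sketched below. The matching rule is my assumption, not something the data guarantees: same CustomerID, StockCode and UnitPrice, an exactly opposite Quantity, and an order dated no later than the cancellation. The `drop_cancelled_pairs` name is hypothetical, and dropping the 'C' row together with its matched order (while leaving unmatched 'C' rows alone) is also an assumption.

```python
from collections import defaultdict
from datetime import datetime

DATE_FORMAT = "%d/%m/%Y %H:%M"  # matches e.g. "09/12/2011 09:27"


def drop_cancelled_pairs(rows):
    """Remove each 'C' row together with the earlier order it appears to reverse."""
    def when(row):
        return datetime.strptime(row["InvoiceDate"], DATE_FORMAT)

    # Index ordinary (non-'C', positive Quantity) rows by the assumed matching key.
    orders = defaultdict(list)
    for i, r in enumerate(rows):
        if not r["InvoiceNo"].startswith("C") and r["Quantity"] > 0:
            key = (r["CustomerID"], r["StockCode"], r["UnitPrice"], r["Quantity"])
            orders[key].append(i)

    dropped = set()
    for i, r in enumerate(rows):
        if not r["InvoiceNo"].startswith("C"):
            continue
        key = (r["CustomerID"], r["StockCode"], r["UnitPrice"], -r["Quantity"])
        for j in orders.get(key, []):
            # Use each order at most once, and only if it precedes the cancellation.
            if j not in dropped and when(rows[j]) <= when(r):
                dropped.update((i, j))
                break
    return [r for i, r in enumerate(rows) if i not in dropped]


# Rows copied from the tables in this issue (Description and Country omitted).
rows = [
    {"InvoiceNo": "556690", "StockCode": "23005", "Quantity": -9600,
     "InvoiceDate": "14/06/2011 10:37", "UnitPrice": 0.0, "CustomerID": None},
    {"InvoiceNo": "C581484", "StockCode": "23843", "Quantity": -80995,
     "InvoiceDate": "09/12/2011 09:27", "UnitPrice": 2.08, "CustomerID": "16446"},
    {"InvoiceNo": "581483", "StockCode": "23843", "Quantity": 80995,
     "InvoiceDate": "09/12/2011 09:15", "UnitPrice": 2.08, "CustomerID": "16446"},
    {"InvoiceNo": "C541433", "StockCode": "23166", "Quantity": -74215,
     "InvoiceDate": "18/01/2011 10:17", "UnitPrice": 1.04, "CustomerID": "12346"},
    {"InvoiceNo": "541431", "StockCode": "23166", "Quantity": 74215,
     "InvoiceDate": "18/01/2011 10:01", "UnitPrice": 1.04, "CustomerID": "12346"},
]

remaining = drop_cancelled_pairs(rows)
print([r["InvoiceNo"] for r in remaining])
```

This would miss partial cancellations, where the 'C' Quantity is smaller than the original order, so it is a starting point only.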