Problem description

The sales team has concerns about ongoing product sales in one of the stores. Specifically, they asked us to redo our previous sales prediction analysis, but this time including the ‘product type’ attribute in predictions to understand how specific product types perform against each other. This will help the sales team better understand how types of products might impact sales across the enterprise.

Assignment objectives

• Predicting sales of four different product types: PC, Laptops, Netbooks and Smartphones

• Assessing the impact service reviews and customer reviews have on sales of different product types

o A chart that displays the impact customer and service reviews have on sales volume

Key Findings

• Product category is not a relevant variable to predict sales volume.

• 4 star review and positive service review are the best predictors of sales volume.

• The potential new products conflict on price with the existing ones.

• Because the distribution of sales volume is not normal and the sample is small, the predictions in this report are limited in scope and should be taken only as a reference. To support sound decision making, they should be combined with a descriptive analysis and a better understanding of the market.

Preprocessing

Impact of product type to predict volume

Before including product type among the variables used to predict sales volume, it is worth testing whether the two are in fact correlated. The table below shows the correlation of the dummy product categories with sales volume. The first conclusion is that there is no correlation, with the exception of Game Console. That high correlation is unreliable, as there are only two observations of this product type in the existing product list.

Table: Correlation between product type and volume
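The check described above can be sketched as follows. The original analysis was done in R; this is a minimal Python illustration, and the function names (`pearson`, `dummy_correlations`) and the sample values are assumptions made for the example, not the Blackwell data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def dummy_correlations(product_types, volumes):
    """Correlate each 0/1 product-type indicator (dummy) with sales volume."""
    result = {}
    for category in sorted(set(product_types)):
        indicator = [1 if t == category else 0 for t in product_types]
        result[category] = pearson(indicator, volumes)
    return result


# Made-up values for illustration only (hypothetical, not the real data set).
types = ["PC", "Laptop", "Netbook", "GameConsole",
         "PC", "Laptop", "Netbook", "GameConsole"]
volumes = [100, 120, 90, 2000, 110, 95, 105, 1800]

correlations = dummy_correlations(types, volumes)
for category, r in correlations.items():
    print(f"{category}: {r:.3f}")
```

In this toy sample, two high-volume observations of a single category are enough to produce a correlation close to 1, which is the kind of small-sample bias noted for Game Console.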

In addition, a simple decision tree fitted to predict volume shows that 4 star review and Positive Service Review are the most relevant variables, not the product category.

Decision tree: Relevant variables to predict sales volume

Attributes correlation

5 star review has a perfect correlation with volume. As this is impossible in practice, the attribute was considered erroneous and excluded. In addition, attributes that are highly correlated with each other (above 0.85) indicate collinearity and can add noise to the model.

The remaining indicators with high correlation to sales volume are: 4 star review and Positive Service Review.
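A minimal Python sketch of the collinearity screen, assuming the 0.85 threshold is applied to the absolute pairwise Pearson correlation between attributes. The helper `collinear_pairs`, the column `x3StarReviews`, and all values are hypothetical and used only for illustration; the original work was done in R.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    return cov / sqrt(spread)


def collinear_pairs(columns, threshold=0.85):
    """Return attribute pairs whose absolute correlation exceeds the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Made-up values for illustration only.
columns = {
    "x4StarReviews": [10, 20, 30, 40, 50],
    "x3StarReviews": [11, 19, 32, 41, 48],  # hypothetical column tracking the one above
    "Price": [500, 100, 900, 300, 700],
}

flagged = collinear_pairs(columns)
print(flagged)
```

Only one attribute of each flagged pair would normally be kept as a predictor; which one to keep is a modeling decision this sketch does not make.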

Outliers treatment

There are two outliers, which were excluded from the model as they don’t represent the standard sales behavior.

Graph: Distribution of sales volume
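The report does not state how the two outliers were identified. One common convention, shown here purely as an assumption, is the 1.5 × IQR rule; the function names and sample values in this Python sketch are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Made-up sales volumes for illustration only.
volumes = [12, 20, 32, 44, 52, 80, 120, 150, 210, 300, 5000, 9000]

kept = drop_outliers(volumes)
print(len(volumes) - len(kept), "observations removed")
```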

Creation of training and testing sets

Because the distribution of sales volume is not normal, the split between training and testing sets could strongly affect, and even jeopardize, the results. Using the set.seed() function, several seeds were tested in order to obtain similar training and testing distributions. The final split is displayed below; it still contains differences, but at a low level.

Graph: Distribution of training set

Graph: Distribution of testing set
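The seed search can be sketched as below. This is a Python analogue of the R workflow, not the original code: `random.Random(seed)` stands in for `set.seed()`, the 75/25 split ratio is an assumption, and similarity is measured here with a two-sample Kolmogorov-Smirnov distance, which the report does not claim to have used.

```python
import random


def split(values, seed, train_fraction=0.75):
    """Shuffle with a fixed seed and cut into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


def best_seed(values, seeds):
    """Seed whose split gives the most similar train/test distributions."""
    return min(seeds, key=lambda s: ks_distance(*split(values, s)))


# Made-up, right-skewed sales volumes for illustration only.
volumes = [4, 8, 12, 16, 20, 32, 40, 52, 80, 84, 120, 150,
           232, 300, 332, 396, 680, 836, 1232, 1576]

seed = best_seed(volumes, range(1, 51))
train, test = split(volumes, seed)
print(seed, round(ks_distance(train, test), 3))
```

Picking the seed by looking at the test set trades some independence of the test data for a more representative split, which is a defensible compromise with a sample this small.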

Modeling

The models were run on the training and testing sets; the results are summarized below (direct export from R in the appendix):

Table: R2 per method

Method	R2 - Model	R2 - Predicted	Average R2
Random Forest	0.935	0.784	0.8595
Gradient Boosted Machine	0.826	0.790	0.808
Support Vector Machine	0.768	0.721	0.7445
kNN	0.924	0.630	0.777
Linear	0.797	0.729	0.763

The two best models were selected according to the average R2 (Random Forest and Gradient Boosted Machine). They were both applied to the new product list in order to predict sales volume. The final sales volume is an average of both model results.
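The selection and averaging steps can be reproduced from the R2 table with a few lines of Python. The R2 values are copied from the table above; the variable names and the `ensemble` helper are illustrative assumptions.

```python
# (R2 on the training model, R2 on the test predictions), from the table above.
r2 = {
    "Random Forest": (0.935, 0.784),
    "Gradient Boosted Machine": (0.826, 0.790),
    "Support Vector Machine": (0.768, 0.721),
    "kNN": (0.924, 0.630),
    "Linear": (0.797, 0.729),
}

# Average of the two R2 values per method.
average_r2 = {name: (model + predicted) / 2
              for name, (model, predicted) in r2.items()}

# Keep the two methods with the highest average R2.
top_two = sorted(average_r2, key=average_r2.get, reverse=True)[:2]


def ensemble(volume_rf, volume_gbm):
    """Final prediction: plain average of the two selected models."""
    return (volume_rf + volume_gbm) / 2


print(top_two)
print(ensemble(467, 811))  # rf and gbm predictions for product 171
```

Ranking on the predicted (test) R2 alone would select the same two methods, with Gradient Boosted Machine (0.790) slightly ahead of Random Forest (0.784). The gap between kNN's model and predicted R2 (0.924 vs 0.630) is the largest in the table.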

Table: Rank of best products

Rank	Product Type	Product Num	Price	x4StarReviews	PositiveServiceReview	Volume - rf	Volume - gbm	Volume - avg
1	Netbook	180	329	112	28	1,114	1,189	1,152
2	Smartphone	194	49	26	14	610	891	750
3	PC	171	699	26	12	467	811	639
4	Laptop	173	1,199	10	11	197	803	500
5	Smartphone	193	199	26	8	311	401	356
6	PC	172	860	11	7	118	167	143
7	Smartphone	196	300	19	5	141	41	91
8	Smartphone	195	149	8	4	79	82	81
9	Netbook	181	439	18	5	117	32	74
10	Netbook	178	400	8	2	47	54	50
11	Laptop	175	1,199	2	2	38	42	40
12	Netbook	183	330	4	1	34	39	36
13	Laptop	176	1,999	1	-	6	39	23

Descriptive analysis on product type

Graph: Price comparison of new vs existing products

Looking at the graph above, it seems that Blackwell is adding products in the same price range as the current portfolio. Management should assess whether this will give clients more options and increase sales, or whether it will simply cannibalize existing products. On the other hand, in the laptop category the new product is substantially more expensive than the average. The company should analyze its positioning and evaluate whether customers are indeed looking for a higher-quality product at a higher price.

Gabrielcidral1/predicting-sales
