Boston Housing Analysis is a hands-on project I completed to earn IBM certification for the "Statistics for Data Science with Python" course.
Project scenario: You are a data scientist with a housing agency in Boston, MA. You have been given access to a previously collected dataset on housing prices, derived from the U.S. Census Service, and asked to present insights to upper management. Based on your experience in statistics, what information can you provide to help them make an informed decision? Upper management would like insight into the following questions:
- Is there a significant difference in the median value of houses between those bounded by the Charles River and those that are not?
- Is there a difference in the median value of houses across groups defined by the proportion of owner-occupied units built before 1940?
- Can we conclude that there is no relationship between nitric oxide concentrations and the proportion of non-retail business acres per town?
- What is the impact of an additional unit of weighted distance to the five Boston employment centres on the median value of owner-occupied homes?

The project is broken into three tasks:

- Task 1: Load the dataset in a Jupyter Notebook using IBM Watson Studio.
- Task 2: Generate descriptive statistics and visualizations for upper management, using the seaborn library to create box plots, bar plots, scatter plots, and histograms.
- Task 3: Use the appropriate tests to answer the questions above (t-test for independent samples, ANOVA, Pearson correlation, and regression analysis).
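
As a rough illustration of Task 3, the four tests can be sketched with `scipy.stats`. This is a minimal sketch, not the notebook's actual code. It assumes the dataset uses the usual Boston housing column names (`MEDV`, `CHAS`, `AGE`, `NOX`, `INDUS`, `DIS`) and that each test answers the question in the same position in the list above. The age cut points (35 and 70), the 0.05 significance level, and the helper names `make_synthetic_data` and `run_tests` are all assumptions made for the example. The data is synthetic and stands in for the real file, so the printed numbers say nothing about Boston housing.

```python
import numpy as np
from scipy import stats


def make_synthetic_data(n=500, seed=0):
    """Build fake arrays named after the assumed dataset columns (not real data)."""
    rng = np.random.default_rng(seed)
    chas = rng.integers(0, 2, n)        # CHAS: 1 if the tract bounds the river
    age = rng.uniform(0, 100, n)        # AGE: % of units built before 1940
    dis = rng.uniform(1, 12, n)         # DIS: weighted distance to job centres
    indus = rng.uniform(0, 28, n)       # INDUS: % non-retail business acres
    nox = 0.4 + 0.01 * indus + rng.normal(0, 0.02, n)          # NOX
    medv = 20 + 1.0 * dis + 3.0 * chas + rng.normal(0, 2, n)   # MEDV
    return dict(medv=medv, chas=chas, age=age, nox=nox, indus=indus, dis=dis)


def run_tests(medv, chas, age, nox, indus, dis, alpha=0.05):
    """Run one test per management question and collect the key statistics."""
    medv, chas, age = np.asarray(medv, float), np.asarray(chas), np.asarray(age, float)
    nox, indus, dis = (np.asarray(a, float) for a in (nox, indus, dis))

    # Question 1: t-test for independent samples, river vs. non-river tracts.
    t_res = stats.ttest_ind(medv[chas == 1], medv[chas == 0])

    # Question 2: one-way ANOVA across three AGE groups (cut points assumed).
    groups = [medv[age <= 35], medv[(age > 35) & (age < 70)], medv[age >= 70]]
    f_res = stats.f_oneway(*groups)

    # Question 3: Pearson correlation between NOX and INDUS.
    r, r_p = stats.pearsonr(nox, indus)

    # Question 4: simple linear regression of MEDV on DIS; the slope is the
    # estimated change in MEDV for one additional unit of DIS.
    reg = stats.linregress(dis, medv)

    return {
        "t_test": {"statistic": float(t_res.statistic), "p_value": float(t_res.pvalue),
                   "reject_h0": bool(t_res.pvalue < alpha)},
        "anova": {"statistic": float(f_res.statistic), "p_value": float(f_res.pvalue),
                  "reject_h0": bool(f_res.pvalue < alpha)},
        "pearson": {"r": float(r), "p_value": float(r_p),
                    "reject_h0": bool(r_p < alpha)},
        "regression": {"slope": float(reg.slope), "intercept": float(reg.intercept),
                       "p_value": float(reg.pvalue)},
    }


if __name__ == "__main__":
    for name, result in run_tests(**make_synthetic_data()).items():
        print(name, result)
```

With the real dataset loaded into a pandas DataFrame, the same function could be called with the corresponding columns in place of the synthetic arrays. The course notebook may instead use `statsmodels` for the regression; `scipy.stats.linregress` is swapped in here only to keep the sketch short.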