This repository contains code exercises from Chapter 2 of the book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems" by Aurélien Géron. The exercise focuses on predicting housing prices in California using various machine learning regression models.
This Jupyter notebook (`california_housing_prices.ipynb`) explores the California housing dataset, aiming to predict median house prices. It covers data analysis, preprocessing, model selection, evaluation, and fine-tuning using Python libraries such as Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn.
- Pandas
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
- Machine Learning
- Regression Models (Linear Regression, Decision Tree Regressor, Random Forest Regressor)
- Data Visualization
- Feature Engineering
- Data Preprocessing (Imputation, Encoding Categorical Variables, Scaling)
- Cross Validation
- Hyperparameter Tuning (Randomized Search)
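
As a rough illustration of how the preprocessing and cross-validation items above fit together, here is a minimal sketch. It does not reproduce the notebook's code: the data is synthetic, the column names (`median_income`, `total_rooms`, `ocean_proximity`) are assumptions chosen to resemble the housing dataset, and the helper `make_preprocessor` is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption: not the real California housing data).
rng = np.random.default_rng(42)
n = 200
X = pd.DataFrame({
    "median_income": rng.uniform(1, 10, n),
    "total_rooms": rng.uniform(500, 5000, n),
    "ocean_proximity": rng.choice(["INLAND", "NEAR BAY", "NEAR OCEAN"], n),
})
X.loc[::20, "total_rooms"] = np.nan  # inject missing values to exercise the imputer
y = 50_000 * X["median_income"] + rng.normal(0, 10_000, n)

num_cols = ["median_income", "total_rooms"]
cat_cols = ["ocean_proximity"]


def make_preprocessor():
    """Hypothetical helper: impute + scale numeric columns, one-hot encode categorical ones."""
    return ColumnTransformer([
        ("num", make_pipeline(SimpleImputer(strategy="median"), StandardScaler()), num_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
    ])


models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=42),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
}

# 5-fold cross-validated RMSE for each model; scikit-learn returns negated errors.
cv_rmse = {}
for name, model in models.items():
    pipe = Pipeline([("prep", make_preprocessor()), ("model", model)])
    scores = cross_val_score(pipe, X, y, scoring="neg_root_mean_squared_error", cv=5)
    cv_rmse[name] = -scores.mean()
    print(f"{name}: mean CV RMSE = {cv_rmse[name]:.0f}")
```

Wrapping the preprocessing in a `Pipeline` means the imputer and scaler are refit on each training fold, so the validation folds do not leak into the preprocessing statistics.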
- Initial exploration of the dataset, including summary statistics and visualizations.
- Handling missing values (`SimpleImputer`), encoding categorical variables, and scaling features (`StandardScaler`).
- Comparison of regression models such as Linear Regression, Decision Tree Regressor, and Random Forest Regressor.
- Evaluating model performance with mean squared error (MSE) and root mean squared error (RMSE).
- Using randomized search for hyperparameter tuning to optimize the Random Forest Regressor.
- Determining feature importance to understand which variables affect house prices the most.
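
The tuning, evaluation, and feature-importance steps listed above could look roughly like the following sketch. It is not the notebook's actual code: the data is synthetic, the column names and the search ranges are assumptions picked only for illustration, and the search is kept very small so it runs quickly.

```python
import numpy as np
import pandas as pd
from scipy.stats import randint
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption: not the real California housing data).
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "median_income": rng.uniform(1, 10, n),
    "housing_median_age": rng.uniform(1, 50, n),
    "noise": rng.normal(0, 1, n),  # deliberately uninformative column
})
y = (
    40_000 * X["median_income"]
    + 500 * X["housing_median_age"]
    + rng.normal(0, 5_000, n)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Randomized search samples a fixed number of hyperparameter combinations
# (n_iter) from the given distributions instead of trying every combination.
search = RandomizedSearchCV(
    RandomForestRegressor(random_state=42),
    param_distributions={
        "n_estimators": randint(20, 100),  # illustrative range (assumption)
        "max_features": randint(1, 4),     # samples 1, 2, or 3 features per split
    },
    n_iter=5,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=42,
)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out test set.
best_model = search.best_estimator_
mse = mean_squared_error(y_test, best_model.predict(X_test))
rmse = np.sqrt(mse)
print(f"best params: {search.best_params_}")
print(f"test MSE = {mse:.0f}, test RMSE = {rmse:.0f}")

# Impurity-based feature importances, sorted from most to least important.
importances = pd.Series(
    best_model.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

RMSE is the square root of MSE, so it is expressed in the same units as the target (here, a price), which usually makes it easier to interpret than MSE.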
- Python 3.x
- Jupyter Notebook
- Pandas
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
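- Create a virtual environment (Windows commands shown; the `venv` module ships with Python 3, so installing `virtualenv` separately is not required)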
py -m venv hands_on_ml
hands_on_ml\Scripts\activate
- Install necessary libraries
pip install pandas numpy matplotlib seaborn
pip install -U scikit-learn
- Clone the repository and navigate to its directory:
git clone https://github.com/asparmar14/Regression_model.git
cd Regression_model
- Launch Jupyter Notebook: `jupyter notebook california_housing_prices.ipynb`
- Follow the instructions in the notebook to execute and interact with the analysis.
Special thanks to Aurélien Géron for the exercise provided in his book, which serves as the foundation for this repository.
- Anshul Parmar
Feel free to explore the notebook and provide any feedback or suggestions!