diff --git a/README.md b/README.md
index 531ebfaf..65377aab 100644
--- a/README.md
+++ b/README.md
@@ -1,17 +1,25 @@
+
+ Stackoverflow Logo +

+ Udacity Logo +

+
+
+- *This project basically testing fork repository during my first time learn at [Udacity](https://www.udacity.com/), and thanks to [jjrunner](https://github.com/jjrunner) for sharing his knowledge.*
 ### Table of Contents
-1. [Installation](#installation)
-2. [Project Motivation](#motivation)
-3. [File Descriptions](#files)
-4. [Results](#results)
-5. [Licensing, Authors, and Acknowledgements](#licensing)
+- **1.** [Installation](#installation)
+- **2.** [Project Motivation](#motivation)
+- **3.** [File Descriptions](#files)
+- **4.** [Results](#results)
+- **5.** [Licensing, Authors, and Acknowledgements](#licensing)
-## Installation
+## 1. Installation
 There should be no necessary libraries to run the code here beyond the Anaconda distribution of Python. The code should run with no issues using Python versions 3.*.
-## Project Motivation
+## 2. Project Motivation
 For this project, I was interestested in using Stack Overflow data from 2017 to better understand:
@@ -24,17 +32,17 @@ For this project, I was interestested in using Stack Overflow data from 2017 to
 The full set of files related to this course are owned by Udacity, so they are not publicly available here. However, you can see pieces of the analysis here. This README also serves as a template for students to follow in creating their own project README files.
-## File Descriptions
+## 3. File Descriptions
 There are 3 notebooks available here to showcase work related to the above questions. Each of the notebooks is exploratory in searching through the data pertaining to the questions showcased by the notebook title. Markdown cells were used to assist in walking through the thought process for individual steps. There is an additional `.py` file that runs the necessary code to obtain the final model used to predict salary.
-## Results
+## 4. Results
 The main findings of the code can be found at the post available [here](https://medium.com/@josh_2774/how-do-you-become-a-developer-5ef1c1c68711).
-## Licensing, Authors, Acknowledgements
+## 5. Licensing, Authors, Acknowledgements
 Must give credit to Stack Overflow for the data. You can find the Licensing for the data and other descriptive information at the Kaggle link available [here](https://www.kaggle.com/stackoverflow/so-survey-2017/data). Otherwise, feel free to use the code here as you would like!
diff --git a/Salary_ML.py b/Salary_ML.py
index a916a9e0..e034f59f 100644
--- a/Salary_ML.py
+++ b/Salary_ML.py
@@ -44,7 +44,7 @@ def clean_data(df):
     X = df
     return X, y
 
-def find_optimal_lm_mod(X, y, cutoffs, test_size = .30, random_state=42, plot=True):
+def find_optimal_lm_mod(X, y, cutoffs, test_size = .25, random_state=42, plot=True):
     '''
     INPUT
     X - pandas dataframe, X matrix