- databases:
- Adult Data Set: Predict whether income exceeds $50K/yr based on census data. Also known as "Census Income" dataset.
Link: https://archive.ics.uci.edu/ml/datasets/Adult
A good reference for this dataset: https://github.com/PAIR-code/facets (see Facets Dive).
- THE MNIST DATABASE of handwritten digits.
Link: http://yann.lecun.com/exdb/mnist/
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-based learning applied to document recognition." Proceedings of the IEEE, 86(11):2278-2324, November 1998.
-
DTCPY: Decision tree classifier written from scratch in Python 3 using Jupyter Notebook.
-
RFCPY: Random forest classifier written from scratch in Python 3 using Jupyter Notebook.
-
RFRPY: Random forest regressor, built with scikit-learn and also written from scratch in Python 3 using Jupyter Notebook. Based on "Random Forest in Python: A Practical End-to-End Machine Learning Example" (William Koehrsen). Link: https://towardsdatascience.com/random-forest-in-python-24d0893d51c0
-
RFCLP: Random forest classifier written from scratch in Lisp, with a pruning step and quantization of numeric features in feature space.
-
RFRLP: Random forest regressor written from scratch in Lisp, using an improved version of the algorithm from RFRPY.
license: BSD 3-Clause License. Copyright (c) 2018, Israel Gonçalves de Oliveira. All rights reserved.
PhD dissertation, Gilles Louppe, July 2014. Defended on October 9, 2014.
arXiv: http://arxiv.org/abs/1407.7502
License: BSD 3 clause
Contact: Gilles Louppe (@glouppe)
Please cite using the following BibTeX entry:
@phdthesis{louppe2014understanding,
title={Understanding Random Forests: From Theory to Practice},
author={Louppe, Gilles},
school={University of Liege, Belgium},
year=2014,
month=10,
note={arXiv:1407.7502}
}
-
A good article about Random Forest and feature importance: How not to use random forest.
-
An Implementation and Explanation of the Random Forest in Python.
-
I strongly recommend this one first: The Simple Math behind 3 Decision Tree Splitting criterions.
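As a rough companion to the reading on splitting criteria, here is a minimal sketch of three commonly used impurity measures (Gini impurity, entropy, and variance) and the impurity reduction of a candidate split. The choice of these three measures and all function names (`gini`, `entropy`, `variance`, `split_gain`) are assumptions made for illustration; they are not taken from the notebooks in this repository or from the linked article.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


def variance(values):
    """Population variance of numeric targets (used for regression trees)."""
    n = len(values)
    if n == 0:
        return 0.0
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def split_gain(parent, left, right, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


if __name__ == "__main__":
    parent = ["a", "a", "b", "b"]
    # A split that separates the two classes perfectly removes all impurity.
    print(split_gain(parent, ["a", "a"], ["b", "b"], gini))
    print(split_gain(parent, ["a", "a"], ["b", "b"], entropy))
```

A tree builder would typically evaluate `split_gain` for every candidate threshold of every feature and keep the split with the largest gain; for regression, `variance` takes the place of the class-based measures.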