Verhulst is an MIT-licensed Python library for evaluating binary logistic regressions fitted with scikit-learn.
scikit-learn takes a machine learning approach to data analysis and executes numerical routines using liblinear, which does not return certain intermediate results of the logistic regression fitting. statsmodels takes an econometric approach to data analysis but is not fully compatible with scikit-learn classifiers. Verhulst aims to fill this gap by providing a consistent API for the statistical analysis and plotting routines commonly used to evaluate the fit of a logistic regression.
Statistical Analysis
- Pearson Chi-Square Statistic and Deviance
- Hosmer–Lemeshow Tests
- Likelihood Measures
- Summary Measures
- Casewise Statistics
Plotting
- Diagnostic Plots
- Residual Plots
- Goodness-of-Fit Plots
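As a rough illustration of the first item in the statistical analysis list, the Pearson chi-square statistic and the deviance for ungrouped binary data can be computed directly from observed outcomes and fitted probabilities. The sketch below is not Verhulst's API: the function names and the sample data are hypothetical, and the fitted probabilities are assumed to come from something like the second column of scikit-learn's `predict_proba` output.

```python
import math


def pearson_chi_square(y, p):
    """Sum of squared Pearson residuals for ungrouped binary outcomes."""
    return sum((yi - pi) ** 2 / (pi * (1.0 - pi)) for yi, pi in zip(y, p))


def deviance(y, p):
    """Minus twice the Bernoulli log-likelihood of the fitted probabilities."""
    log_lik = sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )
    return -2.0 * log_lik


# Hypothetical data: observed 0/1 outcomes and fitted probabilities.
# Each probability must lie strictly between 0 and 1.
y = [1, 0, 1, 0]
p = [0.8, 0.2, 0.6, 0.4]

print(pearson_chi_square(y, p))
print(deviance(y, p))
```

Both functions accept any pair of equal-length sequences, so NumPy arrays returned by a fitted classifier can be passed in unchanged.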
scikit-learn fits logistic regressions using liblinear, which does not return the likelihoods of the null or fitted models. Verhulst therefore omits popular likelihood-based (Cox-Snell, Nagelkerke) and log-likelihood-based (McFadden) summary measures.
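For reference, the omitted measures are all functions of two quantities: the log-likelihood of the fitted model and that of the null (intercept-only) model. The sketch below shows the standard formulas under the assumption that both log-likelihoods are reconstructed from predicted probabilities, with the null model predicting the observed base rate for every case. It is not part of Verhulst, all names are hypothetical, and the results are only approximate for a regularized fit, which is scikit-learn's default.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of 0/1 outcomes y given probabilities p."""
    return sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )


def pseudo_r_squared(y, p):
    """McFadden, Cox-Snell, and Nagelkerke measures from fitted probabilities.

    Assumes y contains both classes, so the base rate is strictly
    between 0 and 1.
    """
    n = len(y)
    base_rate = sum(y) / n
    ll_fit = log_likelihood(y, p)
    # Null model: every case receives the observed base rate.
    ll_null = log_likelihood(y, [base_rate] * n)
    mcfadden = 1.0 - ll_fit / ll_null
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_fit) / n)
    # Nagelkerke rescales Cox-Snell by its maximum attainable value.
    nagelkerke = cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
    return {
        "mcfadden": mcfadden,
        "cox_snell": cox_snell,
        "nagelkerke": nagelkerke,
    }


# Hypothetical data: observed outcomes and fitted probabilities.
measures = pseudo_r_squared([1, 0, 1, 0], [0.8, 0.2, 0.6, 0.4])
print(measures)
```

When the fitted probabilities equal the base rate, all three measures are zero, which is a quick sanity check on any implementation.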
Some of the statistical tests and measures implemented in Verhulst are difficult to explain intuitively. Some are outdated and not recommended for general use. Nevertheless, they are included here because they remain in common use.
In general, these statistical tests and measures are most useful when initially building a model. Given their problematic interpretations and potential to mislead, however, many authors discourage routinely including them in published work.
Verhulst supports Python 3.2, 3.3, and 3.4.
pip can install Verhulst from GitHub:
pip install git+git://github.com/rpetchler/verhulst.git
The following packages are dependencies for Verhulst:
In addition, the statsmodels.distributions module is vendorized.
Documentation is written in reStructuredText and numpydoc and generated by Sphinx. The Makefile in docs/ contains targets for HTML and PDF documentation. The following command generates HTML documentation:
$ make html
Run a local webserver (e.g., python3 -m http.server) in the directory docs/_build/html/ in order to view the documentation in a web browser.
[1] Hosmer, David W., Jr., Stanley Lemeshow, and Rodney X. Sturdivant. Applied Logistic Regression. 3rd ed. New York: Wiley, 2013.