Merge pull request #7 from ControlAI/docs/initial-docs
docs: concepts initial draft
Showing 6 changed files with 171 additions and 5 deletions.
# Calibrators

Calibrators are one of the core concepts of the PyTorch Lattice library. The library currently implements two types of calibrators:

- [`CategoricalCalibrator`](/pytorch-lattice/api/layers/#pytorch_lattice.layers.CategoricalCalibrator): calibrates a categorical value through a mapping from a category to a learned value.
- [`NumericalCalibrator`](/pytorch-lattice/api/layers/#pytorch_lattice.layers.NumericalCalibrator): calibrates a numerical value through a learned piece-wise linear function.

Categorical Calibrator          | Numerical Calibrator
:------------------------------:|:----------------------------------------:
![](../img/thal_calibrator.png) | ![](../img/hours_per_week_calibrator.png)
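
For orientation, here is a minimal sketch of instantiating these layers directly. The exact constructor signatures (e.g. `input_keypoints`, `num_categories`) are assumptions here, so check the layers API reference:

```py
import numpy as np
import pytorch_lattice as pyl

# Assumed signature: a piece-wise linear function is learned over a fixed
# set of input keypoints spanning the feature's expected range.
numerical = pyl.layers.NumericalCalibrator(
    input_keypoints=np.linspace(0.0, 100.0, num=10)
)

# Assumed signature: one output value is learned per category.
categorical = pyl.layers.CategoricalCalibrator(num_categories=3)
```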

## Feature Calibrators

In a [calibrated model](model_types.md), the first layer is the calibration layer, which calibrates each feature using a calibrator that is learned per feature.

There are three primary benefits to using feature calibrators:

- Automated Feature Pre-Processing. Rather than relying on the practitioner to determine how best to transform each feature, feature calibrators learn the best transformations from the data.
- Additional Interpretability. Plotting calibrators as bar/line charts helps visualize how the model is understanding each feature. For example, if two input values for a feature have the same calibrated value, then the model considers those two input values equivalent with respect to the prediction.
- [Shape Constraints](shape_constraints.md). Calibrators can be constrained to guarantee certain expected input/output behavior. For example, you might apply a monotonicity constraint to a square footage feature to ensure that increasing square footage always increases predicted price. Or perhaps you want a unimodality constraint on a price feature such that increasing price first increases and then decreases predicted sales.

## Output Calibration

You can also use a `NumericalCalibrator` as the final layer of a model, which is called output calibration. This can provide additional flexibility to the overall model function.

Furthermore, you can use an output calibrator for post-training distribution matching to calibrate your model to a new distribution without retraining the rest of the model.
# Classifier

The [`Classifier`](/pytorch-lattice/api/classifier) class is a high-level wrapper around the calibrated modeling functionality that makes it extremely easy to fit a calibrated model to a classification task. The class uses declarative configuration and automatically handles the data preparation, feature configuration, model creation, and model training necessary for properly training a calibrated model.

## Initialization

The only required parameter for creating a classifier is the list of features to use:

```py
clf = pyl.Classifier(["list", "of", "features"])
```

You do not need to include all of the features present in your dataset. When you specify only a subset of the features, the classifier will automatically handle selecting only those features for training.
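
For example, a quick sketch (the column names and label array here are hypothetical):

```py
import numpy as np
import pandas as pd
import pytorch_lattice as pyl

# Hypothetical dataset with three columns; the classifier is configured
# with only two of them and will select just those columns during fit.
X = pd.DataFrame({
    "age": [25, 40, 31],
    "thal": ["normal", "fixed", "normal"],
    "unused_id": [1, 2, 3],
})
y = np.array([0, 1, 0])

clf = pyl.Classifier(["age", "thal"])
clf.fit(X, y)
```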

## Fitting

Fitting the classifier to your data is as simple as calling `fit(...)`:

```py
clf.fit(X, y)
```

You can further specify hyperparameters used for fitting, such as `epochs`, `batch_size`, and `learning_rate`. Just pass the values in as parameters:

```py
clf.fit(X, y, epochs=100, batch_size=512, learning_rate=1e-4)
```

When you call `fit`, the classifier will train a new model, overwriting any previously trained model. If you want to run a hyperparameter optimization job to find the best setting of hyperparameters, you can extract the trained model before calling `fit` again:

```py
models = []
for epochs, batch_size, learning_rate in hyperparameters:
    clf.fit(X, y, epochs=epochs, batch_size=batch_size, learning_rate=learning_rate)
    models.append(clf.model)
```

The benefit of extracting the model is that you can reuse the same classifier configuration; however, you can also always create a new classifier for each setting instead:

```py
clfs = []
for epochs, batch_size, learning_rate in hyperparameters:
    clf = pyl.Classifier(X.columns).fit(
        X, y, epochs=epochs, batch_size=batch_size, learning_rate=learning_rate
    )
    clfs.append(clf)
```

## Generate Predictions

You can generate predictions using the `predict(...)` function:

```py
probabilities = clf.predict(X)
logits = clf.predict(X, logits=True)
```

Just make sure that the input `pd.DataFrame` contains all of the features the classifier is expecting.

## Model Configuration

To configure the type of calibrated model the classifier uses, you can additionally provide a model configuration during initialization:

```py
model_config = pyl.model_configs.LinearConfig(use_bias=False)
clf = pyl.Classifier(["list", "of", "features"], model_config)
```

See [Model Types](model_types.md) for more information on the supported model types and [model_configs](/pytorch-lattice/api/model_configs) for more information on configuring these models in a classifier.

## Feature Configuration

When you first initialize a classifier, all features will be initialized using default values. You can further specify configuration options for a feature by retrieving the feature's configuration from the classifier and calling the corresponding function to set that option:

```py
clf.configure("feature").monotonicity("increasing").num_keypoints(10)
```

See [feature_configs](/pytorch-lattice/api/feature_config/) for all of the available configuration options.

## Categorical Features

If the value type for a feature in the dataset is not numerical (e.g. string), the classifier will automatically handle the feature as categorical, using all unique categories present in the dataset as the categories for the calibrator.

If you want the classifier to handle a discrete numerical value as a categorical feature, simply convert the values to strings:

```py
X["categorical_feature"] = X["categorical_feature"].astype(str)
```

Additionally, you can specify a list of categories to use as a configuration option:

```py
clf.configure("categorical_feature").categories(["list", "of", "categories"])
```

Any category in the dataset that is not present in the configured category list will be lumped together into a missing category bucket, which will also have a learned calibration. This can be particularly useful if there are categories in your dataset that appear in very few examples.
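
For instance, one way to route rare categories into that bucket (the threshold and column name here are illustrative):

```py
# Keep only categories that appear at least 100 times; all remaining
# categories fall into the missing category bucket, which gets its own
# learned calibration value.
counts = X["categorical_feature"].value_counts()
common = counts[counts >= 100].index.tolist()
clf.configure("categorical_feature").categories(common)
```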

## Saving & Loading

The `Classifier` class also provides easy save/load functionality so that you can save your classifiers and load them as necessary to generate predictions:

```py
clf.save("path/to/dir")
loaded_clf = pyl.Classifier.load("path/to/dir")
```
# Model Types

The PyTorch Lattice library currently supports two types of calibrated models:

- [`CalibratedLinear`](/pytorch-lattice/api/models/#pytorch_lattice.models.CalibratedLinear): a calibrated linear model combines calibrated features using a standard [linear](/pytorch-lattice/api/layers/#pytorch_lattice.layers.Linear) layer, optionally followed by an output calibrator.
- [`CalibratedLattice`](/pytorch-lattice/api/models/#pytorch_lattice.models.CalibratedLattice): a calibrated lattice model combines calibrated features using a [lattice](/pytorch-lattice/api/layers/#pytorch_lattice.layers.Lattice) layer, optionally followed by an output calibrator. The lattice layer can learn higher-order feature interactions, which can help increase model flexibility and thereby performance on more complex prediction tasks.
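
As a sketch of selecting between the two through a `Classifier` (the `LatticeConfig` name is an assumption by analogy with the `LinearConfig` shown earlier; verify it against the model_configs API reference):

```py
import pytorch_lattice as pyl

features = ["list", "of", "features"]

# Calibrated linear model; LinearConfig appears earlier in these docs.
linear_clf = pyl.Classifier(features, pyl.model_configs.LinearConfig())

# Calibrated lattice model; LatticeConfig is assumed by analogy with
# LinearConfig -- check the model_configs API reference before relying on it.
lattice_clf = pyl.Classifier(features, pyl.model_configs.LatticeConfig())
```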
# Plotting

The `plots` module provides useful plotting utility functions for visualizing calibrated models.

## Feature Calibrators

For any calibrated model, you can plot feature calibrators. The plotting utility will automatically determine the feature type and generate the corresponding calibrator visualization:

```py
pyl.plots.calibrator(clf.model, "feature")
```

Categorical Calibrator          | Numerical Calibrator
:------------------------------:|:----------------------------------------:
![](../img/thal_calibrator.png) | ![](../img/hours_per_week_calibrator.png)

The `calibrator(...)` function expects a calibrated model as its first argument, so you can use these plotting functions even if you train a calibrated model manually without the `Classifier` class.

## Linear Coefficients

For calibrated linear models, you can also plot the linear coefficients as a bar chart to better understand how the model is combining calibrated feature values:

```py
pyl.plots.linear_coefficients(clf.model)
```

![](../img/linear_coefficients.png)
# Shape Constraints

Shape constraints play a crucial role in making calibrated models interpretable by allowing users to impose specific behavioral rules on their machine learning models. These constraints help to reduce, or even eliminate, the impact of noise and inherent biases contained in the data.

Monotonicity constraints ensure that the relationship between an input feature and the output prediction consistently increases or decreases. Let's consider our house price prediction task once more: a monotonicity constraint on the square footage feature would guarantee that increasing the size of the property increases the predicted price, which matches real-world intuition.
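
As a sketch using the `Classifier` configuration API shown earlier (the feature names are illustrative):

```py
# Guarantee that predicted price never decreases as square footage grows.
clf = pyl.Classifier(["square_footage", "num_bedrooms"])
clf.configure("square_footage").monotonicity("increasing")
clf.fit(X, y)
```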

Unimodality constraints create a single peak in the model's output, ensuring that there is only one optimal value for a given input feature. For example, a feature for price used when predicting sales volume may be unimodal since lower prices generally lead to higher sales, but prices that are too low may indicate low quality.

Trust constraints define the relative importance of input features depending on other features. For instance, a trust constraint can ensure that a model predicting product sales relies more on the star rating (1-5) when the number of reviews is higher, which forces the model's predictions to better align with real-world expectations and rules.

Together, these shape constraints help create machine learning models that are both interpretable and trustworthy.

The library currently implements the [`Monotonicity`](/pytorch-lattice/api/enums/#pytorch_lattice.enums.Monotonicity) shape constraint, but we are working on releasing additional constraints soon.