docstring changes for memmodel/preprocessing #791

Merged
208 changes: 194 additions & 14 deletions verticapy/machine_learning/memmodel/preprocessing.py
@@ -28,7 +28,7 @@

class Scaler(InMemoryModel):
"""
:py:class:`verticapy.machine_learning.memmodel.base.InMemoryModel` implementation of scalers.

Parameters
----------
@@ -100,14 +100,73 @@ def transform_sql(self, X: ArrayLike) -> list[str]:

class StandardScaler(Scaler):
"""
:py:class:`verticapy.machine_learning.memmodel.base.InMemoryModel` implementation of the standard scaler.

Parameters
----------
mean: ArrayLike
Model's features averages.
std: ArrayLike
Model's features standard deviations.

.. note:: Models in :py:mod:`verticapy.machine_learning.memmodel` are defined
entirely by their attributes. For example, the mean and
standard deviation of the features define a StandardScaler model.
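The note above can be made concrete with a small sketch. The helper below is an illustration of the arithmetic only, not VerticaPy's implementation: each feature value is centered by its mean and divided by its standard deviation.

```python
# Illustrative sketch (not VerticaPy's implementation) of standard scaling:
# each feature value is centered by the mean and divided by the
# standard deviation of that feature.
def standard_scale(rows, mean, std):
    return [
        [(x - m) / s for x, m, s in zip(row, mean, std)]
        for row in rows
    ]

# Same example values as used below: mean = [0.4, 0.1], std = [0.5, 0.2].
print(standard_scale([[0.45, 0.17]], [0.4, 0.1], [0.5, 0.2]))
```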

Examples
--------

**Initialization**

Import the required module.

.. ipython:: python

from verticapy.machine_learning.memmodel.preprocessing import StandardScaler

A StandardScaler model is defined by mean and standard deviation values. In this example, we will use the following:

.. ipython:: python

mean = [0.4, 0.1]
std = [0.5, 0.2]

Let's create a :py:class:`verticapy.machine_learning.memmodel.preprocessing.StandardScaler` model.

.. ipython:: python

model_sts = StandardScaler(mean, std)

Create a dataset.

.. ipython:: python

data = [[0.45, 0.17]]

**Making In-Memory Transformation**

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.StandardScaler.transform` method to transform the data.

.. ipython:: python

model_sts.transform(data)

**Deploy SQL Code**

Let's use the following column names:

.. ipython:: python

cnames = ['col1', 'col2']

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.StandardScaler.transform_sql`
method to get the SQL code needed to deploy the model using its attributes.

.. ipython:: python

model_sts.transform_sql(cnames)
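The expressions returned by ``transform_sql`` mirror the same arithmetic in SQL. The helper below is a hypothetical sketch of how such expressions could be assembled, not VerticaPy's actual code generation:

```python
# Hypothetical sketch: build one SQL expression per input column of the
# form (col - mean) / std. Not VerticaPy's actual code generation.
def standard_scale_sql(columns, mean, std):
    return [f"({c} - {m}) / {s}" for c, m, s in zip(columns, mean, std)]

print(standard_scale_sql(["col1", "col2"], [0.4, 0.1], [0.5, 0.2]))
```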

.. hint:: This object can be pickled and used in any in-memory environment, just like `scikit-learn <https://scikit-learn.org/>`_ models.
"""

# Properties.
@@ -125,14 +184,74 @@ def __init__(self, mean: ArrayLike, std: ArrayLike) -> None:

class MinMaxScaler(Scaler):
"""
:py:class:`verticapy.machine_learning.memmodel.base.InMemoryModel` implementation of the MinMax scaler.

Parameters
----------
min\_: ArrayLike
Model's features minimums.
max\_: ArrayLike
Model's features maximums.

.. note:: Models in :py:mod:`verticapy.machine_learning.memmodel` are defined
entirely by their attributes. For example, the minimum and
maximum values of the features define a MinMaxScaler model.
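As a rough sketch of the arithmetic (not VerticaPy's implementation), min-max scaling maps each feature to the unit interval via ``(x - min) / (max - min)``:

```python
# Illustrative sketch (not VerticaPy's implementation) of min-max scaling:
# each feature is rescaled to [0, 1] using its minimum and maximum.
def min_max_scale(rows, minimums, maximums):
    return [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(row, minimums, maximums)]
        for row in rows
    ]

# Same example values as used below: minimums [0.4, 0.1], maximums [0.5, 0.2].
print(min_max_scale([[0.45, 0.17]], [0.4, 0.1], [0.5, 0.2]))
```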

Examples
--------

**Initialization**

Import the required module.

.. ipython:: python

from verticapy.machine_learning.memmodel.preprocessing import MinMaxScaler

A MinMaxScaler model is defined by minimum and maximum values. In this example, we will use the following:

.. ipython:: python

min_values = [0.4, 0.1]
max_values = [0.5, 0.2]

Let's create a :py:class:`verticapy.machine_learning.memmodel.preprocessing.MinMaxScaler` model.

.. ipython:: python

model_mms = MinMaxScaler(min_values, max_values)

Create a dataset.

.. ipython:: python

data = [[0.45, 0.17]]

**Making In-Memory Transformation**

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.MinMaxScaler.transform` method to transform the data.

.. ipython:: python

model_mms.transform(data)

**Deploy SQL Code**

Let's use the following column names:

.. ipython:: python

cnames = ['col1', 'col2']

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.MinMaxScaler.transform_sql`
method to get the SQL code needed to deploy the model using its attributes.

.. ipython:: python

model_mms.transform_sql(cnames)

.. hint:: This object can be pickled and used in any in-memory environment, just like `scikit-learn <https://scikit-learn.org/>`_ models.
"""

# Properties.
@@ -150,26 +269,87 @@ def __init__(self, min_: ArrayLike, max_: ArrayLike) -> None:

class OneHotEncoder(InMemoryModel):
"""
:py:class:`verticapy.machine_learning.memmodel.base.InMemoryModel` implementation of the one-hot encoder.

Parameters
----------

categories: ArrayLike
ArrayLike of the categories of the different features.
column_naming: str, optional
Appends categorical levels to column names according
to the specified method:
- indices: Uses integer indices to represent categorical levels.

- values/values_relaxed: Both methods use categorical level names.
If duplicate column names occur, the function attempts to
disambiguate them by appending _n, where n is a zero-based
integer index (_0, _1,…).

drop_first: bool, optional
If set to True, the first dummy of each category is
dropped.

.. note:: Models in :py:mod:`verticapy.machine_learning.memmodel` are defined
entirely by their attributes. For example, the categories to encode
define a OneHotEncoder model. You can optionally provide column_naming
criteria and a drop_first flag to denote whether to drop the first dummy of each category.
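A minimal sketch of the encoding itself (not VerticaPy's implementation): each value becomes one 0/1 indicator per known category, and ``drop_first`` omits the first indicator of each feature:

```python
# Illustrative sketch (not VerticaPy's implementation) of one-hot encoding:
# each value becomes a 0/1 indicator per known category; drop_first=True
# omits the first indicator of each feature.
def one_hot(rows, categories, drop_first=False):
    start = 1 if drop_first else 0
    encoded = []
    for row in rows:
        indicators = []
        for value, cats in zip(row, categories):
            indicators.extend(1 if value == c else 0 for c in cats[start:])
        encoded.append(indicators)
    return encoded

# Same example values as used below.
print(one_hot([["male", 1], ["female", 3]], [["male", "female"], [1, 2, 3]]))
```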

Examples
--------

**Initialization**

Import the required module.

.. ipython:: python

from verticapy.machine_learning.memmodel.preprocessing import OneHotEncoder

A OneHotEncoder model is defined by its categories, a column-naming criterion, and a drop_first flag. In this example, we will use the following:

.. ipython:: python

categories = [["male", "female"], [1, 2, 3]]
drop_first = False
column_naming = None

Let's create a :py:class:`verticapy.machine_learning.memmodel.preprocessing.OneHotEncoder` model.

.. ipython:: python

model_ohe = OneHotEncoder(categories, drop_first=drop_first, column_naming=column_naming)

Create a dataset.

.. ipython:: python

data = [["male", 1], ["female", 3]]

**Making In-Memory Transformation**

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.OneHotEncoder.transform` method to transform the data.

.. ipython:: python

model_ohe.transform(data)

**Deploy SQL Code**

Let's use the following column names:

.. ipython:: python

cnames = ['sex', 'pclass']

Use the :py:meth:`verticapy.machine_learning.memmodel.preprocessing.OneHotEncoder.transform_sql`
method to get the SQL code needed to deploy the model using its attributes.

.. ipython:: python

model_ohe.transform_sql(cnames)

.. hint:: This object can be pickled and used in any in-memory environment, just like `scikit-learn <https://scikit-learn.org/>`_ models.
"""

# Properties.