Commit

correction
oualib committed Oct 30, 2024
1 parent fa976a9 commit 295a9ed
Showing 17 changed files with 51 additions and 34 deletions.
4 changes: 2 additions & 2 deletions docs/source/examples_business_base_station.rst
Original file line number Diff line number Diff line change
@@ -4,8 +4,8 @@ Base Station Positions
==================================

This example uses the Telecom Dataset, provided by Shanghai Telecom, to predict the optimal positions for base radio stations.
This dataset contains more than 7.2 million records about people's
Internet access through 3,233 base stations from 9,481 mobile phones
This dataset contains more than ``7.2`` million records about people's
Internet access through ``3,233`` base stations from ``9,481`` mobile phones
over period of six months.

The dataset can be found `here <http://sguangwang.com/TelecomDataset.html>`_. It consists of:
2 changes: 1 addition & 1 deletion docs/source/examples_business_churn.rst
@@ -23,7 +23,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.
You can skip the below cell if you already have an established connection.

2 changes: 1 addition & 1 deletion docs/source/examples_business_insurance.rst
@@ -30,7 +30,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.
You can skip the below cell if you already have an established connection.

2 changes: 1 addition & 1 deletion docs/source/examples_business_spam.rst
@@ -21,7 +21,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.
You can skip the below cell if you already have an established connection.

2 changes: 1 addition & 1 deletion docs/source/examples_learn_iris.rst
@@ -24,7 +24,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.

You can skip the below cell if you already have an established connection.
2 changes: 1 addition & 1 deletion docs/source/examples_learn_pokemon.rst
@@ -38,7 +38,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.
You can skip the below cell if you already have an established connection.

8 changes: 4 additions & 4 deletions docs/source/examples_learn_titanic.rst
@@ -18,7 +18,7 @@ This example uses the following version of VerticaPy:
vp.__version__
Connect to Vertica. This example uses an existing connection called "VerticaDSN".
Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` .
For details on how to create a connection, see the :ref:`connection` tutorial.

You can skip the below cell if you already have an established connection.
@@ -69,9 +69,9 @@ Let's explore the data by displaying descriptive statistics of all the columns.
.. raw:: html
:file: SPHINX_DIRECTORY/figures/examples_titanic_table_describe.html

The columns "body" (passenger ID), "home.dest" (passenger origin/destination), "embarked" (origin port) and "ticket" (ticket ID) shouldn't influence survival, so we can ignore these.
The columns ``body`` (passenger ID), ``home.dest`` (passenger origin/destination), ``embarked`` (origin port) and ``ticket`` (ticket ID) shouldn't influence survival, so we can ignore these.

Let's focus our analysis on the columns "name" and "cabin". We'll begin with the passengers' names.
Let's focus our analysis on the columns ``name`` and ``cabin``. We'll begin with the passengers' names.

.. code-block:: python
@@ -217,7 +217,7 @@ The "sibsp" column represents the number of siblings for each passenger, while t
titanic["family_size"] = titanic["parch"] + titanic["sibsp"] + 1
Let's move on to outliers. We have several tools for locating outliers (:py:mod:`~verticapy.machine_learning.vertica.LocalOutlierFactor`, :py:mod:`~verticapy.machine_learning.vertica.DBSCAN`, :py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans`...), but we'll just use winsorization in this example. Again, "fare" has many outliers, so we'll start there.
Let's move on to outliers. We have several tools for locating outliers (:py:mod:`~verticapy.machine_learning.vertica.LocalOutlierFactor`, :py:mod:`~verticapy.machine_learning.vertica.cluster.DBSCAN`, :py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans`...), but we'll just use winsorization in this example. Again, "fare" has many outliers, so we'll start there.

.. code-block:: python
4 changes: 2 additions & 2 deletions docs/source/examples_understand_africa_education.rst
@@ -628,7 +628,7 @@ The same applies to the regions. Let's look at student age.

.. code-block:: python
africa["PAGE"].bar(
africa["PAGE"].barh(
method = "50%",
of = "pred_zmalocp",
max_cardinality = 50,
@@ -639,7 +639,7 @@ The same applies to the regions. Let's look at student age.
:okwarning:
:okexcept:
fig = africa["PAGE"].bar(
fig = africa["PAGE"].barh(
method = "50%",
of = "pred_zmalocp",
max_cardinality = 50,
21 changes: 20 additions & 1 deletion docs/source/performance_vertica.rst
@@ -44,6 +44,8 @@ Query Profiler
QueryProfiler.previous
QueryProfiler.step
QueryProfiler.to_html
QueryProfiler.get_activity_time
QueryProfiler.get_qplan_explain

Query Profiler Interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -84,6 +86,21 @@ Query Profiler Interface
QueryProfilerInterface.set_position
QueryProfilerInterface.step
QueryProfilerInterface.to_html
QueryProfilerInterface.client_data_test
QueryProfilerInterface.clock_exec_time_test
QueryProfilerInterface.exec_time_test
QueryProfilerInterface.get_activity_time
QueryProfilerInterface.get_qplan_explain
QueryProfilerInterface.get_qsteps_
QueryProfilerInterface.get_resource_acquisition
QueryProfilerInterface.import_profile
QueryProfilerInterface.pool_queue_wait_time_test
QueryProfilerInterface.qsteps_clicked
QueryProfilerInterface.query_events_test
QueryProfilerInterface.segmentation_test
QueryProfilerInterface.update_cpu_time
QueryProfilerInterface.update_qsteps
QueryProfilerInterface.update_step

Query Profiler Comparison
^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -102,4 +119,6 @@ Query Profiler Comparison
.. autosummary::
:toctree: api/

QueryProfilerComparison.get_qplan_tree
QueryProfilerComparison.get_qplan_tree
QueryProfilerComparison.sync_all_checkboxes
QueryProfilerComparison.unsync_all_checkboxes
2 changes: 1 addition & 1 deletion docs/source/user_guide_data_preparation_outliers.rst
@@ -140,7 +140,7 @@ Generally, you can identify global outliers with the ``Z-Score``. We'll consider
.. raw:: html
:file: SPHINX_DIRECTORY/figures/ug_dp_plot_outliers_5.html

Other techniques like :py:mod:`~verticapy.machine_learning.vertica.DBSCAN` or local outlier factor (``LOF``) can be to used to check other data points for outliers.
Other techniques like :py:mod:`~verticapy.machine_learning.vertica.cluster.DBSCAN` or local outlier factor (``LOF``) can be to used to check other data points for outliers.

.. code-block:: python
2 changes: 1 addition & 1 deletion docs/source/user_guide_full_stack_complex_data_vmap.rst
@@ -15,7 +15,7 @@ In order to work with complex data types in VerticaPy, you'll need to complete t
import verticapy as vp
- Connect to Vertica. This example uses an existing connection called "VerticaDSN". For details on how to create a connection, see the :ref:`connection` tutorial.
- Connect to Vertica. This example uses an existing connection called ``VerticaDSN`` . For details on how to create a connection, see the :ref:`connection` tutorial.

.. note:: You can skip the below cell if you already have an established connection.

4 changes: 2 additions & 2 deletions docs/source/user_guide_full_stack_linear_regression.rst
@@ -292,7 +292,7 @@ We can use a cross-validation to test our model.
.. raw:: html
:file: SPHINX_DIRECTORY/figures/ug_fs_table_lr_9.html

The model isn't bad. We're just using a few variables to get a median absolute error of 47; that is, our score has a distance of 47 from the true value. This seems high, but if we keep in mind that the final score is over 1000, our predictions are quite good.
The model isn't bad. We're just using a few variables to get a median absolute error of ``47``; that is, our score has a distance of ``47`` from the true value. This seems high, but if we keep in mind that the final score is over ``1000``, our predictions are quite good.

Let's compare the importance of our features.

@@ -388,7 +388,7 @@ We see a high heteroscedasticity, indicating that we can't trust the ``p-value``
model.coef_
Let's look at the model's analysis of variance (ANOVA) table.
Let's look at the model's analysis of variance (``ANOVA``) table.

.. code-block:: ipython
14 changes: 7 additions & 7 deletions docs/source/user_guide_full_stack_to_json.rst
@@ -1,8 +1,8 @@
.. _user_guide.full_stack.to_json:

=========================
Example: XGBoost.to_json
=========================
================
XGBoost.to_json
================

Connect to Vertica
--------------------
@@ -160,7 +160,7 @@ Evaluate the model with :py:func:`~verticapy.machine_learning.vertica.ensemble.X
.. raw:: html
:file: SPHINX_DIRECTORY/figures/ug_fs_to_json_report.html

Use to_json() to export the model to a JSON file. If you omit a filename, VerticaPy prints the model:
Use :py:func:`~verticapy.machine_learning.vertica.ensemble.XGBClassifier.to_json` to export the model to a JSON file. If you omit a filename, VerticaPy prints the model:

.. ipython:: python
@@ -194,7 +194,7 @@ This exported model can be used with the Python XGBoost API right away, and expo
result = result.sum() / len(result);
assert result == pytest.approx(0.0, abs = 1.0E-14)
For multiclass classifiers, the probabilities returned by the VerticaPy and the exported model may differ slightly because of normalization; while Vertica uses multinomial logistic regression, XGBoost Python uses Softmax. Again, this difference does not affect the model's final predictions. Categorical predictors must be encoded.
For multiclass classifiers, the probabilities returned by the VerticaPy and the exported model may differ slightly because of normalization; while Vertica uses multinomial logistic regression, ``XGBoost`` Python uses Softmax. Again, this difference does not affect the model's final predictions. Categorical predictors must be encoded.


Clean the Example Environment
@@ -211,8 +211,8 @@ Drop the ``xgb_to_json`` schema, using CASCADE to drop any database objects stor
Conclusion
-----------

VerticaPy lets you to create, train, evaluate, and export Vertica machine learning models. There are some notable nuances when importing a Vertica XGBoost model into Python XGBoost, but these do not affect the accuracy of the model or its predictions:
VerticaPy lets you to create, train, evaluate, and export Vertica machine learning models. There are some notable nuances when importing a Vertica ``XGBoost`` model into Python ``XGBoost``, but these do not affect the accuracy of the model or its predictions:

Some information computed during the training phase may not be stored (e.g. ``sum_hessian`` and ``loss_changes``).

The exact probabilities of multiclass classifiers in a Vertica model may differ from those in Python, but bot ``h`` will make the same predictions. Python XGBoost does not support categorical predictors, so you must encode them before training the model in VerticaPy.
The exact probabilities of multiclass classifiers in a Vertica model may differ from those in Python, but bot ``h`` will make the same predictions. Python ``XGBoost`` does not support categorical predictors, so you must encode them before training the model in VerticaPy.
2 changes: 1 addition & 1 deletion docs/source/user_guide_machine_learning_clustering.rst
@@ -57,7 +57,7 @@ While there aren't any real metrics for evaluating unsupervised models, metrics
print(model.get_vertica_attributes("metrics")["metrics"][0])
You can add the prediction to your vDataFrame.
You can add the prediction to your :py:mod:`~verticapy.vDataFrame`.

.. code-block::
2 changes: 1 addition & 1 deletion docs/source/user_guide_machine_learning_introduction.rst
@@ -100,7 +100,7 @@ When we have more than two categories, we use the expression ``Multiclass Classi
Unsupervised Learning
----------------------

These algorithms are to used to segment the data (:py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans`, :py:mod:`~verticapy.machine_learning.vertica.DBSCAN`, etc.) or to detect anomalies (:py:mod:`~verticapy.machine_learning.vertica.LocalOutlierFactor`, ``Z-Score`` Techniques...). In particular, they're useful for finding patterns in data without labels. For example, let's use a :py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans` algorithm to create different clusters on the Iris dataset. Each cluster will represent a flower's species.
These algorithms are to used to segment the data (:py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans`, :py:mod:`~verticapy.machine_learning.vertica.cluster.DBSCAN`, etc.) or to detect anomalies (:py:mod:`~verticapy.machine_learning.vertica.LocalOutlierFactor`, ``Z-Score`` Techniques...). In particular, they're useful for finding patterns in data without labels. For example, let's use a :py:mod:`~verticapy.machine_learning.vertica.cluster.KMeans` algorithm to create different clusters on the Iris dataset. Each cluster will represent a flower's species.

.. code-block:: python
8 changes: 3 additions & 5 deletions docs/source/user_guide_machine_learning_model_tracking.rst
@@ -103,16 +103,14 @@ So far we have only added three models to the experiment, but we could add many
top_model = my_experiment_1.load_best_model(metric = "auc")
The experiment object facilitates not only model tracking but also makes cleanup super easy, especially in real-world
scenarios where there is often a large number of leftover models. The :py:func:`~verticapy.machine_learning.vertica.LogisticRegression.drop` method drops from the database the info of the experiment and all associated models other than those specified in the keeping_models list.
The experiment object facilitates not only model tracking but also makes cleanup super easy, especially in real-world scenarios where there is often a large number of leftover models. The :py:func:`~verticapy.machine_learning.vertica.LogisticRegression.drop` method drops from the database the info of the experiment and all associated models other than those specified in the keeping_models list.

.. ipython:: python
:okwarning:
my_experiment_1.drop(keeping_models=[top_model.model_name])
my_experiment_1.drop(keeping_models = [top_model.model_name])
Experiments are also helpful for performing grid search on hyper-parameters. The following example shows how they can
be used to study the impact of the max_iter parameter on the prediction performance of :py:mod:`~verticapy.machine_learning.vertica.linear_model.LogisticRegression` models.
Experiments are also helpful for performing grid search on hyper-parameters. The following example shows how they can be used to study the impact of the ``max_iter`` parameter on the prediction performance of :py:mod:`~verticapy.machine_learning.vertica.linear_model.LogisticRegression` models.

.. ipython:: python
:suppress:
4 changes: 2 additions & 2 deletions docs/source/user_guide_performance_qprof.rst
@@ -210,7 +210,7 @@ Once the :py:mod:`~verticapy.performance.vertica.qprof.QueryProfiler` object is
.. raw:: html
:file: SPHINX_DIRECTORY/figures/user_guides_performance_qprof_get_queries.html

To visualize the query plan, run :py:func:`verticapy.QueryProfilerInterface.get_qplan_tree`,
To visualize the query plan, run :py:func:`~verticapy.performance.vertica.qprof.QueryProfilerInterface.get_qplan_tree`,
which is customizable, allowing you to specify certain metrics or focus on a specified tree path:

.. image:: /_static/website/user_guides/performance/user_guide_performance_qprof_get_qplan_tree.png
@@ -277,7 +277,7 @@ You can export and import :py:mod:`~verticapy.performance.vertica.qprof.QueryPro
Export
+++++++

To export a :py:mod:`~verticapy.performance.vertica.qprof.QueryProfiler` object, use the :py:func:`~verticapy.performance.vertica.QueryProfiler.export_profile` method:
To export a :py:mod:`~verticapy.performance.vertica.qprof.QueryProfiler` object, use the :py:func:`~verticapy.performance.vertica.qprof.QueryProfiler.export_profile` method:

.. code-block:: python