add numba version of sma #229

Open: wants to merge 3 commits into base: master
4 changes: 3 additions & 1 deletion README.md
@@ -9,7 +9,7 @@

# Technical Analysis Library in Python

-It is a Technical Analysis library useful to do feature engineering from financial time series datasets (Open, Close, High, Low, Volume). It is built on Pandas and Numpy.
+It is a Technical Analysis library useful to do feature engineering from financial time series datasets (Open, Close, High, Low, Volume). It is built on Pandas, Numpy and Numba.

![Bollinger Bands graph example](static/figure.png)

@@ -176,6 +176,8 @@ Thank you to [OpenSistemas](https://opensistemas.com)! It is because of your con

* https://en.wikipedia.org/wiki/Technical_analysis
* https://pandas.pydata.org
+* https://numpy.org/
+* https://numba.pydata.org
* https://github.com/FreddieWitherden/ta
* https://github.com/femtotrader/pandas_talib

6 changes: 4 additions & 2 deletions requirements-core.txt
@@ -1,2 +1,4 @@
-numpy==1.17.4
-pandas==0.25.3
+llvmlite==0.33.0; python_version >= "3.6"
+numba==0.50.1; python_version >= "3.6"
+numpy==1.17.4; python_version >= "3.6" and python_full_version >= "3.5.3"
+pandas==0.25.3; python_full_version >= "3.5.3"
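
The `; python_version >= "3.6"` suffixes on the new pins are PEP 508 environment markers. As a rough illustration of what they test, the sketch below derives the two marker variables from the running interpreter and compares them numerically. The helper names (`marker_environment`, `version_key`, `numba_pin_applies`) are hypothetical, and the comparison is a simplification: pip performs the real evaluation with full version-comparison rules.

```python
import platform
from itertools import takewhile


def marker_environment():
    """Return the two marker variables used by the pins, for this interpreter."""
    return {
        # major.minor only, e.g. "3.8"
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        # full version string, e.g. "3.8.10"
        "python_full_version": platform.python_version(),
    }


def version_key(text):
    """Simplified comparison key: the leading integer of each dotted component."""
    key = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        key.append(int(digits))
    return tuple(key)


def numba_pin_applies(env=None):
    """Approximate the marker python_version >= "3.6" (hypothetical helper)."""
    env = env or marker_environment()
    return version_key(env["python_version"]) >= version_key("3.6")


print(marker_environment(), numba_pin_applies())
```

Comparing parsed components rather than raw strings matters here: as plain strings, `"3.10"` sorts before `"3.6"`.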
95 changes: 92 additions & 3 deletions ta/utils.py
@@ -9,6 +9,7 @@

import numpy as np
import pandas as pd
+from numba import guvectorize


class IndicatorMixin:
@@ -56,9 +57,97 @@ def dropna(df: pd.DataFrame) -> pd.DataFrame:
    return df


-def _sma(series, periods: int, fillna: bool = False):
-    min_periods = 0 if fillna else periods
-    return series.rolling(window=periods, min_periods=min_periods).mean()
+@guvectorize(
+    ["void(float64[:], intp[:], float64[:])"],
+    "(n),()->(n)",
+    nopython=True,
+    target="cpu",
+)
+def numba_sma(arr, window_arr, out):
+    """Calculate the simple moving average, using guvectorize() to create a NumPy ufunc.

+    https://numba.readthedocs.io/en/stable/user/vectorize.html?highlight=guvectorize#the-guvectorize-decorator

+    https://numba.readthedocs.io/en/stable/user/examples.html?highlight=moving%20average#moving-average

+    Modified from the Numba example so that the outputs before the first full window are filled with np.nan.

+    Args:
+        arr (np.array): NumPy array.
+        window_arr (np.array): NumPy array or simply an int.
+    Returns:
+        guvectorize() functions don’t return their result value: they take it as an array argument, which must be
+        filled in by the function. This is because the array is actually allocated by NumPy’s dispatch mechanism,
+        which calls into the Numba-generated code. (From the Numba docs)
+    """
+    window_width = window_arr[0]
+    asum = 0.0
+    count = 0
+    for i in range(window_width):
+        asum += arr[i]
+        count += 1
+        out[i] = np.nan
+    out[window_width - 1] = asum / count
+    for i in range(window_width, len(arr)):
+        asum += arr[i] - arr[i - window_width]
+        out[i] = asum / count
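
For readers without Numba at hand, the running-sum technique used by `numba_sma` can be sketched in plain Python. This is not the PR's code: `sma_running_sum` is a hypothetical name, it works on lists rather than NumPy arrays, and the guard for a window longer than the input is an assumption added for the sketch (the diff indexes `arr[i]` for every `i` below `window_width` without such a check).

```python
import math


def sma_running_sum(values, window):
    """Simple moving average via a running sum; NaN until the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = [math.nan] * len(values)
    if window > len(values):
        # Assumed behaviour for this sketch: no full window, so all NaN.
        return out
    running = 0.0
    for i in range(window):
        running += values[i]
    out[window - 1] = running / window
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest one.
        running += values[i] - values[i - window]
        out[i] = running / window
    return out


result = sma_running_sum([1.0, 2.0, 3.0, 4.0, 5.0], 3)
print(result)  # first two entries are NaN, then 2.0, 3.0, 4.0
```

Each output costs one addition and one subtraction regardless of the window size, which is what makes the loop cheap to compile.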


+@guvectorize(
+    ["void(float64[:], intp[:], float64[:])"],
+    "(n),()->(n)",
+    nopython=True,
+    target="cpu",
+)
+def numba_sma_fillna(arr, window_arr, out):
+    """Calculate the simple moving average and fill the leading NaN values, using guvectorize() to create a NumPy ufunc.

+    https://numba.readthedocs.io/en/stable/user/vectorize.html?highlight=guvectorize#the-guvectorize-decorator

+    https://numba.readthedocs.io/en/stable/user/examples.html?highlight=moving%20average#moving-average

+    Slightly modified from the Numba moving-average example.

+    Args:
+        arr (np.array): NumPy array.
+        window_arr (np.array): NumPy array or simply an int.
+    Returns:
+        guvectorize() functions don’t return their result value: they take it as an array argument, which must be
+        filled in by the function. This is because the array is actually allocated by NumPy’s dispatch mechanism,
+        which calls into the Numba-generated code. (From the Numba docs)
+    """
+    window_width = window_arr[0]
+    asum = 0.0
+    count = 0
+    for i in range(window_width):
+        asum += arr[i]
+        count += 1
+        out[i] = asum / count
+    for i in range(window_width, len(arr)):
+        asum += arr[i] - arr[i - window_width]
+        out[i] = asum / count
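
The fillna variant differs only in the warm-up: before the window is full it emits the mean of the values seen so far instead of NaN. A plain-Python sketch of that behaviour follows. The name `sma_fill_warmup` and the single-loop shape are choices made for this illustration, not taken from the PR.

```python
def sma_fill_warmup(values, window):
    """Moving average whose first window - 1 outputs are expanding means."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # The window is full: drop the value that just fell out of it.
            running -= values[i - window]
        # Divide by the number of values currently inside the window.
        out.append(running / min(i + 1, window))
    return out


result = sma_fill_warmup([1.0, 2.0, 3.0, 4.0, 5.0], 3)
print(result)  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

Because the loop never indexes beyond the input, a window longer than the series simply yields expanding means throughout.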


+def _sma(series: pd.Series, periods: int, fillna: bool = False) -> pd.Series:
+    # the @guvectorize decorator does not work with pylint
+    # pylint: disable=locally-disabled, useless-suppression, no-value-for-parameter
+    """Helper function that applies the fillna option by dispatching to one of the two Numba guvectorized functions.

+    Args:
+        series (pd.Series): Pandas Series.
+        periods (int): Window for the simple moving average.
+        fillna (bool): If True, fill nan values (default is False).

+    Returns:
+        pandas.Series: New feature generated.
+    """
+    series_np_arr = series.to_numpy()
+    sma_np_arr = (
+        numba_sma_fillna(series_np_arr, periods)
+        if fillna
+        else numba_sma(series_np_arr, periods)
+    )
+    return pd.Series(sma_np_arr, index=series.index)
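
Since the new `_sma` swaps `series.rolling(...).mean()` for a running-sum kernel, one behaviour that may be worth checking is how a NaN inside the input flows through. The plain-Python sketch below (hypothetical helpers, lists instead of arrays) contrasts a running sum with recomputing every window from scratch. It relies only on the IEEE rule that NaN plus or minus anything is NaN. Whether the compiled kernels in the PR behave the same way would need to be confirmed against the actual code.

```python
import math


def sma_running(values, window):
    """Running-sum moving average (assumes window <= len(values))."""
    out = [math.nan] * len(values)
    running = sum(values[:window])
    out[window - 1] = running / window
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out[i] = running / window
    return out


def sma_recompute(values, window):
    """Moving average that sums each window independently."""
    out = [math.nan] * len(values)
    for i in range(window - 1, len(values)):
        out[i] = sum(values[i - window + 1 : i + 1]) / window
    return out


data = [1.0, math.nan, 3.0, 4.0, 5.0, 6.0]
running = sma_running(data, 2)
recomputed = sma_recompute(data, 2)
print(running)     # NaN from index 1 onwards: the running sum never recovers
print(recomputed)  # NaN only while the NaN is inside the window
```

In this sketch the running-sum version stays NaN after a single missing value, while the per-window version recovers as soon as the NaN leaves the window, so a test with a gap in the series would show whether the replacement matches the previous pandas output.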


def _ema(series, periods, fillna=False):