KCA assignment pull request (gsmadi) #4

Open
wants to merge 1 commit into
base: master
from


gsmadi

@gsmadi gsmadi commented Oct 23, 2021

No description provided.

@PanPip PanPip self-requested a review October 25, 2021 11:54
Contributor

@PanPip PanPip left a comment

Hi Gabriel, 🙂

This is a good submission. I can see that you went the extra step of adding unit tests for your code and using a linter. The strategy you chose is simple. Evaluating it with a hit-or-miss ratio is an unusual, data-science-inspired approach.

Will set up an interview.

  • Good to see a "Getting started" part and linting used.
  • Function/class structure is okay but could be further improved (commenting, using user-input values, avoiding loops).
  • Would love to see more analysis of the KCA and alternative strategy ideas.
  • Would be interesting to discuss the topic of data insufficiency here.
  • With extra time put in, the strategy can be packaged in a function/class structure and later assessed based on the generated equity curve.
  • Extra points for adding unit tests.
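
As a rough illustration of the equity-curve assessment mentioned above, the sketch below compounds the daily returns earned by a +1/-1/0 signal. The function name `equity_curve`, its arguments, and the sample numbers are hypothetical and not taken from the submission; transaction costs and position sizing are ignored.

```python
import numpy as np


def equity_curve(prices, signals, initial_equity=1.0):
    """Compound the returns of holding signals[t] (+1/-1/0) from day t to day t+1."""
    prices = np.asarray(prices, dtype=float)
    signals = np.asarray(signals, dtype=float)
    returns = prices[1:] / prices[:-1] - 1.0    # simple daily returns
    strategy_returns = signals[:-1] * returns   # signal at t earns the t -> t+1 return
    return initial_equity * np.cumprod(1.0 + strategy_returns)


# Made-up prices and signals, for illustration only.
prices = [100.0, 110.0, 99.0, 108.9]
signals = [1, -1, 1, 0]  # the last signal is never traded
curve = equity_curve(prices, signals)
```

A curve like this can then be summarized with the usual statistics (total return, drawdown, and so on) rather than a hit rate alone.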

gsmadi/README.md

## KCA Trading Algorithm Design

We fit our KCA trading algorithm with 360 days' worth of data. We select a year's worth of data because that's the resolution we have, and it captures four quarters' worth of price movements.
Contributor

About 252 days of data can be used here, since that is roughly the number of trading days in a year.
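
A minimal sketch of that suggestion, assuming the prices are held in a NumPy array ordered oldest to newest. The names `TRADING_DAYS_PER_YEAR` and `fit_window` are hypothetical; with a pandas DataFrame, `price_df.tail(252)` would select the same rows.

```python
import numpy as np

TRADING_DAYS_PER_YEAR = 252  # rough convention, not an exact count


def fit_window(prices: np.ndarray, n_days: int = TRADING_DAYS_PER_YEAR) -> np.ndarray:
    """Return the most recent n_days observations (all of them if fewer exist)."""
    return prices[-n_days:]


# Synthetic daily closes, for illustration only.
prices = np.linspace(100.0, 140.0, 400)
window = fit_window(prices)
```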

gsmadi/src/plotting.py
gsmadi/src/trading.py
slot of the tuple and a 1 or -1 on the second slot to indicate
a buy and sell signal, respectively.
"""
randq = random.randrange(1000, 5000) # Generate random seed for KCA q seed
Contributor

This parameter can be user-input, or be calculated based on price_df, as it depends on the range of time series values, right?

Author

I think it was just a scalar to multiply against the time series. Agreed this could potentially be user input.

`Q = q * np.eye(A.shape[0])`. Comment from the paper on `q`: scalar that multiplies the seed states covariance.
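
One way to read this exchange: rather than drawing `q` at random, expose it as an argument and fall back to a value derived from the series. The sketch below does that; the helper name `seed_state_covariance`, the example matrix `A`, and in particular the default (variance of day-to-day price changes) are illustrative assumptions, not the paper's or the author's choice.

```python
import numpy as np


def seed_state_covariance(A, prices, q=None):
    """Build Q = q * I; if q is not given, derive it from the price series."""
    if q is None:
        # Assumption: scale q with the variance of day-to-day price changes.
        q = float(np.var(np.diff(prices)))
    return q * np.eye(A.shape[0])


# Hypothetical 3-state transition matrix (position, velocity, acceleration).
A = np.array([[1.0, 1.0, 0.5],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
prices = np.array([100.0, 101.0, 103.0, 102.0, 105.0])

Q_user = seed_state_covariance(A, prices, q=0.001)  # user-supplied q
Q_data = seed_state_covariance(A, prices)           # data-derived q
```

Either way the result is reproducible, which a random draw from `random.randrange` is not unless the seed is fixed.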

Comment on lines +311 to +312
"\n",
"The second anomaly can be seen a bit after year 2018. For now, we lack an explanation for such a deviation."
Contributor

It would be interesting to explore what caused this performance in 2018.

gsmadi/KCA.ipynb
"source": [
"Now that we have produced predictions and we know the actual values from our test set samples, lets see how well KCA fares. To see how well KCA performs we essentially see if it was right in direction in regard to the price movement and by how much.\n",
"\n",
"In the `prediction_delta` column we take the difference from the actual to the predicted value. Using the sign of this value and of the decision we create the `outcome` column. In this column, if the direction produced by KCA was correct we set a `1`, else a `0` for wrong.\n",
Contributor

Using a hit-to-miss ratio is quite an unusual way to test a strategy's performance, but it does, in general, tell us whether the prediction rate is good. It would also be nice to see some adjustments to the simple strategy to see what performance can be obtained. Maybe base it on velocity and/or acceleration?

Author

Yes, agreed on both the ratio being unusual and trying out other strategies.

For hit and miss, I think I just wanted a simple way to convey the prediction rate, as vetting the strategy with actual trading introduces other variables (perhaps position sizing, unwinding held positions, etc.).

In terms of other strategies, I think an interesting one would be to see if we can use velocity or acceleration as perhaps leading indicators to position spikes.
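
For reference, the prediction rate discussed in this thread can be sketched as the fraction of days on which the signal's sign matches the realized price move. This is a simplified stand-in, not the notebook's exact `prediction_delta`/`outcome` logic; the function name `hit_rate` and the sample values are hypothetical.

```python
import numpy as np


def hit_rate(decisions, realized_changes):
    """Fraction of days on which the decision's sign matches the realized move's sign."""
    decisions = np.asarray(decisions, dtype=float)
    realized_changes = np.asarray(realized_changes, dtype=float)
    hits = np.sign(decisions) == np.sign(realized_changes)
    return float(hits.mean())


# Made-up signals (+1 buy, -1 sell) and next-day price changes.
decisions = [1, -1, 1, 1, -1]
realized_changes = [0.5, 0.2, -0.1, 0.3, -0.4]
rate = hit_rate(decisions, realized_changes)
```

A velocity- or acceleration-based variant would only change how `decisions` is produced; the scoring stays the same.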

"\n",
"In the `prediction_delta` column we take the difference from the actual to the predicted value. Using the sign of this value and of the decision we create the `outcome` column. In this column, if the direction produced by KCA was correct we set a `1`, else a `0` for wrong.\n",
"\n",
"Computing a hit-to-miss ratio below, we see its not the greatest. Essentially, predicting 10 days worth of price movements, it only got 1 right. Now, lets highlight how little data we have to make conclusions on this."
Contributor

"Now, lets highlight how little data we have to make conclusions on this."

How much data do you think would be sufficient to make conclusions regarding the performance of such a model?

Author

I believe we need to perform at least 10 experiments (for the same price path) and generate north of 100 predictions per experiment (~1000 predictions in total). That should perhaps suffice to obtain a sample mean and standard deviation for, say, the prediction rate (hit-miss).
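
To put rough numbers on this, the sketch below computes the standard error of a hit rate treated as a binomial proportion. It assumes the predictions are independent, which is a strong assumption for consecutive days of a time series, so the figures are indicative only. The function name is hypothetical.

```python
import math


def hit_rate_standard_error(p, n):
    """Standard error of a proportion p estimated from n independent predictions."""
    return math.sqrt(p * (1.0 - p) / n)


se_10 = hit_rate_standard_error(0.1, 10)      # 1 hit out of 10, as in the notebook
se_1000 = hit_rate_standard_error(0.1, 1000)  # same rate over ~1000 predictions
```

Under that assumption, the uncertainty with 10 predictions is about as large as the estimated rate itself, and going to roughly 1000 predictions shrinks it by a factor of 10.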

gsmadi/KCA.ipynb