How to not search feature correlation with all y target? #25
Comments
Hi @wanga10000, I am not sure I completely follow your example. It looks like the "activated" points are the third and seventh days, which are a win and a loss (1, -1). But you said,
I think I would have tried the same thing as you, with your target y being as you described. However, as I think you pointed out, there are indicators that are highly correlated by virtue of being close to 0, which describes most of your data points. So I assume what you are looking for is a way to apply a mask to the 0s after the indicators have been calculated and only use dcor on the -1 and 1 values? That does sound like a good feature, if I understand correctly. However, I am very busy on another project at the moment, so I am not sure I can get to it soon. Perhaps you could describe how you would architect the solution. I am thinking that you may want to include a fit parameter such as "mask", the same size as y, that could be used to filter the observations prior to dcor. I am also happy to merge a PR if you would like to implement it yourself.
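To make the proposal concrete, here is a minimal sketch of filtering observations before computing the correlation. It assumes "dcor" means sample distance correlation and implements it directly in NumPy; it is not this project's code. The function names and the `mask` convention (nonzero means "drop this observation", so an all-zero mask changes nothing) are assumptions for illustration, and the two toy features are made up to match the 10-day example from the original post.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (plain NumPy sketch)."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    # Pairwise absolute distances within each variable.
    a = np.abs(x - x.T)
    b = np.abs(y - y.T)
    # Double-center each distance matrix.
    A = a - a.mean(axis=0, keepdims=True) - a.mean(axis=1, keepdims=True) + a.mean()
    B = b - b.mean(axis=0, keepdims=True) - b.mean(axis=1, keepdims=True) + b.mean()
    dcov2_xy = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        # One of the variables is constant on this sample: no measurable dependence.
        return 0.0
    return float(np.sqrt(max(dcov2_xy, 0.0) / denom))


def masked_distance_correlation(x, y, mask=None):
    """Distance correlation on the observations that are NOT masked out.

    Hypothetical convention: nonzero mask entries are dropped, and the
    default (None, treated as all zeros) keeps every observation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if mask is None:
        keep = np.ones(len(y), dtype=bool)
    else:
        keep = np.asarray(mask) == 0
    return distance_correlation(x[keep], y[keep])


# Target from the original post: 1 = win, -1 = loss, 0 = strategy not activated.
y = np.array([0, 0, 1, 0, 0, 0, -1, 0, 0, 0])
mask = (y == 0).astype(int)  # mask out the non-activated days

# Made-up features: one only fires when a trade happens, one tracks the outcome.
activation_feature = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0])
outcome_feature = np.array([0, 0, 1, 0, 0, 0, -1, 0, 0, 0])

unmasked_activation = masked_distance_correlation(activation_feature, y)
masked_activation = masked_distance_correlation(activation_feature, y, mask)
masked_outcome = masked_distance_correlation(outcome_feature, y, mask)
```

Without the mask, the activation-only feature scores well even though it says nothing about win versus loss. With the mask, it is constant on the remaining rows and scores 0, while the outcome feature keeps its score. Two activated points is far too few for a meaningful estimate in practice; the toy data only illustrates the mechanism.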
Yes, exactly. With that in place, I think this tool would be more practical for algo trading, which would be really good.
That sounds feasible. And you could make the mask input default to all 0 so it wouldn't affect the original usage. Happy to see you think this is not a bad idea. I'll look forward to this feature coming online :)
Hi,
First of all, thanks for developing this tool. It's an excellent tool for feature selection.
Not only for the algorithm but also for the integration and processing of all indicators.
Here's my situation.
I have a strategy, and I want to search for features that correlate with a win or a loss.
That is, only a few points in my y target are "activated", instead of using an n-point return or some other target.
Therefore I tried to build the y target as follows:
Assume there are 10 days of OHLCV data, and the strategy is activated on the third and seventh days.
y = {0,0,1,0,0,0,-1,0,0,0} where 1 stands for win and -1 stands for loss.
It's probably not a reasonable way to do this,
because the tool prints 5-6 features whose correlation to the target is over 0.9.
And I realized that the features the tool found only correlate with the "activated" points rather than with win or loss.
So I think it would be good if the algorithm could search only the points that are "activated" and mask the other points.
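As a rough illustration of the ordering this implies, the sketch below computes an indicator on the full price series first and only then keeps the activated rows. The closing prices are made up, and a 3-day simple moving average stands in for whatever indicators the tool generates; none of this is the tool's own API.

```python
import pandas as pd

# Made-up closing prices for the 10-day example.
close = pd.Series([10.0, 10.5, 11.0, 10.8, 10.2, 10.0, 9.5, 9.8, 10.1, 10.4])
y = pd.Series([0, 0, 1, 0, 0, 0, -1, 0, 0, 0])

# Compute the indicator on the FULL series so the rolling window sees every day.
sma3 = close.rolling(window=3).mean()

# Only afterwards drop the days on which the strategy was not activated.
activated = y != 0
feature_on_trades = sma3[activated]
target_on_trades = y[activated]
```

Filtering the raw prices before computing the indicator would instead make the rolling window span non-adjacent days, which is why the mask has to be applied after the indicators are calculated.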
Do you have any suggestions for implementing this kind of usage? Thanks!