Question on forecasting method #6

Closed
jermaine1ronquillo opened this issue Sep 7, 2018 · 12 comments

Comments

@jermaine1ronquillo

I'm new to fuzzy logic and I'd like to know why the predict method requires the test set of the data.
Another question: does the method predict only one step ahead?

Regards

@petroniocandido
Collaborator

It is not exactly the "test" data, but the lags needed to forecast. With a first-order model (such as pyFTS.models.chen.ConventionalFTS), all you need to forecast t+1 is the last lag, i.e., a list with the last value of the time series. If you are using a high-order model (pyFTS.models.hofts.HighOrderFTS), you will need more past lags.

Not all methods are designed for multi-step-ahead forecasting, but the FTS.predict method has a parameter, steps_ahead, where you can indicate the forecasting horizon you need. In this case the method feeds its own outputs back in as inputs for each step ahead.
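
To make the feedback idea concrete, here is a minimal sketch in plain Python. It does not use pyFTS: `mean_of_lags` is a hypothetical stand-in for any one-step model, and `forecast_ahead` is an illustrative helper, not a library function. It only shows that the input needs at least `order` past values and that each forecast becomes the newest lag for the next step.

```python
from typing import Callable, List, Sequence


def mean_of_lags(lags: Sequence[float]) -> float:
    """Hypothetical one-step model: forecast t+1 as the mean of the given lags."""
    return sum(lags) / len(lags)


def forecast_ahead(
    one_step: Callable[[Sequence[float]], float],
    history: Sequence[float],
    order: int,
    steps: int,
) -> List[float]:
    """Forecast `steps` values by feeding each forecast back in as the newest lag."""
    if len(history) < order:
        raise ValueError(f"need at least {order} past values, got {len(history)}")
    window = list(history[-order:])  # only the last `order` lags are used
    forecasts = []
    for _ in range(steps):
        y_hat = one_step(window)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]  # drop the oldest lag, append the forecast
    return forecasts


series = [2.0, 4.0, 6.0, 8.0]
first_order = forecast_ahead(mean_of_lags, series, order=1, steps=3)
second_order = forecast_ahead(mean_of_lags, series, order=2, steps=3)
print(first_order)   # uses only the last value, 8.0
print(second_order)  # uses the last two values, 6.0 and 8.0
```

Because later steps are built on earlier forecasts rather than on observed values, errors can accumulate as the horizon grows.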

Don't hesitate to contact us if you have more questions!

Best regards.

@jermaine1ronquillo
Author

Thank you for the response, although I'm still a bit confused. I will try pyFTS to forecast water quality parameters. Our objectives are to forecast at least 3 days ahead and to predict whether the parameters will exceed a certain threshold. I'll definitely ask more questions about using pyFTS. Again, thank you.

@petroniocandido
Collaborator

Hi!

Check this out: https://colab.research.google.com/drive/1yeaYrgasByD12JI-nEIlE_buQft3YR3I

The minimum input length for the predict method is the order of the model! To forecast multiple steps ahead, you just need to pass the steps_ahead parameter, indicating how many steps to forecast.

About the water quality time series: is it seasonal? Univariate or multivariate? What does its ACF look like?
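
If it helps, the sample ACF can be computed by hand to see what a seasonal pattern looks like. The sketch below is plain Python on a synthetic series with a period of 7 samples (made-up data, not the water quality series); `sample_acf` is an illustrative helper, and in practice a library routine such as the one in statsmodels would normally be used.

```python
import math
from typing import List, Sequence


def sample_acf(x: Sequence[float], max_lag: int) -> List[float]:
    """Sample autocorrelation r_k for k = 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Synthetic series with a period of 7 samples, e.g. a weekly cycle in daily data.
series = [math.sin(2 * math.pi * t / 7) for t in range(70)]
acf = sample_acf(series, max_lag=10)

# The largest autocorrelation at a non-zero lag points at the seasonal period.
peak_lag = max(range(1, 11), key=lambda k: acf[k])
print(peak_lag)
```

A clear peak at a non-zero lag suggests a seasonal period, which in turn hints at which lags a model should look at.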

@jermaine1ronquillo
Author

Thank you for this. Based on my initial exploration of the data (using statsmodels), the time series shows slight seasonality.
The graphs are below:
image
image

@petroniocandido
Collaborator

Is it a public dataset?

@jermaine1ronquillo
Author

Unfortunately it is not; it was recorded at our treatment facility. Are you interested in the data? Please give me your email address and I may be able to get approval from my superiors.

Regards

@jermaine1ronquillo
Author

Thank you for the Colab link. I used it as a reference to model my data (one-step forecasting). The result is interesting; see the graph below.
image

However, I tried changing one extreme event to 10 just to check how the model predicts, and this is the result:
image

I then modified the data again, and below is the result:
image

My question is: does the model really perform this well, or am I doing something wrong?
How does the model predict my data almost exactly? (Again, I'm new to this.)
Thanks in advance!

@petroniocandido
Collaborator

petroniocandido commented Sep 12, 2018

Can you share your code for verification? I'm working on a pyFTS tutorial for solar forecasting (text in Portuguese) and the results are very good, with around 5% error (MAPE): https://colab.research.google.com/drive/1xfonrM853rtWTsVet7oJsFO-OoHWRgk6

The quality of an FTS model depends on several factors:
a) Data quality (in general, FTS models are very sensitive to outliers);
b) Method (different methods for different demands);
c) Transformations (pre- and post-processing operations);
d) Partitioning (too few partitions will underfit the model, too many partitions will overfit it);
e) Order (the minimum number of lags used by the model);
f) Lag indexes (which past lags produce better generalization);
g) alpha_cut (the minimum membership grade considered in the fuzzification step; it helps reduce overfitting by cutting useless rules).
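
To illustrate items d) and g), here is a conceptual sketch of grid partitioning and an alpha cut in plain Python. It is not pyFTS's implementation: `grid_partition`, `membership`, and `fuzzify` are hypothetical helpers, and the evenly spaced triangular sets are an assumption made for the example.

```python
from typing import Dict, List, Tuple

Triangle = Tuple[float, float, float]  # (left foot, peak, right foot)


def grid_partition(lo: float, hi: float, npart: int) -> List[Triangle]:
    """Split [lo, hi] into `npart` evenly spaced, overlapping triangular fuzzy sets."""
    if npart < 2:
        raise ValueError("npart must be at least 2")
    step = (hi - lo) / (npart - 1)
    return [
        (lo + (i - 1) * step, lo + i * step, lo + (i + 1) * step)
        for i in range(npart)
    ]


def membership(x: float, tri: Triangle) -> float:
    """Membership grade of x in a triangular fuzzy set."""
    left, peak, right = tri
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(x: float, sets: List[Triangle], alpha_cut: float = 0.0) -> Dict[int, float]:
    """Return {set index: grade} for sets whose grade is positive and >= alpha_cut."""
    grades = {i: membership(x, tri) for i, tri in enumerate(sets)}
    return {i: mu for i, mu in grades.items() if mu > 0.0 and mu >= alpha_cut}


sets = grid_partition(0.0, 10.0, npart=5)   # peaks at 0, 2.5, 5, 7.5, 10
loose = fuzzify(3.0, sets)                  # both neighbouring sets fire
strict = fuzzify(3.0, sets, alpha_cut=0.5)  # the weak membership is discarded
print(loose)
print(strict)
```

With more partitions the triangles get narrower, so each set covers fewer training points; with a higher alpha_cut, weak memberships stop contributing to the rules.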

The default values of the FTS methods generally fit the data well, but depending on your application domain it is necessary to fine-tune the parameters. This hyperparameter optimization can be performed using a genetic algorithm (I like to use the DEAP library for evolutionary optimization: https://github.com/DEAP/deap) or a dedicated hyperparameter optimization library such as hyperopt (https://github.com/hyperopt/hyperopt).
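
As a rough illustration of the tuning loop, the sketch below scores a few partition counts by validation MAPE and keeps the best one. Everything in it is an assumption made for the example: it uses an exhaustive search rather than a genetic algorithm or hyperopt, a toy first-order rule model with crisp nearest-center assignment rather than pyFTS, and a synthetic series. With DEAP or hyperopt the scoring function would stay the same and only the search strategy would change.

```python
import math
from typing import Dict, List, Sequence, Set


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def nearest(x: float, centers: Sequence[float]) -> int:
    """Index of the partition whose center is closest to x (crisp stand-in for fuzzification)."""
    return min(range(len(centers)), key=lambda i: abs(x - centers[i]))


def fit_rules(train: Sequence[float], centers: Sequence[float]) -> Dict[int, Set[int]]:
    """Learn first-order rules: set at time t -> sets observed at time t+1."""
    rules: Dict[int, Set[int]] = {}
    for prev, nxt in zip(train, train[1:]):
        rules.setdefault(nearest(prev, centers), set()).add(nearest(nxt, centers))
    return rules


def one_step(x: float, centers: Sequence[float], rules: Dict[int, Set[int]]) -> float:
    """Forecast t+1 as the mean center of the consequent sets; fall back to the current set."""
    i = nearest(x, centers)
    rhs = rules.get(i, {i})
    return sum(centers[j] for j in rhs) / len(rhs)


def validation_mape(npart: int, train: Sequence[float], valid: Sequence[float]) -> float:
    """Fit the toy model with `npart` partitions and score one-step forecasts on `valid`."""
    lo, hi = min(train), max(train)
    step = (hi - lo) / (npart - 1)
    centers: List[float] = [lo + i * step for i in range(npart)]
    rules = fit_rules(train, centers)
    # One-step-ahead: each forecast uses the actual previous value as its lag.
    forecasts = [one_step(x, centers, rules) for x in valid[:-1]]
    return mape(valid[1:], forecasts)


# Synthetic, strictly positive series with a period of 12 (illustrative data only).
series = [10.0 + 5.0 * math.sin(2 * math.pi * t / 12) for t in range(120)]
train, valid = series[:96], series[96:]

# Exhaustive search over a small, hand-picked set of candidate values.
scores = {npart: validation_mape(npart, train, valid) for npart in (3, 5, 10, 20, 40)}
best_npart = min(scores, key=scores.get)
print(scores)
print(best_npart)
```

Scoring on data held out from training matters here, because a high partition count can look excellent on the training set while overfitting it.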

I hope this helps, and feel free to get in touch with any questions!

@jermaine1ronquillo
Author

Thanks for your time and effort!
Here is my code; basically, I copied your example.

Regards

import matplotlib.pyplot as plt
from pyFTS.partitioners import Grid
from pyFTS.models import hofts

# data_mod is assumed to be a pandas DataFrame with a DatetimeIndex and a 'Turbidity' column
train = data_mod.loc['2012':'2016', 'Turbidity'].values
test = data_mod.loc['2017', 'Turbidity'].values

# Partition the universe of discourse into 35 fuzzy sets and plot them
fs = Grid.GridPartitioner(data=train, npart=35)
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[25, 5])
fs.plot(ax)

# Fit a first-order model
model1 = hofts.HighOrderFTS(order=1, partitioner=fs)
model1.fit(train)
print(model1)

# One-step-ahead forecasts on the original test data
forecasts = model1.predict(test)
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[15, 5])
ax.plot(test[80:100], label='test')
ax.plot(forecasts[80:100], label='forecast')
ax.legend()

# Set one extreme event to 10 and forecast again
test_mod = data_mod.loc['2017', 'Turbidity'].copy()
test_mod['2017-04-04'] = 10
forecasts = model1.predict(test_mod)
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[15, 5])
ax.plot(test_mod.values[80:100], label='test')
ax.plot(forecasts[80:100], label='forecast')
ax.legend()

# Set the following day to 10 as well and forecast again
test_mod['2017-04-05'] = 10
forecasts = model1.predict(test_mod)
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=[15, 5])
ax.plot(test_mod.values[80:100], label='test')
ax.plot(forecasts[80:100], label='forecast')
ax.legend()

@petroniocandido
Collaborator

Looks fine to me!

Try higher order models to improve the accuracy.

Best regards!

@ramdhan1989

Do you have example code that applies a hyperparameter optimization technique to the pyFTS package?

@petroniocandido
Collaborator

Hi @ramdhan1989!

Please check issue #30.
