Programmatically query the available model results #359
Comments
>>> from smif.data_layer import Results
# pass arguments sufficient to create a Store behind the scenes, which we'll have read-only access to
>>> results = Results({'interface': 'local_csv', 'directory': path_to_project})
>>> results.list_model_runs()
['energy_central', 'energy_water_cp_cr']
# writing out this example for concreteness - there may be a more convenient data structure shape...
>>> results.available_results('energy_central')
{
    'model_run': 'energy_central',
    'sos_model': 'energy',
    'models': [
        {
            'name': 'energy_demand',
            'outputs': [
                {
                    'name': 'cost',
                    'decision_timesteps': {
                        1: [2010],
                        2: [2010, 2015]
                    },
                },
                {
                    'name': 'water_demand',
                    'decision_timesteps': {
                        1: [2010],
                        2: [2010, 2015]
                    }
                }
            ]
        }
    ]
}
>>> da = results.read(modelruns=['energy_central'], timesteps=[2010, 2015, 2020], output_names=['water_demand'])
>>> da.as_df()
<returns correctly formatted pandas.DataFrame, including columns for timestep,modelrun,decision>
Later, maybe:
Also, I don't think that output names need to be unique across all models. We may need to specify the |
[thinking out loud] Do decision iterations/timesteps differ across outputs of a model, aside from there being 'no results'? |
Only if there's some partial output or failure - I'd expect all the outputs to be present for each |
Here's a straw man alternative. It doesn't quite have all the information (i.e. it doesn't tell you about those partial outputs), but it is much more compact:
>>> results.available_results('energy_central')
{
    'model_run': 'energy_central',
    'sos_model': 'energy',
    'model_outputs': [
        ('energy_demand', 'cost'),
        ('energy_demand', 'water_demand')
    ],
    'decision_timesteps': [
        (1, 2010),
        (2, 2010),
        (3, 2015)
    ]
}
I think that's better - we could always raise a warning to the user about partial results but use this compact form. |
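To make the trade-off concrete, here is a minimal sketch of collapsing the nested form into the compact form while warning about partial results. The function name `compact_results` and the exact dict shapes are assumptions taken from the examples in this thread, not part of smif's API.

```python
import warnings


def compact_results(nested):
    """Collapse the nested 'available results' dict into the compact form.

    Hypothetical helper: assumes the nested shape proposed in this thread.
    """
    model_outputs = []
    seen = {}  # (model, output) -> set of (decision, timestep) pairs
    for model in nested['models']:
        for output in model['outputs']:
            key = (model['name'], output['name'])
            model_outputs.append(key)
            seen[key] = {
                (decision, timestep)
                for decision, timesteps in output['decision_timesteps'].items()
                for timestep in timesteps
            }

    # Union of every (decision, timestep) pair reported by any output
    all_pairs = set().union(*seen.values()) if seen else set()

    # Warn about any output that lacks pairs which other outputs have
    for key, pairs in seen.items():
        missing = all_pairs - pairs
        if missing:
            warnings.warn(
                "Partial results for {}: missing {}".format(key, sorted(missing)))

    return {
        'model_run': nested['model_run'],
        'sos_model': nested['sos_model'],
        'model_outputs': model_outputs,
        'decision_timesteps': sorted(all_pairs),
    }


nested = {
    'model_run': 'energy_central',
    'sos_model': 'energy',
    'models': [{
        'name': 'energy_demand',
        'outputs': [
            {'name': 'cost', 'decision_timesteps': {1: [2010], 2: [2010, 2015]}},
            {'name': 'water_demand', 'decision_timesteps': {1: [2010], 2: [2010, 2015]}},
        ],
    }],
}
compact = compact_results(nested)
```

With complete results, as here, no warning is raised; an output missing one of the pairs would trigger one while the compact summary stays the same shape.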
The way we handled things is very close to Tom's suggestion, namely
The |
Hi @tlestang - this looks like an excellent first go at the problem and provides the functionality we need. I guess the next steps are to think about how we expose this to a user? I think Tom's suggestion for a read-only wrapper around the store seems sensible. We don't want to allow users to edit the results, delete files etc. accidentally while making plots! With this in mind, it would be worth taking a look at DataHandle. I would imagine that users may want to use an interactive Python environment for analysing results and writing scripts against the data to produce plots. So the
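As an illustration of the read-only wrapper idea, here is a minimal sketch. `InMemoryStore`, `ReadOnlyResults` and all of their method names are hypothetical stand-ins for illustration; they are not smif's Store or DataHandle API.

```python
class InMemoryStore:
    """Toy stand-in for a real store, with both read and write methods."""

    def __init__(self):
        self._data = {}

    def write_results(self, model_run, output_name, values):
        self._data[(model_run, output_name)] = list(values)

    def read_results(self, model_run, output_name):
        return self._data[(model_run, output_name)]

    def list_model_runs(self):
        return {run for run, _ in self._data}


class ReadOnlyResults:
    """Expose only the read operations of an underlying store."""

    def __init__(self, store):
        self._store = store

    def list_model_runs(self):
        return sorted(self._store.list_model_runs())

    def read(self, model_run, output_name):
        # Return a copy so callers cannot mutate the stored data in place
        return list(self._store.read_results(model_run, output_name))


store = InMemoryStore()
store.write_results('energy_central', 'water_demand', [1.0, 2.0])

results = ReadOnlyResults(store)
values = results.read('energy_central', 'water_demand')
values.append(99.0)  # mutating the copy leaves the store untouched
```

The wrapper simply does not define any write or delete methods, so a plotting script holding `results` has no accidental route to modifying the data.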
I've been trying to modify our |
Hi @tlestang - good question. I think we can keep it simple to start: On the other hand, note that xarray has been a big influence on the design of |
That's great, I always wanted to have a look at xarray! I guess now is the time. |
Comments regarding 4b886ed
I'll summarise the discussion from Slack. The main issue with having model_run as a dimension in the Spec is that there may well be different (timestep, decision)-pairs for different model runs. This means that there is no neat way to encode that information as a dimension: you would have to pad the output as necessary so that every (timestep, decision)-pair appears for each model run. Instead, we can easily present the data in the form of a dataframe with a column for model_run. To do this, the This change is made in ac88b36.
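The padding argument can be sketched with plain Python records. The helper `to_rows`, the input shape and the numeric values are all made up for illustration; the resulting list of dicts is the kind of structure that `pandas.DataFrame` accepts directly.

```python
def to_rows(results_by_run):
    """Flatten {model_run: {(timestep, decision): value}} into row dicts."""
    rows = []
    for model_run, values in sorted(results_by_run.items()):
        for (timestep, decision), value in sorted(values.items()):
            rows.append({
                'model_run': model_run,
                'timestep': timestep,
                'decision': decision,
                'value': value,
            })
    return rows


# Illustrative values only: the two runs have different (timestep, decision) pairs
results_by_run = {
    'energy_central': {(2010, 1): 3.0, (2010, 2): 3.5, (2015, 2): 4.0},
    'energy_water_cp_cr': {(2010, 1): 2.5},
}
rows = to_rows(results_by_run)

# A dense array with model_run as a dimension would need a cell for every
# combination of run and pair, including the ones that have no result
all_pairs = {pair for values in results_by_run.values() for pair in values}
padded_cells = len(results_by_run) * len(all_pairs)
```

Here the long form holds four rows, one per result that exists, whereas the padded array would need six cells, two of them empty.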
In terms of future work:
Not quite sure what you mean here - do you mean literally re-ordering the columns, so that you have
This would be straightforward now, too - handling multiple outputs (presuming the specs are the same for each) would essentially add an additional column to the resulting dataframe. Units (presumably) could be added as an additional column per output. |
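A rough sketch of the "additional column per output" idea, again using plain records rather than any real smif call. `merge_outputs` and the row layout are assumptions for illustration, and the values are invented.

```python
def merge_outputs(rows_by_output):
    """Combine per-output rows into one row per (model_run, timestep, decision).

    Each output contributes one extra column named after that output.
    """
    merged = {}
    for output_name, rows in rows_by_output.items():
        for row in rows:
            key = (row['model_run'], row['timestep'], row['decision'])
            record = merged.setdefault(key, {
                'model_run': row['model_run'],
                'timestep': row['timestep'],
                'decision': row['decision'],
            })
            record[output_name] = row['value']
    return [merged[key] for key in sorted(merged)]


# Illustrative values only
rows_by_output = {
    'cost': [
        {'model_run': 'energy_central', 'timestep': 2010, 'decision': 1, 'value': 10.0},
    ],
    'water_demand': [
        {'model_run': 'energy_central', 'timestep': 2010, 'decision': 1, 'value': 3.0},
    ],
}
table = merge_outputs(rows_by_output)
```

Because both outputs share the same index columns, they land in a single row with one column each; a units column per output could be attached in the same way.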
…ilability of quieried output as well as dimensionality Issue #359
@willu47 a quick update:
Could you give it another try and see if it's working for you? |
Hi @fcooper8472 - many thanks. Almost there!
Thanks for the feedback - the latest commit on the PR removes the units in column names. |
Closed by #367 |
Child issue of #350
Query the available model results across various levels in the hierarchy of:
Users should be able to fix one or more of the above levels and receive a multi-dimensional array of data that represents the unfixed data.
Suggest the following interface: