Fix general observations fmuobs for dictionaries #648

daniel-sol · 2024-01-12T13:32:28Z

General observations are not written when choosing yaml option, this PR Fixes this
Fixes issue #351

Updated info 18/1 2024.
Some background info. The general observations are different to the observation types in that they are not directly contained in the observation file, but contained in separate files only referenced in the observation file. This is seemingly relatively straight forward, with these files being simple text files containing two columns (without any header). The first being the observation value, and the second being the value error, or uncertainty, in one file generally called something.obs. Here a general observation line contains one reference to a file, and this is then treated as one atomic observation with a lable, corresponding roughly to summary observation.

But there is not really any proper datastructure behind this. This is clearly seen in the case of rft data which deviates strongly from this pattern, here to get the complete picture of what is going on there are are three files that need to be accessed per observation:

The observations, contained in a file with value and error pairs, e.g. observation.obs
The spatial assignment to the observations, in a twin file called observations.txt i.e x y md tvd zone (But no headers)
The name of the well contained in a file generally called well_dates.txt, but could be called something else entirely, see the some_field rft observations contained in the somewhere/completely/different/rft_ERT_use_MDadjusted/ folder.

There is really no logic, all this is patched together in an gen_data job in the ert config file.
To make it worse general observations can be used for absolutely anything, but there is nothing hindring people from bunging everything into the same folder as the observation file, but the convention, but not requirement from ert's side is that these datatypes are group into folders, with sometimes intelligible names. This straw to clutch is in this PR used as the key, because without it one would be completely in the dark for what the general observation is.

Hope this was somewhat informative, please get in touch if you want a chat..

codecov-commenter · 2024-01-12T13:45:32Z

Codecov Report

Attention: 12 lines in your changes are missing coverage. Please review.

Comparison is base (6e675c1) 86.13% compared to head (d43c9d0) 86.28%.
Report is 4 commits behind head on main.

❗ Current head d43c9d0 differs from pull request most recent head 6a4c3a0. Consider uploading reports for the commit 6a4c3a0 to get more accurate results

Files	Patch %	Lines
src/subscript/fmuobs/gen_obs_writers.py	92.12%	10 Missing ⚠️
src/subscript/pack_sim/pack_sim.py	60.00%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #648      +/-   ##
==========================================
+ Coverage   86.13%   86.28%   +0.15%     
==========================================
  Files          50       51       +1     
  Lines        7069     7234     +165     
==========================================
+ Hits         6089     6242     +153     
- Misses        980      992      +12

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mferrera · 2024-01-17T11:51:46Z

This PR adds a significant amount of text test data. It is a bit hard to tell because there are quite a few files, but are all of them needed?

daniel-sol · 2024-01-18T14:22:39Z

This PR adds a significant amount of text test data. It is a bit hard to tell because there are quite a few files, but are all of them needed?
Most of the files are needed, but I will remove some of them, e.g the files in folder welltest. They are not used in the testing. Regarding the other files I would rather keep them for some granularity, but just get in touch if it is unclear what they do..

for later integration testing

Pave the way for reading general observations

Make debugging easier

Pave way for correct reading of rft observations

Utilize defaults in dump_content_to_dict

Unused

alifbe

Hi @daniel-sol

Instead of assuming observation type based on folder name, we would recommend to probably let user specify it directly. It could be in the form of dictionary file that map observation key to observation type. Or perhaps a comment in ERT observation config. This will ensure that there is no guessing which might be prone to error.

The code can be extended to give warning when observation is not defined in the dictionary or vice versa when it exist in the dictionary but not in the ERT observation file.

daniel-sol · 2024-01-26T10:35:01Z

Hi @daniel-sol

Instead of assuming observation type based on folder name, we would recommend to probably let user specify it directly. It could be in the form of dictionary file that map observation key to observation type. Or perhaps a comment in ERT observation config. This will ensure that there is no guessing which might be prone to error.

The code can be extended to give warning when observation is not defined in the dictionary or vice versa when it exist in the dictionary but not in the ERT observation file.

I am a bit reluctant to do that. I do see your point, but one of my philosophies is to not add too much to the burden of the end user. This would be one of those things, what we could to is to introduce something an option to add something to your fmu-config if you want to fix it? But I don't want it to be a requirement.

See last commit. Where I have added the possibility of changing names from the fmu_config, and I have also added that if a user hasn't put the file in a folder it will end up with a key called unspecified, and with a strong warning..

Fix after deprecated types are removed

Missing ; made testdata incorrect

asnyv · 2024-01-30T12:00:13Z

Agree with @alifbe that the test data here seems way larger than needed. Most of it seems like with repeated lines of the same formatting, and/or several files of the same type? Has to be possible to cut these drastically?
In most cases it should be possible to test the same thing with 5 lines of test data as with 15000 lines.
The length of the data also makes it cumbersome to look at whether the setup seems reasonable.
Think we should also test this locally with as many different models as possible prior to merge to see if the current solution is robust.

Might be confusing for test users

alifbe · 2024-02-02T07:22:51Z

@daniel-sol could you please rebase your PR to main branch? I think the workflow is not triggered anymore since your PR went stale.

anders-kiaer · 2024-02-02T09:27:10Z

Will have a chat with @sondreso on the GENOBS topic in February. Suggestion from Daniel and myself is that we wait with merge + adding GENOBS support until after that meeting 🙂

anders-kiaer · 2024-03-20T10:35:51Z

FYI: Summary of yesterday's workshop (which included representatives from ERT, fmu-dataio, SUMO, webviz and subscript) on observations is summarized in #683 by @daniel-sol and myself.

anders-kiaer · 2024-05-02T11:04:08Z

This PR will be superseded by a new PR currently in the making.

daniel-sol force-pushed the fix_general_observations_fmuobs branch from 0cf4cf3 to 4c777e8 Compare January 12, 2024 14:42

daniel-sol force-pushed the fix_general_observations_fmuobs branch from 993e3b5 to cf974dc Compare January 17, 2024 14:29

daniel-sol added 25 commits January 19, 2024 10:51

Add full suite of drogon test data

a503cbc

Make one drogon obs file

f056de5

for later integration testing

Add test data specifically for general observations

11d1e60

Introduce parent_dir in function

17c18e7

Pave the way for reading general observations

Add log statement if no includes

a88a034

Add reader for general observation files

165577b

Introduce parent_dir in function

9bb9646

Introduce reading of general observations

2e42d6f

Align test with drogon changes

378ebf6

Add assert messages

721a3fe

Make debugging easier

Focus tests around general_df2obsdict

2abd96d

Add drogon_tracer for general_df2obsdict

36f6790

Modify test data to allow for general observations

2e0db6d

Add more keys for general observations

e37ebaf

Remove writing of test data

bbe8867

Rename key

be4fc35

Broaden test of general_df2obsdict

cac2a8d

Add functionality for reading twin file

39f0854

Pave way for correct reading of rft observations

Fix only one key in results

b721887

Utilize defaults in dump_content_to_dict

Add more specific test data for general_df2obsdict

841ab1f

Rename readers to gen_obs_writers

e285ca4

Move assignment of extra keys to right level

992c6de

Tidy test data

77e9a02

Give tests more informative name

d677c0b

Add first rough sketch of adding well name and date for rft

67a333f

daniel-sol added 10 commits January 19, 2024 10:51

Ignore type error

4086150

Remove unused imports

a145cb4

correct wrong assignment by swapping md and z

298c45b

Rename observations to value

a8f1cff

Align with existing format for summary

af6c6f7

Adapt test data to breaking changes

cdd6a67

Correct grammar

33fc2a2

Remove dump of temp file

655c675

Align tests with latest

f87dca9

Use more neutral name

6cf15a0

daniel-sol force-pushed the fix_general_observations_fmuobs branch from cf974dc to 6cf15a0 Compare January 19, 2024 10:10

daniel-sol added 2 commits January 19, 2024 11:12

Remove welltest test data

c2a0792

Unused

Convert relative path to absolute path

493316b

alifbe requested changes Jan 26, 2024

View reviewed changes

daniel-sol added 2 commits January 26, 2024 15:13

Add option for renaming

48e956e

Ensure tests runs

c04d1db

Fix after deprecated types are removed

daniel-sol requested a review from alifbe January 26, 2024 14:49

daniel-sol added 3 commits January 29, 2024 08:59

Isort

6a4c3a0

Fix test data

f791d28

Missing ; made testdata incorrect

Black

25f7d99

daniel-sol added 2 commits January 30, 2024 15:32

Tidy test data and test

e0b0467

Convert set into list

25718a9

Might be confusing for test users

alifbe requested a review from rnyb February 2, 2024 09:14

anders-kiaer closed this May 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix general observations fmuobs for dictionaries #648

Fix general observations fmuobs for dictionaries #648

daniel-sol commented Jan 12, 2024 •

edited

Loading

codecov-commenter commented Jan 12, 2024 •

edited

Loading

mferrera commented Jan 17, 2024

daniel-sol commented Jan 18, 2024

alifbe left a comment

daniel-sol commented Jan 26, 2024 •

edited

Loading

asnyv commented Jan 30, 2024

alifbe commented Feb 2, 2024 •

edited

Loading

anders-kiaer commented Feb 2, 2024

anders-kiaer commented Mar 20, 2024 •

edited

Loading

anders-kiaer commented May 2, 2024

Fix general observations fmuobs for dictionaries #648

Fix general observations fmuobs for dictionaries #648

Conversation

daniel-sol commented Jan 12, 2024 • edited Loading

codecov-commenter commented Jan 12, 2024 • edited Loading

Codecov Report

mferrera commented Jan 17, 2024

daniel-sol commented Jan 18, 2024

alifbe left a comment

Choose a reason for hiding this comment

daniel-sol commented Jan 26, 2024 • edited Loading

asnyv commented Jan 30, 2024

alifbe commented Feb 2, 2024 • edited Loading

anders-kiaer commented Feb 2, 2024

anders-kiaer commented Mar 20, 2024 • edited Loading

anders-kiaer commented May 2, 2024

daniel-sol commented Jan 12, 2024 •

edited

Loading

codecov-commenter commented Jan 12, 2024 •

edited

Loading

daniel-sol commented Jan 26, 2024 •

edited

Loading

alifbe commented Feb 2, 2024 •

edited

Loading

anders-kiaer commented Mar 20, 2024 •

edited

Loading