Update dataset requires updating the entire dataset #58

dylanmcreynolds · 2023-11-29T15:39:00Z

When doing an update I expected to be able to pass very minimal information to scicat. For instance, to change some metadata, my first try was calling client.update_dataset(Dataset(scientificMetadata= {"newkey":"newval"}, pid) . This fails because

it lacks required python fields so it fails pydantic validation
even after adding non-optional python members it fails backend validation due to some invalid defaults.

For updates is it expected that I first fetch the existing values, then submit an update with basically all fields filled? Do I need to manage "history" myself as well?

This is a limitation in the current pyscicat. The underlying method in the new backend is a patch method that has models for updates that make items optional. We should implement models for UpdateRawDatasetDto and UpdateDerivedDatasetDto and use those rather than the model for the dataset itself.

The text was updated successfully, but these errors were encountered:

sbliven · 2023-11-29T16:09:45Z

Thanks for the issue, @dylanmcreynolds.

Would you make the new methods take simple dicts as arguments (no model validation) or modify the models to make all fields optional?

Some other things I noticed experimenting with the API:

history should not be included in the PATCH. If omitted it gets filled automatically; if present it overrides previous history
Only the top-level keys are merged. Content-Type: application/merge-patch+json is not supported. Nested directories are overridden, not merged. So scientificMetadata always needs to be updated as a whole.
contactEmail is set to "" in GET requests from normal users (for GDPR reasons?). Thus updates should be careful to set this to None so it doesn't try to update it.

I should also mention I'm testing against backend v3. I'm not sure if the field validation changed in v4.

dylanmcreynolds · 2023-11-29T16:56:54Z

Would you make the new methods take simple dicts as arguments (no model validation) or modify the models to make all fields optional?

I think we should make models that mimic the models that the server enforces. Random dicts could pass client validation but then fail on the server. The way I read the server's openapi spec, the server can take PartialUpdateRawDatasetDto or PartialUpdateDerivedDatasetDto. These models list the allowed fields, but make them all optional.

dylanmcreynolds · 2023-11-29T16:59:02Z

I should also mention I'm testing against backend v3. I'm not sure if the field validation changed in v4.

Yeah, this is tricky, but your observations about history and such may not apply to v4. The models I referenced in my previous comment really may only apply to v4 backend, which is we would point this enhancement.

Update methods should really take partial models with all-optional contents (SciCatProject#58). Rather than fixing that issue, let's just ignore the typing error for now.

sbliven mentioned this issue Nov 29, 2023

Check typing with mypy #57

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update dataset requires updating the entire dataset #58

Update dataset requires updating the entire dataset #58

dylanmcreynolds commented Nov 29, 2023

sbliven commented Nov 29, 2023

dylanmcreynolds commented Nov 29, 2023

dylanmcreynolds commented Nov 29, 2023 •

edited

Loading

Update dataset requires updating the entire dataset #58

Update dataset requires updating the entire dataset #58

Comments

dylanmcreynolds commented Nov 29, 2023

sbliven commented Nov 29, 2023

dylanmcreynolds commented Nov 29, 2023

dylanmcreynolds commented Nov 29, 2023 • edited Loading

dylanmcreynolds commented Nov 29, 2023 •

edited

Loading