Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dataset requires updating the entire dataset #58

Open
dylanmcreynolds opened this issue Nov 29, 2023 · 3 comments
Open

Update dataset requires updating the entire dataset #58

dylanmcreynolds opened this issue Nov 29, 2023 · 3 comments

Comments

@dylanmcreynolds
Copy link
Member

When doing an update I expected to be able to pass very minimal information to scicat. For instance, to change some metadata, my first try was calling client.update_dataset(Dataset(scientificMetadata= {"newkey":"newval"}, pid) . This fails because

  1. it lacks required python fields so it fails pydantic validation
  2. even after adding non-optional python members it fails backend validation due to some invalid defaults.

For updates is it expected that I first fetch the existing values, then submit an update with basically all fields filled? Do I need to manage "history" myself as well?

This is a limitation in the current pyscicat. The underlying method in the new backend is a patch method that has models for updates that make items optional. We should implement models for UpdateRawDatasetDto and UpdateDerivedDatasetDto and use those rather than the model for the dataset itself.

@sbliven
Copy link
Member

sbliven commented Nov 29, 2023

Thanks for the issue, @dylanmcreynolds.

Would you make the new methods take simple dicts as arguments (no model validation) or modify the models to make all fields optional?

Some other things I noticed experimenting with the API:

  • history should not be included in the PATCH. If omitted it gets filled automatically; if present it overrides previous history
  • Only the top-level keys are merged. Content-Type: application/merge-patch+json is not supported. Nested directories are overridden, not merged. So scientificMetadata always needs to be updated as a whole.
  • contactEmail is set to "" in GET requests from normal users (for GDPR reasons?). Thus updates should be careful to set this to None so it doesn't try to update it.

I should also mention I'm testing against backend v3. I'm not sure if the field validation changed in v4.

@dylanmcreynolds
Copy link
Member Author

Would you make the new methods take simple dicts as arguments (no model validation) or modify the models to make all fields optional?

I think we should make models that mimic the models that the server enforces. Random dicts could pass client validation but then fail on the server. The way I read the server's openapi spec, the server can take PartialUpdateRawDatasetDto or PartialUpdateDerivedDatasetDto. These models list the allowed fields, but make them all optional.

@dylanmcreynolds
Copy link
Member Author

dylanmcreynolds commented Nov 29, 2023

I should also mention I'm testing against backend v3. I'm not sure if the field validation changed in v4.

Yeah, this is tricky, but your observations about history and such may not apply to v4. The models I referenced in my previous comment really may only apply to v4 backend, which is we would point this enhancement.

sbliven added a commit to sbliven/pyscicat that referenced this issue Nov 30, 2023
Update methods should really take partial models with all-optional
contents (SciCatProject#58). Rather than fixing that issue, let's just ignore
the typing error for now.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants