Extending the list of standard response formats (i.e. CIF, POSCAR) #343

Closed
merkys opened this issue Dec 9, 2020 · 5 comments
Labels
status/has-concrete-suggestion This issue has one or more concrete suggestions spelled out that can be brought up for consensus. topic/response-format Issue discussing changes and improvements to the API response format type/proposal Proposal for addition/removal of features. May need broad discussion to reach consensus.

Comments

@merkys
Member

merkys commented Dec 9, 2020

There have been discussions about exposing CIF data (#329) or any entry-related files (#211) through OPTIMADE. I have now remembered the provision for an endpoint-specific response_format, which could be used to ask for specific representations of entries or their collections. Up to now OPTIMADE supports only the json response format. I suggest standardizing other optional response formats that clients could use to retrieve representations of entries; my immediate suggestions would be CIF (cif, the native structure data format of the COD) and POSCAR (poscar). Upon requesting response_format=cif, a client would receive a plain CIF file for an entry (or several entries) and could use it directly in calculations etc.
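To make the suggestion concrete, here is a minimal sketch of how a client might build such a request. The base URL and entry ID are hypothetical, and `cif` is only the value proposed here, not one that OPTIMADE has standardized; only the `response_format` query parameter itself comes from the specification.

```python
from urllib.parse import urlencode


def build_entry_url(base_url, entry_type, entry_id, response_format="json"):
    """Build a single-entry URL that asks for a given response format.

    The "/v1/<entry_type>/<id>" layout follows the usual OPTIMADE URL scheme;
    any response_format other than "json" is an assumption of this sketch.
    """
    query = urlencode({"response_format": response_format})
    return f"{base_url.rstrip('/')}/v1/{entry_type}/{entry_id}?{query}"


# Hypothetical provider URL and entry ID, used for illustration only.
url = build_entry_url("https://example.org/optimade", "structures", "1000000", "cif")
print(url)
```

A server that does not support the requested format would be expected to respond with an error rather than fall back silently, so a real client should check the response status before using the body.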

It may also be that I have misunderstood the intention of response_format. Maybe it is supposed to specify a carrier format for OPTIMADE data. Then the foreseen additions to the existing json would be xml and yaml, which would carry the same property information wrapped in their specific syntax. If this is the case, then #211 is worth pursuing to provide other representations of entries instead.

@rartino
Contributor

rartino commented Dec 9, 2020

I think this is a very nice idea, and very much in line with what I imagine response_format is useful for.

Edit: upon re-reading what you wrote after having formulated my reply, I realize that you are basically asking whether response_format is meant to be what I describe below (= "a transport format for OPTIMADE data"), or if it is okay to 'just return a cif file related to the entry'. I'd say that the intent for response_format is to be official representations of OPTIMADE data, but I don't see why cif and even POSCAR couldn't be that, if standardized properly.

To actually include response_format=cif in the standard requires a bit of work. IMO we must specify exactly how all standard OPTIMADE entries are mapped into the cif format, including things like relationships between entries (although, for anything optional, it could say that response_format=cif does not support it and it MUST be omitted for now). This is to ensure that every database returns consistent cif output. You may say "cif is already a standard", and obviously the mapping should adhere completely to that standard. But there are definitely things in the OPTIMADE -> cif mapping that aren't obvious, and it would be a shame to end up with 43 dialects of cif output all using their own representation of, e.g., assemblies.
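As a rough illustration of why the mapping needs to be written down, here is a sketch that turns a few standard OPTIMADE structure properties (`lattice_vectors`, `cartesian_site_positions`, `species_at_sites`) into a minimal CIF. It is not a proposed standard mapping: the choice of P 1 symmetry, the atom labels, the number formatting, and the assumption that species names are element symbols are all decisions made up for this example, and each is a place where two databases could diverge.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _length(v):
    return math.sqrt(_dot(v, v))


def _angle_deg(u, v):
    cosine = _dot(u, v) / (_length(u) * _length(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def cartesian_to_fractional(lattice, position):
    """Express a Cartesian position in the basis of the three lattice vectors."""
    a, b, c = lattice
    volume = _dot(a, _cross(b, c))
    return (
        _dot(position, _cross(b, c)) / volume,
        _dot(position, _cross(c, a)) / volume,
        _dot(position, _cross(a, b)) / volume,
    )


def structure_to_cif(entry_id, attributes):
    """Map a fully ordered, 3D-periodic OPTIMADE structure to a minimal CIF.

    Assumptions of this sketch: no symmetry is recovered (P 1), each species
    name is an element symbol, and assemblies, partial occupancies and
    relationships are not represented at all.
    """
    lattice = attributes["lattice_vectors"]
    a, b, c = lattice
    lines = [
        f"data_{entry_id}",
        f"_cell_length_a {_length(a):.4f}",
        f"_cell_length_b {_length(b):.4f}",
        f"_cell_length_c {_length(c):.4f}",
        f"_cell_angle_alpha {_angle_deg(b, c):.4f}",
        f"_cell_angle_beta {_angle_deg(a, c):.4f}",
        f"_cell_angle_gamma {_angle_deg(a, b):.4f}",
        "_space_group_name_H-M_alt 'P 1'",
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    sites = zip(attributes["species_at_sites"], attributes["cartesian_site_positions"])
    for index, (species, position) in enumerate(sites, start=1):
        x, y, z = cartesian_to_fractional(lattice, position)
        lines.append(f"{species}{index} {species} {x:.5f} {y:.5f} {z:.5f}")
    return "\n".join(lines) + "\n"


# Made-up two-atom cubic cell, for illustration only.
example = {
    "lattice_vectors": [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
    "cartesian_site_positions": [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]],
    "species_at_sites": ["Na", "Cl"],
}
print(structure_to_cif("example_1", example))
```

Even this small case leaves open questions that a standardized mapping would have to answer, such as how to label sites, how to treat species with mixed occupancy, and what to do with structures that are not periodic in all three directions.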

May I suggest that you start with creating a response_format=_cod_cif to work out the details of this mapping. I'd be more than happy to discuss ideas about how to best do these various mappings, but the decisions would be all yours since this is under your namespace.

Then, at some future point, you'll come to OPTIMADE with your design and propose to adopt it as the standard for response_format=cif. Perhaps then we are happy to adopt it exactly as designed, or we will see a need to consider some adjustments to meet requirements of others in the consortium. But we can take that discussion at that point.
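From the client side, the path from a provider-specific format to a standard one could look roughly like the sketch below: prefer a standardized `cif` if a server advertises it, fall back to the provider's own `_cod_cif`, and otherwise use `json`. The function names and the preference order are assumptions of this example, and how a client obtains the list of advertised formats from a given server is deliberately left out.

```python
def is_provider_specific(response_format):
    """Provider-specific names carry a leading underscore, as in "_cod_cif"."""
    return response_format.startswith("_")


def choose_response_format(advertised_formats, preferences=("cif", "_cod_cif", "json")):
    """Return the first preferred format that the server advertises.

    Both arguments are plain sequences of strings; the default preference
    order is an assumption made for this sketch.
    """
    for candidate in preferences:
        if candidate in advertised_formats:
            return candidate
    raise ValueError("none of the preferred response formats is advertised")


# A server offering only json plus a COD-specific CIF format (illustrative).
chosen = choose_response_format(["json", "_cod_cif"])
print(chosen, is_provider_specific(chosen))
```

If `cif` were later adopted as a standard format, a client written this way would start using it without changes, while still working against servers that only offer the prefixed variant.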

@gmrigna
Contributor

gmrigna commented Dec 10, 2020

I also think that it is a very nice idea... It might be worth looking at some previous work by @merkys (https://github.com/cod-developers/CIF2JSON)

@merkys
Member Author

merkys commented Dec 11, 2020

I am mostly interested in standardizing OPTIMADE methods to access other resource representations, such as CIF and POSCAR for structures, BibTeX for references and so on. I am not much interested in expressing OPTIMADE-specific details (i.e., relationships and so on) in these representations, because OPTIMADE already has means for querying and retrieving this information.

Standardizing CIF is a tedious task on its own. COD contains CIF files from more than 43 sources, and we leave the standardization (more like interpretation) to the client, because standardization usually means making assumptions about data, and care should be taken not to overinterpret or invent any data. Of course, conservative inferences and standardizations may be applied, but I would rather leave them to the client. There are a lot of tools that prepare CIF files as inputs to QM and other calculations. I would prefer implementations to provide CIF files in any way that is compatible with the already existing CIF standard.

CIF2JSON would maybe be useful for representing CIF in JSON (something like response_format=cif+json), if there was a need for that as well.

@merkys merkys added status/has-concrete-suggestion This issue has one or more concrete suggestions spelled out that can be brought up for consensus. topic/response-format Issue discussing changes and improvements to the API response format type/proposal Proposal for addition/removal of features. May need broad discussion to reach consensus. labels Dec 17, 2020
@merkys
Member Author

merkys commented Jun 7, 2021

Revisiting this discussion and re-reading @rartino's comment I am more and more inclined to give this proposal up in favor of #211.

@merkys merkys mentioned this issue Jun 8, 2021
@merkys
Member Author

merkys commented Jun 3, 2022

As said before, I have conflated alternative OPTIMADE response formats and file representations of OPTIMADE entries into a single issue. Now that #360 is merged, I am closing this issue.

@merkys merkys closed this as completed Jun 3, 2022