Database

Brian Riley edited this page Jul 17, 2024 · 9 revisions

This system uses an AWS DynamoDB table. DynamoDB is a NoSQL document database, and the table stores both Provenance metadata and DMP metadata as JSON objects.

Sample Provenance item:

Provenance systems are external websites/systems that supply or alter DMP metadata.

The following JSON object represents a provenance system record. All provenance system records have a Partition Key (PK) that begins with the PROVENANCE# prefix and a Sort Key (SK) that is equal to PROFILE.

{
  "PK": "PROVENANCE#example",
  "SK": "PROFILE",
  "contact": {
    "email": "[email protected]",
    "name": "Example system administrator"
  },
  "description": "An external system",
  "downloadUri": "https://example.com/api/dmps/",
  "homepage": "https://example.com",
  "name": "Example System",
  "org_access_level": "restricted",
  "redirectUri": "https://example.com/callback",
  "seedingWithLiveDmpIds": true,
  "tokenUri": "https://example.com/oauth/token"
}

Explanation of Provenance item attributes:

  • PK - (required) A unique partition key for the external system. Note that it must start with PROVENANCE#
  • SK - (required) The sort key (do not change this)
  • contact - (required) The primary technical contact for the external system (displayed in the UI)
  • description - A description of the external system (displayed in the UI)
  • downloadUri - The endpoint that the DMPHub can use to download the DMP as a PDF (if applicable). The specific location of the PDF is embedded in the DMP's JSON as a dmproadmap_related_identifier with the "descriptor": "is_metadata_for" and "work_type": "output_management_plan". Note that the system will first check that the target of that related identifier matches the downloadUri defined here (e.g. https://example.org/dmps/download/). If it does not match, an error is raised. This prevents downloads from unknown/unverified locations
  • homepage - (required) The landing page for the external system (displayed in the UI)
  • name - (required) The name of the external system (displayed in the UI)
  • org_access_level - (required) The access level for the provenance system. "all": the system can query and mutate all DMPs, "restricted": the system can query and mutate any of the DMPs associated with the ROR ids listed in the corresponding SSM param /uc3/dmp/tool/provenance/[provenance-name]/ror_list, "public": the system may only query (NO mutations) public DMPs
  • redirectUri - The URI that the DMPHub can use to send updates about the DMP. For example, if the DMPHub learns of a grant ID that was associated with the DMP, it will send that information back to the external system via this URI.
  • seedingWithLiveDmpIds - Flag that can be used when seeding DMPs from an external system to the DMPHub. When it is set, the DMPHub uses the provided DMP ID instead of minting a new one with EZID. Note that the DMP ID targets would then need to be updated with the minting authority so that they point to the new DMPHub landing page. (default is false)
  • tokenUri - The endpoint the DMPHub should use to obtain an access token that can be used when calling the downloadUri and redirectUri (if applicable). The tokenUri works in conjunction with 2 SSM parameters (the 'example' segment must match the PK value for the item!): /uc3/dmp/hub/dev/example/client_id, /uc3/dmp/hub/dev/example/client_secret
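
The downloadUri check described in the list above can be sketched in Python. This is a minimal sketch, not the DMPHub's actual code: it assumes that "matches" means the related identifier starts with the downloadUri (the real comparison may differ), and the function names find_pdf_identifier and verified_download_target are hypothetical.

```python
from typing import Optional


def find_pdf_identifier(dmp: dict) -> Optional[str]:
    """Return the related identifier that points at the DMP's PDF, if any."""
    for rel in dmp.get("dmproadmap_related_identifiers", []):
        if (rel.get("descriptor") == "is_metadata_for"
                and rel.get("work_type") == "output_management_plan"):
            return rel.get("identifier")
    return None


def verified_download_target(provenance: dict, dmp: dict) -> Optional[str]:
    """Return the PDF location only if it sits under the provenance downloadUri.

    Assumption: a "match" is a simple prefix comparison.
    """
    download_uri = provenance.get("downloadUri")
    target = find_pdf_identifier(dmp)
    if download_uri is None or target is None:
        return None  # nothing to download
    if not target.startswith(download_uri):
        # Refuse to fetch from unknown/unverified locations.
        raise ValueError(f"{target} is not under {download_uri}")
    return target


if __name__ == "__main__":
    # Illustrative values only.
    provenance = {"downloadUri": "https://example.com/api/dmps/"}
    dmp = {
        "dmproadmap_related_identifiers": [
            {
                "descriptor": "is_metadata_for",
                "work_type": "output_management_plan",
                "identifier": "https://example.com/api/dmps/989898.pdf",
            }
        ]
    }
    print(verified_download_target(provenance, dmp))
```

Raising an error instead of silently skipping the download mirrors the behavior described for downloadUri, where a mismatch is treated as an error.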

Sample DMP item (complete metadata):

The following JSON object represents a DMP item in the Dynamo table.

{
  "PK": "DMP#doi.org/10.12345/A1.1A2B3C4D6",
  "SK": "VERSION#latest",
  "contact": {
    "contact_id": {
      "identifier": "https://orcid.org/0000-0000-0000-0000",
      "type": "orcid"
    },
    "dmproadmap_affiliation": {
      "affiliation_id": {
        "identifier": "https://ror.org/12344556",
        "type": "ror"
      },
      "name": "Example University"
    },
    "mbox": "[email protected]",
    "name": "Doe, Jane"
  },
  "contributor": [
    {
      "dmproadmap_affiliation": {
        "affiliation_id": {
          "identifier": "https://ror.org/12344556",
          "type": "ror"
        },
        "name": "Example University"
      },
      "contributor_id": {
        "identifier": "https://orcid.org/0000-0000-0000-0000",
        "type": "orcid"
      },
      "mbox": "[email protected]",
      "name": "Doe, Jane",
      "role": [
        "http://credit.niso.org/contributor-roles/data-curation",
        "http://credit.niso.org/contributor-roles/investigation"
      ]
    },
    {
      "dmproadmap_affiliation": {
        "affiliation_id": {
          "identifier": "https://ror.org/23864587935",
          "type": "ror"
        },
        "name": "Another University"
      },
      "mbox": "[email protected]",
      "name": "Else, Someone",
      "role": [
        "http://credit.niso.org/contributor-roles/project-administration"
      ]
    },
    {
      "name": "So PhD., So N.",
      "role": [
        "http://credit.niso.org/contributor-roles/investigation"
      ]
    }
  ],
  "cost": [
    {
      "currency_code": "USD",
      "title": "Preservation costs",
      "description": "The estimated costs for preserving our data for 20 years",
      "value": 10000
    }
  ],
  "created": "2021-11-08T19:06:04Z",
  "dataset": [
    {
      "dataset_id": {
        "identifier": "1550",
        "type": "other"
      },
      "data_quality_assurance": [
        "We will verify the quality of all data collected during this project through a third party."
      ],
      "description": "<p>A collection of radiographic images of coral.</p>",
      "distribution": [
        {
          "data_access": "open",
          "host": {
            "description": "The test data repository for oceanographic information.",
            "dmproadmap_host_id": {
              "identifier": "https://www.re3data.org/api/v1/repository/r3d0000000000000",
              "type": "url"
            },
            "title": "Generic Ocean Information Data Repository",
            "url": "http://example.org/repo"
          },
          "license": [
            {
              "license_ref": "https://spdx.org/licenses/CC-BY-4.0.json",
              "start_date": "2021-05-18T00:00:00Z"
            }
          ],
          "title": "Anticipated distribution of coral images"
        }
      ],
      "issued": "2026-05-18T00:00:00Z",
      "keyword": [
        "Earth and related environmental sciences",
        "Coral"
      ],
      "metadata": [
        {
          "description": "Example Core - a test metadata standard",
          "metadata_standard_id": {
            "identifier": "https://rdamsc.bath.ac.uk/api2/2485ty247y7t9y429t4295t",
            "type": "url"
          }
        }
      ],
      "personal_data": "unknown",
      "preservation_statement": "The images will be deposited in a repository and made available until 2050",
      "security_and_privacy": [
        {
          "title": "Data security",
          "description": "We're going to encrypt this one."
        }
      ],
      "sensitive_data": "unknown",
      "technical_resource": [
        {
          "name": "Example University's thermal imaging camera 1234",
          "description": "A super powerful thermal imaging camera"
        }
      ],
      "title": "Images of brain coral time series",
      "type": "dataset"
    }
  ],
  "description": "<p>The example data management plan for the DMPHub.</p>",
  "dmphub_created_at": "2022-11-29T19:49:08+00:00",
  "dmphub_modification_day": "2022-11-29",
  "dmphub_modifications": [
    {
      "id": "ZYXW9876",
      "provenance": "datacite",
      "timestamp": "2023-07-27T15:08:32+07:00",
      "note": "data received from event data",
      "status": "pending",
      "dmproadmap_related_identifiers": [
        {
          "work_type": "dataset",
          "descriptor": "references",
          "type": "doi",
          "identifier": "https://dx.doi.org/77.6666/H5H5H5"
        },
        {
          "work_type": "paper",
          "descriptor": "is_cited_by",
          "type": "url",
          "identifier": "https://academic.site/papers/123"
        }
      ],
      "funding": {
        "name": "National Science Foundation",
        "funder_id": {
          "type": "ror",
          "identifier": "https://ror.org/021nxhr62"
        },
        "funding_status": "granted",
        "grant_id": {
          "identifier": "https://doi.org/11.1111/2019.22702-3",
          "type": "doi"
        }
      }
    },
    {
      "id": "ZYXW9878",
      "provenance": "datacite",
      "timestamp": "2023-07-27T15:08:52+07:00",
      "note": "data received from event data",
      "status": "accepted",
      "dmproadmap_related_identifiers": [
        {
          "work_type": "article",
          "descriptor": "is_cited_by",
          "type": "url",
          "identifier": "https://doi.org/22.33333/pubmed.1242345234"
        }
      ]
    }
  ],
  "dmphub_provenance_id": "PROVENANCE#example",
  "dmphub_provenance_identifier": "https://example.com/dmps/989898",
  "dmphub_updated_at": "2022-11-29T19:49:08+00:00",
  "dmproadmap_external_system_identifier": "989898",
  "dmproadmap_privacy": "public",
  "dmproadmap_related_identifiers": [
    {
      "descriptor": "describes",
      "identifier": "https://doi.org/10.21966/1.566666",
      "type": "doi",
      "work_type": "dataset"
    },
    {
      "descriptor": "references",
      "identifier": "https://doi.org/10.5281/zenodo.5719523",
      "type": "doi",
      "work_type": "article"
    },
    {
      "descriptor": "is_metadata_for",
      "identifier": "https://example.com/api/v2/dmps/989898.pdf",
      "type": "url",
      "work_type": "output_management_plan"
    }
  ],
  "dmp_id": {
    "identifier": "https://doi.org/10.12345/A1.1A2B3C4D6",
    "type": "doi"
  },
  "ethical_issues_description": "We will need to ensure that we anonymize our data",
  "ethical_issues_exist": "yes",
  "ethical_issues_report": "https://example.edu/privacy_policy",
  "language": "eng",
  "modified": "2022-11-14T22:18:18Z",
  "project": [
    {
      "description": "Our sample project for the DMPHub.",
      "end": "2024-11-29T19:48:57Z",
      "funding": [
        {
          "funder_id": {
            "identifier": "https://ror.org/0000000000",
            "type": "ror"
          },
          "funding_status": "granted",
          "name": "National Funding Institute",
          "grant_id": {
            "type": "other",
            "identifier": "34562356"
          }
        }
      ],
      "start": "2015-05-12T00:00:00Z",
      "title": "DMPHub example DMP project."
    }
  ],
  "title": "Example DMP record for the DMPHub."
}
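
Each entry in the dmphub_modifications array of the sample above carries a status. As a rough illustration of how such entries might be consumed, the following Python sketch collects the related identifiers from entries whose status is "accepted" and that are not already present on the DMP. The function name accepted_related_identifiers and the de-duplication by identifier are assumptions for illustration, not the DMPHub's actual logic.

```python
def accepted_related_identifiers(dmp: dict) -> list:
    """Collect related identifiers from modifications with status 'accepted'.

    Identifiers already listed at the top level of the DMP are skipped
    (assumption: the identifier string alone decides whether two entries
    are the same work).
    """
    seen = {
        rel.get("identifier")
        for rel in dmp.get("dmproadmap_related_identifiers", [])
    }
    accepted = []
    for mod in dmp.get("dmphub_modifications", []):
        if mod.get("status") != "accepted":
            continue  # pending (or any other status) entries are left alone
        for rel in mod.get("dmproadmap_related_identifiers", []):
            identifier = rel.get("identifier")
            if identifier not in seen:
                accepted.append(rel)
                seen.add(identifier)
    return accepted
```

Applied to the sample item, only the related identifier from the modification with id ZYXW9878 would be returned, because the other modification is still pending.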

Explanation of the DMP item attributes that are used internally by the DMPHub and are NOT distributed to API callers. For a full explanation of the other DMP attributes, please see the API documentation in the wiki:

  • PK - (required) A unique partition key for the DMP which equates to its DMP ID (DOI). Note that it must start with DMP#
  • SK - (required) The sort key which represents the DMP version. The most current version is always VERSION#latest and prior versions use a date time stamp in UTC (e.g. VERSION#2022-10-03T09:15:32+00:00)
  • dmphub_created_at - The date time stamp (UTC) of when the original version of the DMP was added to the DMPHub. This value remains the same regardless of the version (e.g. 2022-10-03T09:15:32+00:00).
  • dmphub_modification_day - The date of the version (UTC) (e.g. 2022-11-29) which is used to facilitate querying and sorting.
  • dmphub_modifications - Related works and grant information acquired from an external system (not the owner of the record or the system of provenance). These items are added to the main DMP ID record once an admin or the owner approves/confirms them in the DMPTool UI.
  • dmphub_provenance_id - The PK of the Provenance system that created the DMP (e.g. PROVENANCE#example)
  • dmphub_provenance_identifier - The Provenance system's internal identifier for the DMP. This is used in conjunction with the Provenance system's redirectUri to send updates to the Provenance system (e.g. https://example.com/dmps/989898 or 989898)
  • dmphub_updated_at - The date time stamp (UTC) that this version of the DMP was added to the DMPHub. This value is used to create the official version SK the next time an update is made to the DMP.
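
The key conventions above can be illustrated with a short Python sketch. It assumes that the PK is the DMP ID with the URL scheme removed (as in the sample item) and that a prior version's SK is the dmphub_updated_at value normalized to UTC. The helper names dmp_pk and version_sk are hypothetical.

```python
from datetime import datetime, timezone

LATEST_SK = "VERSION#latest"  # sort key of the most current version


def dmp_pk(dmp_id_url: str) -> str:
    """Build the partition key from a DMP ID such as https://doi.org/10.12345/ABC."""
    without_scheme = dmp_id_url.split("://", 1)[-1]
    return f"DMP#{without_scheme}"


def version_sk(dmphub_updated_at: str) -> str:
    """Build the sort key for a prior version from its dmphub_updated_at value.

    Assumption: the timestamp is ISO 8601 with an explicit UTC offset,
    as in the sample item.
    """
    stamp = datetime.fromisoformat(dmphub_updated_at).astimezone(timezone.utc)
    return "VERSION#" + stamp.isoformat(timespec="seconds")


if __name__ == "__main__":
    print(dmp_pk("https://doi.org/10.12345/A1.1A2B3C4D6"))
    print(version_sk("2022-11-29T19:49:08+00:00"))
```

Under these assumptions, an update would copy the current VERSION#latest item to an item whose SK is version_sk(dmphub_updated_at) before writing the new latest version.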

Sample DMP metadata amendments from another system

When a DMP is created, a dmphub_provenance_id is recorded in the DMP JSON. This identifies the system of provenance. When another system appends metadata to the DMP, its provenance is recorded on the appended metadata as well. These provenance markers help prevent systems from overwriting one another's changes and also help determine when a new version should be created. In the following example, the funding entry was added by PROVENANCE#funder123.

{
  "dmp": {
    "PK": "DMP#doi.org/10.12345/A1.1A2B3C4D6",
    "SK": "VERSION#latest",
    "dmphub_provenance_id": "PROVENANCE#example",
    "title": "Example complete DMP",
    "dmp_id": {
      "type": "doi",
      "identifier": "https://doi.org/10.12345/A1.1A2B3C4D6"
    },
    "project": [
      {
        "title": "Example research project",
        "funding": [
          {
            "name": "National Funding Organization",
            "funder_id": {
              "type": "fundref",
              "identifier": "http://dx.doi.org/10.13039/100005595"
            },
            "funding_status": "granted",
            "grant_id": {
              "type": "url",
              "identifier": "https://awards.example.fund/1213424"
            },
            "dmphub_provenance_id": "PROVENANCE#funder123"
          }
        ]
      }
    ]
  }
}
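
The provenance markers in the example above could be inspected with a sketch like the following. It is illustrative only: the function name foreign_amendments is hypothetical, and it only looks at funding entries, whereas other parts of a DMP may carry markers too.

```python
def foreign_amendments(dmp: dict) -> list:
    """List funding entries that were appended by a system other than the owner.

    Returns (provenance_id, funding_entry) pairs for every funding entry
    whose dmphub_provenance_id differs from the DMP's own.
    """
    owner = dmp.get("dmphub_provenance_id")
    found = []
    for project in dmp.get("project", []):
        for funding in project.get("funding", []):
            marker = funding.get("dmphub_provenance_id")
            # Entries without a marker are treated as belonging to the owner.
            if marker is not None and marker != owner:
                found.append((marker, funding))
    return found
```

Called with the object stored under the "dmp" key of the example, this returns a single pair whose first element is PROVENANCE#funder123. A system of provenance could use such a list to avoid overwriting entries it did not create.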