The SDX-Deliver service is responsible for ensuring that all SDX outputs are delivered to ONS via NIFI. It does this by encrypting the data, storing it in a GCP bucket, and then notifying NIFI of the data's location. Metadata is provided for routing and validation once the data is decrypted downstream.
SDX-Deliver is a Flask application made up of five endpoints. When a request is made to the service, metadata
is extracted and the data is stored in a Google Cloud Storage bucket. The metadata is then used to
construct a PubSub message that is published to dap-topic. This notifies DAP that a new submission is in the bucket.
SEFT submissions are already encrypted when they come through SDX and therefore require no additional encryption before being stored.
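The flow described above can be sketched roughly as follows. This is an illustrative outline only, not the service's actual code: the `deliver` function, the `fake_encrypt` placeholder, the in-memory `bucket` dict and the `published` list are all hypothetical stand-ins for GPG encryption, the GCP bucket and the PubSub publisher. The key layout is an assumption modelled on the `dap|<tx_id>` value shown in the example message.

```python
from typing import Callable, Dict, List


def deliver(
    output_type: str,
    tx_id: str,
    data: bytes,
    encrypt: Callable[[bytes], bytes],
    bucket: Dict[str, bytes],
    published: List[Dict[str, str]],
) -> str:
    """Store one submission and record a notification for it (hypothetical helper)."""
    # SEFT submissions arrive already encrypted, so they are stored as-is.
    stored = data if output_type == "seft" else encrypt(data)

    # Assumed key layout, modelled on the "dap|<tx_id>" example value.
    key = f"{output_type}|{tx_id}"
    bucket[key] = stored

    # Stand-in for publishing a message to dap-topic.
    published.append({"gcs.key": key, "tx_id": tx_id})
    return key


def fake_encrypt(data: bytes) -> bytes:
    """Placeholder for GPG encryption. This is NOT real cryptography."""
    return b"encrypted:" + data
```

Passing the encryption step in as a callable keeps the SEFT pass-through decision in one place and makes the flow easy to exercise without GCP credentials.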
Install pipenv:
$ pip install pipenv
Create a virtualenv and install dependencies:
$ make build
Testing: Install all test requirements and run tests:
$ make test
Running: ensure you have installed all requirements with the make build command above, then:
$ make start
Once a submission has been successfully encrypted and stored in the bucket, a message is published to dap-topic.
Message Structure Example:
dap_message: Message {
data: b'{"version": "1", "files": [{"name": "087bfc03-8698...'
ordering_key: ''
attributes: {
"gcs.bucket": "ons-sdx-sandbox-outputs",
"gcs.key": "dap|087bfc03-8698-4137-a3ac-7a596b9beb2b",
"tx_id": "087bfc03-8698-4137-a3ac-7a596b9beb2b"
}
}
Message Data field Example:
data : {
'version': '1',
'files': [{
'name': '4f1c130a-0681-442f-8195-b5fa6c57e469:ftp',
'sizeBytes': 121144,
'md5sum': 'be08e1e407c79507a17d1e6dcdada055'
}],
'sensitivity': 'High',
'sourceName': 'ons-sdx-sandbox',
'manifestCreated': '2021-06-16T07:47:45.481Z',
'description': '009 survey response for period 1704 sample unit 49900108249D',
'dataset': '009',
'schemaversion': '1',
'iterationL1': '1704'
}
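As a rough illustration of how a message with this shape could be assembled, the sketch below builds the data payload and the attributes from a file's bytes. The function name `build_dap_message` and its parameters are hypothetical. It is also an assumption that `md5sum` and `sizeBytes` are computed over the stored file contents; the source does not say which bytes they describe.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def build_dap_message(
    tx_id: str,
    file_name: str,
    file_bytes: bytes,
    bucket_name: str,
    source_name: str,
    dataset: str,
    period: str,
    description: str,
    created: Optional[datetime] = None,
) -> Tuple[bytes, Dict[str, str]]:
    """Return (data, attributes) shaped like the examples above (hypothetical helper)."""
    if created is None:
        created = datetime.now(timezone.utc)
    # Millisecond-precision UTC timestamp ending in "Z", as in the example.
    stamp = created.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    manifest_created = stamp.replace("+00:00", "Z")

    data = {
        "version": "1",
        "files": [{
            "name": file_name,
            "sizeBytes": len(file_bytes),  # assumed: size of the stored bytes
            "md5sum": hashlib.md5(file_bytes).hexdigest(),  # assumed: hash of the stored bytes
        }],
        "sensitivity": "High",
        "sourceName": source_name,
        "manifestCreated": manifest_created,
        "description": description,
        "dataset": dataset,
        "schemaversion": "1",
        "iterationL1": period,
    }
    attributes = {
        "gcs.bucket": bucket_name,
        "gcs.key": f"dap|{tx_id}",
        "tx_id": tx_id,
    }
    # PubSub message data is sent as bytes.
    return json.dumps(data).encode("utf-8"), attributes
```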
All submissions are stored within ons-sdx-{project_id}-outputs in their respective folders. The file path is
specified in attributes."gcs.key".
The GPG key used to encrypt JSON surveys, dap-public-gpg, is managed by Google Secret Manager. It is fetched with
a single API call on program startup and stored in ENCRYPTION_KEY.
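The fetch-once-at-startup pattern can be sketched as below. This is only an illustration: `make_key_loader` and the `fetch_secret` callable are hypothetical names, and the real service would call Google Secret Manager where the sketch uses a fake function.

```python
from functools import lru_cache
from typing import Callable, List


def make_key_loader(fetch_secret: Callable[[str], str]) -> Callable[[], str]:
    """Wrap a secret-fetching function so the key is only retrieved once."""

    @lru_cache(maxsize=1)
    def load_encryption_key() -> str:
        # "dap-public-gpg" is the secret name given in this README.
        return fetch_secret("dap-public-gpg")

    return load_encryption_key


# Fake fetcher standing in for the Secret Manager call (assumption).
fetch_calls: List[str] = []


def fake_fetch_secret(name: str) -> str:
    fetch_calls.append(name)
    return "FAKE-PUBLIC-KEY"


load_key = make_key_loader(fake_fetch_secret)
ENCRYPTION_KEY = load_key()  # fetched once at startup
```

Later calls to `load_key()` return the cached value, so the secret store is not contacted again for each submission.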
Allows Survey, SEFT and Collate to send data to be stored by deliver:
- POST /deliver/dap - Stores JSON surveys destined for DAP
- POST /deliver/legacy - Stores JSON surveys destined for Legacy downstream
- POST /deliver/feedback - Stores JSON Feedback submissions
- POST /deliver/comments - Stores zipped spreadsheet (.xls) of comments
- POST /deliver/seft - Stores SEFT submissions
deliver runs within the Kubernetes cluster and uses a Kubernetes service. This assigns the service an IP
address and DNS name, exposing it to the other services.
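For illustration, a calling service could build the URL for one of the endpoints listed above from the Kubernetes service DNS name. The endpoint paths come from this README, but the service name `sdx-deliver`, the `default` namespace and port `80` are assumptions made for this sketch. The `<service>.<namespace>.svc.cluster.local` form is the standard Kubernetes cluster DNS pattern.

```python
# Endpoint paths as listed in this README, keyed by output type.
ENDPOINTS = {
    "dap": "/deliver/dap",
    "legacy": "/deliver/legacy",
    "feedback": "/deliver/feedback",
    "comments": "/deliver/comments",
    "seft": "/deliver/seft",
}


def deliver_url(
    output_type: str,
    service: str = "sdx-deliver",  # assumed service name
    namespace: str = "default",    # assumed namespace
    port: int = 80,                # assumed port
) -> str:
    """Return the in-cluster URL for the given output type (hypothetical helper)."""
    try:
        path = ENDPOINTS[output_type]
    except KeyError:
        raise ValueError(f"unknown output type: {output_type!r}") from None
    return f"http://{service}.{namespace}.svc.cluster.local:{port}{path}"
```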
Environment Variable | Description |
---|---|
PROJECT_ID | Name of project |
BUCKET_NAME | Name of the bucket: {project_id}-outputs |
BUCKET | Bucket client to GCP |
DAP_TOPIC_PATH | Name of the dap topic: dap-topic |
DAP_PUBLISHER | PubSub publisher client to GCP |
ENCRYPTION_KEY | Key to encrypt all data |
GPG | System GPG key import |
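As a sketch of how the string-valued settings in the table relate to each other, the snippet below derives the bucket name and topic path from PROJECT_ID. The `load_config` function is hypothetical, and the client objects (BUCKET, DAP_PUBLISHER) are left out because they need GCP credentials. The `projects/<project>/topics/<topic>` form is the standard fully qualified PubSub topic path; whether DAP_TOPIC_PATH holds that full path or just the topic name is an assumption here.

```python
import os
from typing import Dict, Mapping


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Derive the string settings from PROJECT_ID (hypothetical helper)."""
    project_id = env["PROJECT_ID"]
    return {
        "PROJECT_ID": project_id,
        # Bucket naming follows the {project_id}-outputs pattern from the table.
        "BUCKET_NAME": f"{project_id}-outputs",
        # Assumed to be the fully qualified PubSub topic path for dap-topic.
        "DAP_TOPIC_PATH": f"projects/{project_id}/topics/dap-topic",
    }
```

For example, `load_config({"PROJECT_ID": "ons-sdx-sandbox"})` yields the bucket name `ons-sdx-sandbox-outputs`, matching the `gcs.bucket` value in the example message.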
Copyright © 2016, Office for National Statistics (https://www.ons.gov.uk)
Released under MIT license, see LICENSE for details.