Add Parse JSON Notices Section #2669

Open
wants to merge 11 commits into
base: main
from
105 changes: 104 additions & 1 deletion app/routes/docs.client.samples.md
@@ -12,7 +12,7 @@ and some samples from the FAQs section of the [gcn-kafka-python](https://github.

To contribute your own ideas, make a GitHub pull request to add it to [the Markdown source for this document](https://github.com/nasa-gcn/gcn.nasa.gov/blob/CodeSamples/app/routes/docs.client.samples.md), or [contact us](/contact).

## Parsing
## Parsing XML

Within your consumer loop, use the following functions to convert the
content of `message.value()` into other data types.
@@ -164,3 +164,106 @@ for message in consumer.consume(end[0].offset - start[0].offset, timeout=1):
continue
print(message.value())
```

## Parsing JSON

GCN Notices for new missions are typically distributed in JSON format. This guide explains how to programmatically read JSON-formatted notices.

Start by subscribing to a Kafka topic and reading the message value:

```python
from gcn_kafka import Consumer
import json

# Connect as a Kafka consumer
consumer = Consumer(
    client_id='fill me in',  # Replace with your client ID
    client_secret='fill me in',  # Replace with your client secret
    config={"message.max.bytes": 204194304},
)

# Subscribe to Kafka topic
consumer.subscribe(['gcn.circulars'])

# Poll for messages (each call to consume() waits up to 1 second)
for message in consumer.consume(timeout=1):
    if message.error():
        print(message.error())
        continue

    # Print the topic and message offset
    print(f"topic={message.topic()}, offset={message.offset()}")

    # Kafka message value: bytes containing the JSON document
    value = message.value()
```
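
Because `message.value()` returns raw bytes, it can help to wrap the parsing step in a small helper that tolerates malformed payloads. The following is a minimal sketch using only the standard library; the function name `parse_notice` and the sample payload are hypothetical and not part of `gcn_kafka`.

```python
import json


def parse_notice(value):
    """Decode a Kafka message value (bytes) into a dict, or return None.

    Hypothetical helper for illustration; not part of gcn_kafka.
    """
    try:
        parsed = json.loads(value.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Payload is not valid UTF-8 or not valid JSON
        return None
    # Notices are expected to be JSON objects; reject bare arrays or scalars
    return parsed if isinstance(parsed, dict) else None


# Example with a made-up payload
notice = parse_notice(b'{"event": {"name": "example"}}')
print(notice["event"]["name"])
```

Returning `None` instead of raising lets a consumer loop skip a bad message and keep going; raising an exception would be an equally reasonable choice if malformed notices should stop the pipeline.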

## Decoding Embedded Data

The following code demonstrates how to process a Kafka message containing a `base64`-encoded sky map, decode it, and save it as a `.fits` file. Python's built-in [`base64`](https://docs.python.org/3/library/base64.html#base64.b64encode) module provides the `b64decode` function to make this task simple. JSON is serialized as Unicode text, so binary data such as a FITS file is embedded as a Base64 string to ensure it survives serialization intact.

```python
import base64

# Convert the Kafka message value to a string
value_str = value.decode("utf-8")

# Parse the JSON data
value_json = json.loads(value_str)

# Extract the Base64-encoded skymap
skymap_string = value_json["event"]["skymap"]

# Decode the Base64 string
decoded_bytes = base64.b64decode(skymap_string)

# Save the decoded data as a FITS file
with open("skymap.fits", "wb") as fits_file:
    fits_file.write(decoded_bytes)
```

To include a FITS file in a Notice, add a property to your schema definition in the following format:

```json
{
  "type": "string",
  "contentEncoding": "base64",
  "contentMediaType": "image/fits"
}
```
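
On the consumer side, a property declared this way can be checked before it is written to disk. The sketch below uses only the standard library: it decodes the string strictly and checks that the result begins with the `SIMPLE` keyword that opens a standard FITS primary header. The helper name `decode_fits_property` is hypothetical, and the check assumes the embedded file is an uncompressed FITS file.

```python
import base64
import binascii


def decode_fits_property(encoded):
    """Strictly decode a Base64 string and check that it looks like FITS.

    Hypothetical helper for illustration. Raises ValueError on bad input.
    """
    try:
        # validate=True rejects characters outside the Base64 alphabet
        raw = base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid Base64: {exc}") from exc
    # Assumption: an uncompressed FITS file, which begins with "SIMPLE"
    if not raw.startswith(b"SIMPLE"):
        raise ValueError("decoded data does not start with a FITS header")
    return raw
```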

## Encoding Embedded Data

In your data production pipeline, you can use the following steps to convert your file to a Base64 string. This example demonstrates how to encode a file (e.g., `skymap.fits`) into a `base64`-encoded string and send it with a Kafka producer.

```python
from gcn_kafka import Producer
import base64

# Set Kafka Topic and Producer Configuration
TOPIC = "gcn.circulars"

This doesn't seem like the right topic.


producer = Producer(
    client_id='fill me in',  # Replace with your client ID
    client_secret='fill me in',  # Replace with your client secret
    config={"message.max.bytes": 204194304},
)

data = {
# ..
}

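# Encode the file content in Base64; decode to str so it is JSON-serializable
with open("skymap.fits", "rb") as file:
    data["skymap"] = base64.b64encode(file.read()).decode("ascii")

# Serialize the dictionary as JSON and convert it into bytes
# (str(data) would produce a Python repr, which is not valid JSON)
import json  # normally placed with the other imports at the top

data_string = json.dumps(data).encode("utf-8")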

producer.produce(
TOPIC,
data_string,
)

producer.flush()
```
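
Before connecting a producer, the encoding can be checked offline with a round trip. This minimal sketch uses placeholder bytes in place of a real FITS file and a top-level `skymap` key as in the encoding example; both are assumptions for illustration, and no Kafka connection is involved.

```python
import base64
import json

# Placeholder bytes standing in for the contents of skymap.fits
original = b"SIMPLE  =                    T"

# Producer side: Base64-encode, then serialize the notice as JSON bytes
data = {"skymap": base64.b64encode(original).decode("ascii")}
payload = json.dumps(data).encode("utf-8")

# Consumer side: parse the JSON and decode the Base64 string
received = json.loads(payload.decode("utf-8"))
restored = base64.b64decode(received["skymap"])

# True when the decoded bytes match what was encoded
print(restored == original)
```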

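See [non-JSON data](https://json-schema.org/understanding-json-schema/reference/non_json_data.html) in the JSON Schema documentation for more information on describing encoded content with `contentEncoding` and `contentMediaType`.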

more information about what?
