
Add data provider #69

Open · wants to merge 24 commits into main
Conversation

@harryrackmil (Contributor) commented Jan 10, 2025

Change

Add a framework for reporting data from arbitrary sources to the publisher agent

Testing

Ran this and a publisher agent together via docker-compose and confirmed that the prices looked reasonable and that the publisher agent was able to parse and sign them.


func init() {
	DataProviderCmd.Flags().StringP(ConfigFilePathFlag, "c", "", "the path of your config json file")
	DataProviderCmd.Flags().StringP(WebsocketUrl, "w", "", "the websocket url to write updates to")
Contributor Author:

This could be configured in the config json if we wanted, but it felt like more of a runtime configuration (I might want to test without a websocket url at first to just look at the prices).

@akawalsky (Contributor) commented Jan 13, 2025:

What do you think of making this a bit more generic - something like -o where the output could eventually take the form of

  1. ws interface
  2. http interface
  3. a file buffer interface

which could be denoted by ws(s)://, http(s)://, file://, s3:// etc

Only ws should need to be supported right now though

Contributor Author:

makes sense - added a Writer interface and a GetWriter function where we can do some branching in the future. Currently it will fail if the --output-address isn't either blank or prefixed with ws://
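
For illustration, the branching in GetWriter could look something like the sketch below; the Writer method set and the NewWebsocketWriter constructor are assumptions here, not necessarily what the PR uses:

```go
import (
	"fmt"
	"strings"
)

// Writer is a stand-in for the interface described above; the real method
// set in the PR may differ.
type Writer interface {
	Run(updatesCh chan types.DataSourceUpdateMap)
}

// GetWriter picks an output implementation from the --output-address value:
// blank means "no writer, just log updates", ws:// selects the websocket
// writer, and anything else is rejected for now. http://, file://, s3://
// and so on can be added as new cases later.
func GetWriter(outputAddress string) (Writer, error) {
	switch {
	case outputAddress == "":
		return nil, nil // no writer configured; updates are only logged
	case strings.HasPrefix(outputAddress, "ws://"):
		return NewWebsocketWriter(outputAddress), nil
	default:
		return nil, fmt.Errorf("unsupported --output-address: %q", outputAddress)
	}
}
```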

}

func buildDataSources(config DataProviderConfig) []dataSource {
	// group by data source id to support batched feeds
Contributor Author:

overkill for our current sources (right now every dataSource is 1:1 with a value) but we might want to support batching of feeds in the future
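
As a sketch of the grouping (the field names on the config structs here are assumed, not taken from the diff):

```go
// buildDataSources buckets per-feed configs by data source id and hands each
// bucket to that source's builder, so one dataSource could eventually serve
// a whole batch of feeds.
func buildDataSources(config DataProviderConfig) []dataSource {
	grouped := make(map[DataSourceId][]DataProviderSourceConfig)
	for _, sourceConfig := range config.Sources {
		grouped[sourceConfig.DataSourceId] = append(grouped[sourceConfig.DataSourceId], sourceConfig)
	}

	var dataSources []dataSource
	for dataSourceId, sourceConfigs := range grouped {
		dataSources = append(dataSources, GetDataSourceBuilder(dataSourceId)(sourceConfigs)...)
	}
	return dataSources
}
```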


func GetDataSourceBuilder(dataSourceId DataSourceId) func([]DataProviderSourceConfig) []dataSource {
	switch dataSourceId {
	case UniswapV2DataSourceId:
Contributor Author:

This switch statement is the only piece of shared code that needs to change when we write a new integration

Contributor Author:

a simpler example than uniswap. Still using the scheduled data source concept

		go dataSource.Run(r.updatesCh)
	}

	r.writer.Run(r.updatesCh)
Contributor Author:

kick off all the data sources in goroutines and kick off a writer thread

Contributor Author:

a helper to allow pulling on a regular cadence

	valueUpdate := ValueUpdate{
		PublishTimestamp: update.Timestamp.UnixNano(),
		ValueId:          update.ValueId,
		Value:            fmt.Sprintf(`%.18f`, update.Value),
Contributor Author:

convert to string so we don't report very small prices using scientific notation
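
For example, Go's default float formatting falls back to scientific notation for very small values, while %.18f keeps a fixed-point string:

```go
package main

import "fmt"

func main() {
	price := 0.000000000123

	fmt.Printf("%v\n", price)    // 1.23e-10
	fmt.Printf("%.18f\n", price) // 0.000000000123000000
}
```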


	w.logger.Debug().Msgf("Update: %s", string(wsMessageBytes))

	if conn != nil {
Contributor Author:

a user may not configure any websocket url when developing - they can just run the app in verbose mode to see the updates logged

Contributor Author:

not sure where we want this to live. Wanted it outside of the lib/data_provider directory since we're passing a config file to the docker container, not baking it into the docker container

@akawalsky (Contributor):

I'm not seeing an entry point and the docker-compose isn't working locally. How have you been running this?

@@ -0,0 +1,31 @@
package random
Contributor Author:

this factory file is basically all boilerplate - it looks identical to the uniswap_v2/factory.go

}

func (f *randomDataSourceFactory) GetSchema() (*gojsonschema.Schema, error) {
	return utils.LoadSchema("resources/config_schema.json", resourcesFS)
Contributor Author:

GetSchema is really a function of the data source class, not a specific instance of the data source, so I'm making it a function on the factory object. Might be worth renaming the factory interface to DataSourceType or DataSourceClass or something since it's doing more than just building data sources now.
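
Put differently, the factory is growing into something like the interface below (a sketch only; the interface name and the builder method name here are placeholders, not the ones in the diff):

```go
// A "data source class" groups the things that are per-source-type rather
// than per-instance: its config schema and the ability to build instances.
type DataSourceFactory interface {
	GetSchema() (*gojsonschema.Schema, error)
	BuildDataSources(sourceConfigs []DataProviderSourceConfig) []dataSource
}
```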

func NewScheduler(
	updateFrequency time.Duration,
	getUpdate func() (types.DataSourceUpdateMap, error),
	handleErr func(error),
Contributor Author:

rather than passing a logger object to the scheduler, we pass a handleErr function. In practice I expect most sources will call this with the GetErrorLogHandler() function provided in this file, which just logs at a configurable level. Might make more sense to just pass a logger to the scheduler instead.
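
Roughly, the scheduler loop this describes would look like the sketch below (the struct fields mirror the constructor arguments above; the actual implementation may differ):

```go
// Run calls getUpdate once per tick and forwards updates to the channel;
// errors are delegated to handleErr rather than logged by the scheduler.
func (s *Scheduler) Run(updatesCh chan types.DataSourceUpdateMap) {
	ticker := time.NewTicker(s.updateFrequency)
	defer ticker.Stop()

	for range ticker.C {
		updates, err := s.getUpdate()
		if err != nil {
			s.handleErr(err)
			continue
		}
		updatesCh <- updates
	}
}
```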

@@ -0,0 +1,12 @@
package uniswap_v2
Contributor Author:

this file is just the config object, in case we wanted to autogenerate from json schema (or generate json schema from this)

Contributor:

Would it also be worth having a config_test to confirm that the config is deserialized as expected?

@@ -0,0 +1,153 @@
package uniswap_v2
Contributor Author:

the data_source file has minimal boilerplate - it's just the DataSource object, which only needs to implement RunDataSource, and a constructor for the DataSource object.

@@ -0,0 +1,51 @@
package types
Contributor Author:

making types its own package so every other package is able to depend on it

	baseRetryDelay = 500 * time.Millisecond
)

func GetEthereumContract(
Contributor Author:

pulled some of the reusable eth contract stuff out of the uniswap integrations since it's a lot of code and will probably be used in other integrations - might make sense to put it in the utils directory. Leaving in sources for now because it will only be used by data sources

@@ -0,0 +1,25 @@
import os
Contributor Author:

Generate a file which will explicitly force importing each individual data source package. Using python since it was a little less verbose than the go version of this script, but happy to convert to go if we'd like.

We sort the package names here so it generates the same file consistently. We can add a github action which runs this script, checks whether the result differs from the currently committed imports file, and blocks merging if they disagree.
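
The generated file would presumably be a Go file of blank imports along these lines (illustrative only; the real package name, file name, and import paths come from the script):

```go
// Code generated by update_shared_data_provider_code.py. DO NOT EDIT.
package sources

import (
	// Blank imports force each source package to be linked in and its
	// package-level initialization (data source id, schema, etc.) to run.
	_ "github.com/Stork-Oracle/stork-external/apps/lib/data_provider/sources/random"
	_ "github.com/Stork-Oracle/stork-external/apps/lib/data_provider/sources/uniswap_v2"
)
```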

Contributor:

Is there a world where this also adds to the top-level json schema config to support a oneOf of all the included resources using a schema pointer? If so, we could eliminate the awkwardness of having to load each config individually.

Contributor Author:

The schema pointer $ref concept doesn't work so nicely since the config schemas are spread across different embed.FS instances in different packages - we'd need to write some custom schema loader code to use pointers, meaning the schema file couldn't really be used directly since people would need to use our custom loader.

Instead, I'm using the codegen script to generate a big flat config schema file which contains all the individual config schemas. We would have needed to do some codegen for the pointer approach anyway in order to update each pointer when we add new sources - this way we have a single interpretable config schema which doesn't require any custom loading code.

}
}

func (w *WebsocketWriter) runWriteLoop(updateCh chan types.DataSourceUpdateMap) error {
Contributor:

We should prob build in reconnect logic for the case where the publisher agent is restarted

Contributor Author:

This is handled already I believe - if we ever get an error when writing we'll stop the loop and return an error. This function is called inside a for loop in the Run function above so it should wait 5 seconds and then reconnect.
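
i.e. something along these lines in Run (a sketch of the behavior being described, not the exact code in the PR):

```go
func (w *WebsocketWriter) Run(updateCh chan types.DataSourceUpdateMap) {
	for {
		// runWriteLoop returns an error whenever a write fails (e.g. the
		// publisher agent restarted), so we log, wait, and reconnect.
		if err := w.runWriteLoop(updateCh); err != nil {
			w.logger.Error().Err(err).Msg("write loop failed, reconnecting in 5s")
		}
		time.Sleep(5 * time.Second)
	}
}
```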

@akawalsky (Contributor) left a comment:

This is looking awesome, just a few more comments/questions

@@ -49,6 +49,29 @@ jobs:
env:
TARGETPLATFORM: linux/amd64

data-provider-codegen-check:
Contributor Author:

run the codegen script and fail if there's any git diff. Tested this by making a small change in one of the individual config_schema.json files, not running the codegen script, and confirming the github action fails.

@@ -0,0 +1,138 @@
{
Contributor Author:

auto-generated by the update_shared_data_provider_code.py script

@@ -0,0 +1,38 @@
{
Contributor Author:

template file for top level config_schema generation

"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/utils"
)

var RandomDataSourceId types.DataSourceId = types.DataSourceId(utils.GetCurrentDirName())
Contributor Author:

data source id should be the same as the directory name (makes config schema generation easier)
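
A GetCurrentDirName-style helper could be implemented roughly like this (a sketch, not necessarily the actual utils implementation):

```go
package utils

import (
	"path/filepath"
	"runtime"
)

// GetCurrentDirName returns the directory name of the file that called it,
// so a data source package can use its own directory name as its id.
func GetCurrentDirName() string {
	_, callerFile, _, ok := runtime.Caller(1)
	if !ok {
		return ""
	}
	return filepath.Base(filepath.Dir(callerFile))
}
```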

source_config_schemas = []
data_source_ids = []

conditional_source_config_schema_template_str = """
Contributor Author:

we'll use this if/then to make sure that we're checking the appropriate schema for this data source - if we just check that the config matches any defined schema we'll wind up being as permissive as the most permissive schema.

}

with open(CONFIG_SCHEMA_TEMPLATE_FILE, 'r') as template_file:
    template_json = template_file.read()
Contributor Author:

after wrapping each config schema in an if/then and concatenating them all together, fill in the config_schema.json.template file

assert.NoError(t, err)

configStr := `
{
@harryrackmil (Contributor Author) commented Jan 14, 2025:

note that we're not testing the top-level config here (that's tested at the top level); we're specifically testing the json within the config tag, which applies only to this source

@harryrackmil (Contributor Author):

> I'm not seeing an entry point and the docker-compose isn't working locally. How have you been running this?

I committed the changes I'd made locally to the docker-compose so now you can run it from docker-compose:

docker-compose up --build data-provider


To add a new source:
1. Add a [package](../lib/data_provider/sources/random) in the [sources directory](../lib/data_provider/sources) with your data source's name
1. Run `python3 ./apps/scripts/update_shared_data_provider_code.py` to generate some framework code so that the framework is aware of your new source.
Contributor:

Numbers here seem to have gotten messed up somehow

Contributor Author:

they're intentionally all 1 so that markdown interprets them as ordered numbers (might be easier to review the markdown preview for this file). This way we can reorder the steps or add/remove steps without needing to update every number

Contributor:

Oh interesting. Is that a common practice? Not sure I've seen that before
