Add data provider #69

Merged · 55 commits · Jan 31, 2025
9f4ffab
add polling uniswap and random number source
Jan 10, 2025
3cbe5cf
add fake alchemy api key
Jan 10, 2025
9ec15ef
update comment
Jan 10, 2025
4109b19
cleanup
Jan 10, 2025
6427494
remove sample configs
Jan 10, 2025
a43db35
respond to comments
Jan 11, 2025
da60e79
rename init -> factory
Jan 11, 2025
1c0526d
source packages have basically the same file names
Jan 11, 2025
6d3b2fc
respond to comments
Jan 14, 2025
98df4ed
add github action to check codegen
Jan 14, 2025
16f32c5
bump python version
Jan 14, 2025
5ae8797
test codegen check
Jan 14, 2025
ca42582
Revert "test codegen check"
Jan 14, 2025
a0fc5d1
simplify docker compose
Jan 14, 2025
428d118
use references in config schema
Jan 14, 2025
ee266d0
include configs repo
Jan 14, 2025
6da3817
fix tests
Jan 14, 2025
e1ace31
include data source id in each config object
Jan 14, 2025
5363606
cleanup, enforce uniqueness of value ids at runtime
Jan 15, 2025
9db1452
uniswap_v2 -> uniswapv2
Jan 15, 2025
7ffa341
add docs for writing a new data source, configuring the data provider…
Jan 15, 2025
6b87f30
bump version
Jan 15, 2025
7821db2
add to the apps readme
Jan 15, 2025
243b8e8
add dataSource as required field in schema
Jan 16, 2025
40e0753
fix valid output url
akawalsky Jan 21, 2025
bb45b4e
CLI Codegen for Data Provider Sources
ACK101101 Jan 22, 2025
2e69c5e
refactor data provider code generation
ACK101101 Jan 22, 2025
9b4677e
added updated command and animation command
ACK101101 Jan 22, 2025
e2a32bd
set animation to run at startup
ACK101101 Jan 22, 2025
a80b409
ci fix? make command to install cli and animation only for certain co…
ACK101101 Jan 23, 2025
867a8fa
move cli commands before rust to prevent github CI error
ACK101101 Jan 23, 2025
bdc9f22
reverted makefile change
ACK101101 Jan 23, 2025
76eb6ce
ran update
ACK101101 Jan 23, 2025
3d89e56
updated autogen comment, fixed pascal to camel conversion for acronym…
ACK101101 Jan 24, 2025
c3e8fe7
removed animation code and frames, stored in branch alexander/stork-d…
ACK101101 Jan 27, 2025
a6a08bb
fixed capitalization in data_source template
ACK101101 Jan 27, 2025
e66472b
Add RaydiumCLMM as Data Source Using CLI Tool Via Helius
ACK101101 Jan 28, 2025
45c3808
cleanup commented out lines
ACK101101 Jan 28, 2025
e2ec424
Merge pull request #82 from Stork-Oracle/alexander/sto-669-add-raydiu…
ACK101101 Jan 29, 2025
b59433c
wrapped start cli command with new make command
ACK101101 Jan 29, 2025
03ba118
Merge pull request #78 from Stork-Oracle/alexander/sto-646-codegen-fo…
ACK101101 Jan 29, 2025
810ac35
QOL Improvements to Data Provider
ACK101101 Jan 29, 2025
c23be06
separate generate from data_provider
akawalsky Jan 29, 2025
9ad0f41
fix generate command
akawalsky Jan 29, 2025
1639c40
remove unused flag
akawalsky Jan 29, 2025
a2f70db
added animation, fixed make targets, and ci paths
ACK101101 Jan 29, 2025
5220784
another fix for ci
ACK101101 Jan 29, 2025
d48814a
merge with sto-691-qol-improvements-to-data-provider
ACK101101 Jan 29, 2025
dedb935
comment out remove for now
ACK101101 Jan 29, 2025
e105b0f
Merge pull request #85 from Stork-Oracle/separate-generate
ACK101101 Jan 29, 2025
155d315
use different provider urls so api keys are no longer required, added…
ACK101101 Jan 30, 2025
dd2db49
fixed config_test and make ci names consistent
ACK101101 Jan 30, 2025
1c45644
fixed typo and shortened readme
ACK101101 Jan 30, 2025
77a2473
Merge pull request #84 from Stork-Oracle/alexander/sto-691-qol-improv…
ACK101101 Jan 30, 2025
9731058
Add data provider integration test (#79)
harryrackmil Jan 31, 2025
23 changes: 23 additions & 0 deletions .github/workflows/ci.yml
@@ -49,6 +49,29 @@ jobs:
env:
TARGETPLATFORM: linux/amd64

data-provider-codegen-check:
Contributor Author:
Run the codegen script and fail the build if there's any git diff. I tested this by making a small change in one of the individual config_schema.json files, not running the codegen script, and confirming the GitHub action fails.

  runs-on: ubuntu-latest

  steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: 3.10.16

    - name: Run code generation script
      run: |
        python ./apps/scripts/update_shared_data_provider_code.py

    - name: Check for changes
      run: |
        if [[ $(git status --porcelain) ]]; then
          echo "Generated code is out of sync. Please run the script and commit the changes."
          exit 1
        fi

test-evm:
runs-on: ubuntu-latest

10 changes: 10 additions & 0 deletions apps/README.md
@@ -17,3 +17,13 @@ The Stork Network receives signed data feeds from publishers and aggregates them
The easiest way to become a Stork Publisher is to run the Stork Publisher Agent docker container on your infrastructure and send price updates to the Agent through a local websocket. The Stork Publisher Agent will sign your price updates with your private key and send them to the Stork Network.

See [Stork Publisher Agent Docs](docs/publisher_agent.md).

## Data Provider

To publish data into the Stork Network, a Publisher first needs to fetch that data from some data source.

The Stork Data Provider is an app that lets users configure a list of data feeds, from various sources, which they would like to output. These feeds are emitted in a format the Publisher Agent can receive directly, so a user can run the Data Provider alongside the Publisher Agent to source data, sign it, and send it to the Stork Network without writing any code.

It is also an open-source framework to which users can easily contribute new data integrations.

See [Stork Data Provider Docs](docs/data_provider.md).
52 changes: 52 additions & 0 deletions apps/docs/data_provider.md
@@ -0,0 +1,52 @@
# Data Provider
The Stork Data Provider is a framework to pull arbitrary numeric data across many sources. It can be used on its own, or run alongside the Stork Publisher Agent to sign the data and send it to the Stork Network.

## Adding a New Data Source
If you want to report data from a data source which does not already have an [integration](../lib/data_provider/sources), you can add your own.

To add a new source:
1. Add a [package](../lib/data_provider/sources/random) in the [sources directory](../lib/data_provider/sources) with your data source's name
1. Run `python3 ./apps/scripts/update_shared_data_provider_code.py` to generate the glue code that makes the framework aware of your new source.
Contributor: The numbers here seem to have gotten messed up somehow.

Contributor Author: They're intentionally all 1 so that markdown renders them as an ordered list (it might be easier to review the markdown preview for this file). This way we can reorder, add, or remove steps without needing to update every number.

Contributor: Oh interesting. Is that a common practice? Not sure I've seen that before.

1. Add a [data_source.go](../lib/data_provider/sources/random/data_source.go) and implement a DataSource object conforming to the [DataSource interface](../lib/data_provider/types/model.go). This object will contain most of your source-specific logic, but it can leverage tools like the [scheduler](../lib/data_provider/sources/scheduler.go) or [ethereum_utils](../lib/data_provider/sources/ethereum_utils.go) as needed.
1. Add a [data_source_test.go](../lib/data_provider/sources/random/data_source_test.go) to unit test your data source.
1. Add a [config.go](../lib/data_provider/sources/random/config.go) which defines a configuration object corresponding to a single data feed in your source
1. This config object must include a `DataSource` field.
1. Add a [JSON Schema](https://json-schema.org/) [config](../lib/data_provider/configs/resources/source_config_schemas/random.json) in the configs package defining the structure of the configuration object in [config.go](../lib/data_provider/sources/random/config.go)
1. Add a [config test](../lib/data_provider/configs/source_config_tests/random_test.go) to the configs package which tests that a valid Data Provider config json using your source:
1. Passes schema validations
1. Can be deserialized into your configuration object correctly
1. Can be used to extract your DataSourceId using `GetSourceSpecificConfig`
1. Add an [init.go](../lib/data_provider/sources/random/init.go) to your package. This file can be almost identical for every source. This file is responsible for:
1. Defining the DataSourceId variable for this source (which must be the same as the package name)
1. Defining and registering a DataSourceFactory (which will just call to your DataSource constructor)
1. Asserting the source's DataSource and DataSourceFactory satisfy our interfaces
1. Defining a function to deserialize the source's config object
1. Submit a Pull Request so other developers can use your new data source!
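The init.go described in the steps above can be sketched as a self-contained Go program. The interface shapes below are simplified assumptions for illustration; the real `DataSource` and `DataSourceFactory` interfaces live in [types/model.go](../lib/data_provider/types/model.go) and differ in detail:

```go
package main

import "fmt"

// Simplified stand-ins for the framework types in
// apps/lib/data_provider/types/model.go -- illustrative only.
type DataSourceId string

type DataSource interface {
	RunDataSource(updates chan<- float64)
}

type DataSourceFactory interface {
	Build() DataSource
}

// Global registry, analogous to the one behind sources.GetDataSourceFactory.
var factories = map[DataSourceId]DataSourceFactory{}

func RegisterDataSourceFactory(id DataSourceId, f DataSourceFactory) {
	factories[id] = f
}

// --- roughly what a new source's init.go contributes ---

// Must match the package name, per the docs above.
const RandomDataSourceId DataSourceId = "random"

type randomDataSource struct{}

func (s *randomDataSource) RunDataSource(updates chan<- float64) {
	updates <- 0.5 // a real source would poll or subscribe here
}

// The factory just calls through to the DataSource constructor.
type randomFactory struct{}

func (f *randomFactory) Build() DataSource { return &randomDataSource{} }

// Compile-time assertions that the interfaces are satisfied.
var _ DataSource = (*randomDataSource)(nil)
var _ DataSourceFactory = (*randomFactory)(nil)

func init() {
	RegisterDataSourceFactory(RandomDataSourceId, &randomFactory{})
}

func main() {
	fmt.Printf("registered sources: %d\n", len(factories)) // prints: registered sources: 1
}
```

The compile-time assertions (`var _ DataSource = ...`) catch interface drift at build time rather than at runtime, which is why the docs call for them in every source package.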

## Configuration
The Data Provider can report many feeds, each sourced from any of the data sources implemented in [sources](../lib/data_provider/sources).

You can configure the Data Provider by passing it a [config json file](../../sample.data-provider.config.json) which can be deserialized into a [DataProviderConfig](../lib/data_provider/types/model.go) object.

The `sources` key is a list of configurations for different feeds; each feed has a unique `id` and a `config` which can be deserialized into the appropriate [source config](../lib/data_provider/sources/random/config.go).
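As a sketch of how such a config deserializes, the structs below mirror the shape described here (`sources`, `id`, `config`, `dataSource`); the real `DataProviderConfig` is defined in [types/model.go](../lib/data_provider/types/model.go), and the field names inside `config` are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Illustrative mirror of the config shape described above.
type sourceEntry struct {
	DataSource string          `json:"dataSource"` // selects the source package
	Id         string          `json:"id"`         // unique per feed
	Config     json.RawMessage `json:"config"`     // source-specific, validated against its JSON Schema
}

type dataProviderConfig struct {
	Sources []sourceEntry `json:"sources"`
}

func parseConfig(raw []byte) (*dataProviderConfig, error) {
	var cfg dataProviderConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("failed to unmarshal config: %v", err)
	}
	return &cfg, nil
}

func main() {
	// The keys inside "config" are made up for illustration.
	raw := []byte(`{
		"sources": [
			{
				"dataSource": "random",
				"id": "MY_RANDOM_VALUE",
				"config": {"updateFrequencySeconds": 1, "minValue": 0, "maxValue": 1}
			}
		]
	}`)

	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.Sources[0].DataSource) // prints: random
}
```

Keeping the source-specific portion as `json.RawMessage` is what lets the framework defer deserialization to each source's own config object.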

## Running Local Code
You can test the Data Provider locally by running:
```
go run apps/cmd/data_provider/main.go start -c ./sample.data-provider.config.json --verbose
```
You will most likely want to replace `./sample.data-provider.config.json` with a more useful config JSON. Also make sure any required environment variables (such as API keys) are set in your local environment.

Running in `--verbose` mode with no output address set will just log every price update. If you want to actually send updates somewhere (like the websocket server of your local Publisher Agent), you can pass an output address flag:
```
go run apps/cmd/data_provider/main.go start -c ./sample.data-provider.config.json -o ws://localhost:5216/
```

## Running Published Docker Image
If all the data sources you want to use are already merged into Stork's repo, you can just pull the latest published Data Provider docker image and supply your own config:
```
docker run --platform linux/arm64 --pull always --restart always --name data-provider -v ./sample.data-provider.config.json:/etc/config.json -d --log-opt max-size=1g storknetwork/data-provider:v1.0.4 start -c /etc/config.json -o ws://localhost:5216/
```



10 changes: 5 additions & 5 deletions apps/lib/data_provider/command.go
@@ -18,18 +18,18 @@ var DataProviderCmd = &cobra.Command{

// required
const ConfigFilePathFlag = "config-file-path"
const WebsocketUrl = "ws-url"
const OutputAddressFlag = "output-address"

func init() {
DataProviderCmd.Flags().StringP(ConfigFilePathFlag, "c", "", "the path of your config json file")
DataProviderCmd.Flags().StringP(WebsocketUrl, "w", "", "the websocket url to write updates to")
DataProviderCmd.Flags().StringP(OutputAddressFlag, "o", "", "a string representing an output address (e.g. ws://localhost:5216/)")

DataProviderCmd.MarkFlagRequired(ConfigFilePathFlag)
}

func runDataProvider(cmd *cobra.Command, args []string) error {
configFilePath, _ := cmd.Flags().GetString(ConfigFilePathFlag)
wsUrl, _ := cmd.Flags().GetString(WebsocketUrl)
outputAddress, _ := cmd.Flags().GetString(OutputAddressFlag)

mainLogger := utils.MainLogger()

@@ -39,12 +39,12 @@ func runDataProvider(cmd *cobra.Command, args []string) error {

mainLogger.Info().Msg("Starting data provider")

config, err := loadConfig(configFilePath)
config, err := LoadConfig(configFilePath)
if err != nil {
return fmt.Errorf("error loading config: %v", err)
}

runner := NewDataProviderRunner(*config, wsUrl)
runner := NewDataProviderRunner(*config, outputAddress)
runner.Run()

return nil
85 changes: 3 additions & 82 deletions apps/lib/data_provider/config.go
@@ -1,97 +1,18 @@
package data_provider

import (
"embed"
"encoding/json"
"fmt"
"os"

"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/sources"
"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/configs"
"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/types"
"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/utils"
"github.com/xeipuuv/gojsonschema"
)

//go:embed resources
var resourcesFS embed.FS

func loadConfig(configPath string) (*types.DataProviderConfig, error) {
func LoadConfig(configPath string) (*types.DataProviderConfig, error) {
configBytes, err := os.ReadFile(configPath)
if err != nil {
return nil, fmt.Errorf("failed to read config file: %v", err)
}

err = validateConfig(configBytes)
if err != nil {
return nil, fmt.Errorf("config file is invalid: %v", err)
}

var config types.DataProviderConfig
if err := json.Unmarshal(configBytes, &config); err != nil {
return nil, fmt.Errorf("failed to unmarshal config file: %v", err)
}
return &config, nil
}

func validateSourceConfigs(sourceConfigsObj interface{}) error {
sourceConfigs, ok := sourceConfigsObj.([]interface{})
if !ok {
return fmt.Errorf("invalid source configs type: %T", sourceConfigsObj)
}

for _, sourceConfig := range sourceConfigs {
sourceConfigMap, ok := sourceConfig.(map[string]interface{})
if !ok {
return fmt.Errorf("invalid source config type: %v", sourceConfig)
}
dataSourceId := sourceConfigMap["dataSource"].(string)
factory, err := sources.GetDataSourceFactory(types.DataSourceId(dataSourceId))
if err != nil {
return err
}
schema, err := factory.GetSchema()
if err != nil {
return err
}

sourceSpecificConfig := sourceConfigMap["config"]
configLoader := gojsonschema.NewGoLoader(sourceSpecificConfig)
result, err := schema.Validate(configLoader)
if err != nil {
return fmt.Errorf("error validating config: %v", err)
}
if !result.Valid() {
return fmt.Errorf("config is invalid: %v", result.Errors())
}
}

return nil
}

func validateConfig(configBytes []byte) error {
var dataProviderConfig map[string]interface{}
if err := json.Unmarshal(configBytes, &dataProviderConfig); err != nil {
return fmt.Errorf("failed to parse config JSON: %v", err)
}

// validate top level of config
schema, err := utils.LoadSchema("resources/config_schema.json", resourcesFS)
if err != nil {
return fmt.Errorf("error loading schema: %v", err)
}

configLoader := gojsonschema.NewGoLoader(dataProviderConfig)
result, err := schema.Validate(configLoader)
if err != nil {
return fmt.Errorf("error validating config: %v", err)
}
if !result.Valid() {
return fmt.Errorf("config is invalid: %v", result.Errors())
}

err = validateSourceConfigs(dataProviderConfig["sources"])
if err != nil {
return fmt.Errorf("error validating source configs: %v", err)
}
return nil
return configs.LoadConfigFromBytes(configBytes)
}
90 changes: 90 additions & 0 deletions apps/lib/data_provider/configs/config.go
@@ -0,0 +1,90 @@
package configs

import (
	"embed"
	"encoding/json"
	"fmt"
	"path/filepath"

	"github.com/Stork-Oracle/stork-external/apps/lib/data_provider/types"
	"github.com/xeipuuv/gojsonschema"
)

//go:embed resources
var resourcesFS embed.FS

const configSchemaPath = "resources/data_provider_config.schema.json"

// exposed for testing
func LoadConfigFromBytes(configBytes []byte) (*types.DataProviderConfig, error) {
	schema, err := loadSchema(resourcesFS)
	if err != nil {
		return nil, fmt.Errorf("error loading schema: %v", err)
	}

	err = validateConfig(configBytes, schema)
	if err != nil {
		return nil, fmt.Errorf("config file is invalid: %v", err)
	}

	var config types.DataProviderConfig
	if err := json.Unmarshal(configBytes, &config); err != nil {
		return nil, fmt.Errorf("failed to unmarshal config file: %v", err)
	}
	return &config, nil
}

func loadSchema(resourcesFS embed.FS) (*gojsonschema.Schema, error) {
	schemaContent, err := resourcesFS.ReadFile(configSchemaPath)
	if err != nil {
		return nil, fmt.Errorf("failed to read schema file for %s: %v", configSchemaPath, err)
	}

	loader := gojsonschema.NewSchemaLoader()

	// add all source schema configs to schema loader
	sourceSchemaDir := "resources/source_config_schemas"
	sourceSchemaFiles, err := resourcesFS.ReadDir(sourceSchemaDir)
	if err != nil {
		return nil, err
	}
	for _, sourceSchemaFile := range sourceSchemaFiles {
		sourceSchemaPath := filepath.Join(sourceSchemaDir, sourceSchemaFile.Name())
		schemaBytes, err := resourcesFS.ReadFile(sourceSchemaPath)
		if err != nil {
			return nil, err
		}
		schemaFileLoader := gojsonschema.NewBytesLoader(schemaBytes)
		err = loader.AddSchema(sourceSchemaPath, schemaFileLoader)
		if err != nil {
			return nil, err
		}
	}

	topLevelSchemaLoader := gojsonschema.NewStringLoader(string(schemaContent))

	schema, err := loader.Compile(topLevelSchemaLoader)
	if err != nil {
		return nil, fmt.Errorf("failed to parse schema for %s: %v", configSchemaPath, err)
	}

	return schema, nil
}

func validateConfig(configBytes []byte, schema *gojsonschema.Schema) error {
	var dataProviderConfig map[string]interface{}
	if err := json.Unmarshal(configBytes, &dataProviderConfig); err != nil {
		return fmt.Errorf("failed to parse config JSON: %v", err)
	}

	configLoader := gojsonschema.NewGoLoader(dataProviderConfig)
	result, err := schema.Validate(configLoader)
	if err != nil {
		return fmt.Errorf("error validating config: %v", err)
	}
	if !result.Valid() {
		return fmt.Errorf("config is invalid: %v", result.Errors())
	}

	return nil
}