chore: extract deal observer loop #42

NikolasHaimerl · 2025-01-28T15:21:59Z

This PR introduces the following changes:

- Extract the logic of the deal observer loop into its own file.
- Add a test for the deal observer loop.
- Extract the mocked RPC endpoint into its own file for easier reuse.

bajtos

Great idea to move the loop functions from bin/ to lib/!

backend/test/deal-observer.test.js

backend/lib/loops.js

juliangruber · 2025-01-29T08:43:24Z

backend/bin/deal-observer-backend.js

+  finalityEpochs,
+  LOOP_INTERVAL,
+  INFLUXDB_TOKEN,
+  signal


This method has a lot of arguments, and working with positional arguments is hard in this case. Imagine using an editor that doesn't infer types, or changing the argument list later on.

What do you think about changing this to an options object?

+1 to use an options object (named parameters).

@bajtos isn't this in contrast to what you mentioned in an earlier PR?

In the earlier PR @bajtos also suggested to use an options object / param object / named parameters

I see, I must have misunderstood then.

> @bajtos isn't this in contrast to what you mentioned in an earlier PR?

It's great that you asked. As Julian pointed out, I meant the same thing in that other comment.

The JavaScript language does not provide syntax for named parameters. We implement named parameters by passing an object in which each property corresponds to one parameter.

    // definition
    function doSomeStuff({ pgClient, signal }) {
      // ...
    }

    // example usage
    const pgClient = /*..*/
    doSomeStuff({ pgClient, signal: AbortSignal.timeout(10_000) })

The TypeScript typings are slightly more complex: you need to describe the single positional parameter accepting an object, and then describe the properties of that object.

    /**
     * @param {object} params
     * @param {pg.Client} params.pgClient
     * @param {AbortSignal} params.signal
     */
    function doSomeStuff({ pgClient, signal }) {
      // ...
    }

I personally tend to name the object argument args, see e.g. here:

https://github.com/CheckerNetwork/piece-indexer/blob/765dbfe7705ce375f7b5a317f4912264e7a70ffa/indexer/lib/ipni-watcher.js#L11-L16

    /**
     * @param {object} args
     * @param {number} args.minSyncIntervalInMs
     * @param {AbortSignal} [args.signal]
     */
    export async function * runIpniSync ({ minSyncIntervalInMs, signal }) {
      // ...
    }

The syntax [args.signal] tells the type-checker that signal is an optional parameter.
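To illustrate the optional-parameter syntax, here is a minimal sketch. The generator `countTicks` and its `maxTicks` parameter are hypothetical and not part of piece-indexer; the only point is that a caller may omit `signal`, so the function body has to tolerate `undefined`.

```javascript
/**
 * Hypothetical example, not taken from the codebase.
 * @param {object} args
 * @param {number} args.maxTicks
 * @param {AbortSignal} [args.signal] optional; may be omitted by the caller
 */
function * countTicks ({ maxTicks, signal }) {
  for (let i = 0; i < maxTicks; i++) {
    // Optional chaining keeps this safe when no signal was passed
    if (signal?.aborted) return
    yield i
  }
}

// Without a signal the generator runs to completion
const all = [...countTicks({ maxTicks: 3 })]

// With an already-aborted signal it stops before yielding anything
const none = [...countTicks({ maxTicks: 3, signal: AbortSignal.abort() })]
```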

I am not sure I quite understand. How does ({ minSyncIntervalInMs, signal }) differ from (minSyncIntervalInMs, signal) in terms of readability? Or do you mean to have a single argument (args) and then call args.pgPool?

Depending on the calling situation, you might not have named variables to pass. So the second one quickly becomes (1000, signal), whereas the first one is always clear: ({ minSyncIntervalInMs: 1000, signal })

Argument order doesn't matter with an object: ({ a, b }) works just like ({ b, a }).

There are probably more factors here, but these are the most important ones in my mind.
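Both points can be seen in a small sketch. The two functions below are hypothetical and the numbers are made up; only the parameter names are borrowed from this thread.

```javascript
// Hypothetical positional variant, for illustration only
function describeLoopPositional (maxPastEpochs, finalityEpochs, loopInterval) {
  return `maxPastEpochs=${maxPastEpochs} finalityEpochs=${finalityEpochs} loopInterval=${loopInterval}`
}

// Same function with a single options object (named parameters)
function describeLoop ({ maxPastEpochs, finalityEpochs, loopInterval }) {
  return `maxPastEpochs=${maxPastEpochs} finalityEpochs=${finalityEpochs} loopInterval=${loopInterval}`
}

// Positional call: nothing at the call site says which number is which,
// and the first two arguments are swapped here without any error.
const swapped = describeLoopPositional(900, 2000, 10000)

// Named call: each value is labelled, and property order is irrelevant.
const named = describeLoop({ maxPastEpochs: 2000, finalityEpochs: 900, loopInterval: 10000 })
const reordered = describeLoop({ loopInterval: 10000, finalityEpochs: 900, maxPastEpochs: 2000 })
```

Here `named` and `reordered` are identical strings, while `swapped` silently differs from both.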

juliangruber · 2025-01-29T08:45:07Z

backend/test/utils.js

+import { chainHeadTestData } from './test_data/chainHead.js'
+import { rawActorEventTestData } from './test_data/rawActorEvent.js'
+
+export const makeRpcRequest = async (method, params) => {


I'm not sure this will scale well. @bajtos what do you think about this helper? One test might need a different test data set being returned than another. My suspicion is that it will scale better to define this makeRpcRequest function where the tests are also defined (and potentially doing it multiple times).

I share your concerns and agree it's best to define the stub makeRpcRequest function in every test.

We can implement a builder helper to make it easier to implement such stubs.

    // usage in tests
    const makeRpcRequest = buildMakeRpcRequest({
      'Filecoin.ChainHead': chainHeadTestData
    })

    // implementation
    export const buildMakeRpcRequest = (methodsToResults) => {
      return async (method, _params) => {
        if (methodsToResults[method]) return methodsToResults[method]
        throw new Error(`Unsupported RPC API method: "${method}"`)
      }
    }

Isn't this adding quite a bit of complexity for a simple problem?
We do not need this complexity right now, and it is not foreseeable whether we will need it in the future. I would much rather add complexity when we need it than before.

Wouldn't it be less complex to inline this function (without the helper) than to create a new utility (i.e. a new abstraction)?
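For comparison, the inline variant argued for here could look roughly like the sketch below. The fixture objects and the two stub names are made up for illustration; only the method name 'Filecoin.ChainHead' and the error message appear earlier in this thread.

```javascript
// Made-up fixtures standing in for the real test data files
const chainHeadA = { Height: 100 }
const chainHeadB = { Height: 200 }

// Test one defines the stub it needs, right next to its assertions
const makeRpcRequestForTestOne = async (method, _params) => {
  if (method === 'Filecoin.ChainHead') return chainHeadA
  throw new Error(`Unsupported RPC API method: "${method}"`)
}

// Test two returns a different data set without touching a shared helper
const makeRpcRequestForTestTwo = async (method, _params) => {
  if (method === 'Filecoin.ChainHead') return chainHeadB
  throw new Error(`Unsupported RPC API method: "${method}"`)
}
```

The cost is a few repeated lines per test; the benefit is that each test shows exactly which RPC responses it depends on.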

bajtos · 2025-01-29T09:30:27Z

backend/bin/deal-observer-backend.js

+  finalityEpochs,
+  LOOP_INTERVAL,
+  INFLUXDB_TOKEN,
+  signal


+1 to use an options object (named parameters).

backend/test/deal-observer.test.js

bajtos · 2025-01-29T09:36:17Z

backend/test/deal-observer.test.js

+  after(async () => {
+    await pgPool.query('DELETE FROM active_deals')
+  })


Would you mind not cleaning the DB after tests in this pull request and waiting until we resolve the discussion in #41?

backend/test/deal-observer.test.js

bajtos · 2025-01-29T09:52:25Z

backend/test/deal-observer.test.js

+    do {
+      rows = (await pgPool.query('SELECT * FROM active_deals')).rows
+    } while (rows.length !== 360)


Nitpick - feel free to ignore.

Fetching data for all rows is not a very efficient way to count the number of rows in a table. It does not matter in this test, since you are fetching up to 360 rows, but it could become a problem if used elsewhere.

A more efficient solution is to use the SQL aggregate function COUNT().

    while (true) {
      const { rows } = await pgPool.query('SELECT COUNT(*) FROM active_deals')
      // node-postgres returns COUNT(*) (a bigint) as a string, so convert before comparing
      if (Number(rows[0].count) === 360) break
    }
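Below is a sketch of the same polling idea with two additions that are assumptions, not something proposed in this review: a short pause between queries and a deadline, so a failing test cannot spin forever. `waitForRowCount` and the fake pool are hypothetical; the fake mimics node-postgres, which returns COUNT(*) as a string by default.

```javascript
// Hypothetical helper: poll until the table holds `expected` rows or time runs out
async function waitForRowCount ({ pgPool, expected, timeoutMs = 1000, pauseMs = 10 }) {
  const deadline = Date.now() + timeoutMs
  while (Date.now() < deadline) {
    const { rows } = await pgPool.query('SELECT COUNT(*) FROM active_deals')
    // COUNT(*) is a bigint, which node-postgres hands back as a string
    if (Number(rows[0].count) === expected) return
    await new Promise(resolve => setTimeout(resolve, pauseMs))
  }
  throw new Error(`Timed out waiting for ${expected} rows`)
}

// Fake pool for demonstration: the "table" gains 120 rows per query
function createFakePool () {
  let count = 0
  return {
    async query (_sql) {
      count += 120
      return { rows: [{ count: String(count) }] }
    }
  }
}
```

In a real test the fake would be replaced by the actual pgPool, for example `await waitForRowCount({ pgPool, expected: 360 })`.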

bajtos · 2025-01-29T09:56:53Z

backend/test/utils.js

+import { chainHeadTestData } from './test_data/chainHead.js'
+import { rawActorEventTestData } from './test_data/rawActorEvent.js'
+
+export const makeRpcRequest = async (method, params) => {


I share your concerns and agree it's best to define the stub makeRpcRequest function in every test.

We can implement a builder helper to make it easier to implement such stubs.

    // usage in tests
    const makeRpcRequest = buildMakeRpcRequest({
      'Filecoin.ChainHead': chainHeadTestData
    })

    // implementation
    export const buildMakeRpcRequest = (methodsToResults) => {
      return async (method, _params) => {
        if (methodsToResults[method]) return methodsToResults[method]
        throw new Error(`Unsupported RPC API method: "${method}"`)
      }
    }

backend/test/utils.js

pyropy · 2025-01-29T13:16:50Z

backend/lib/deal-observer-loop.js

+ * @param {string | undefined} influxToken
+ * @returns {Promise<void>}
+ * */
+export const dealObserverLoop = async (makeRpcRequest, pgPool, recordTelemetry, Sentry, maxPastEpochs, finalityEpochs, loopInterval, influxToken, signal) => {


Do you think that extracting the loop logic to a separate function could be a good idea? I had something like this in mind:

    // lib/loop.js
    import timers from 'node:timers/promises'
    // `slug` is assumed to come from a slugify helper such as the `slug` npm package

    const loop = async ({ signal, name, interval, fn, recordTelemetry, captureException }) => {
      while (!signal?.aborted) {
        const start = Date.now()
        try {
          await fn()
        } catch (e) {
          console.error(e)
          captureException(e)
        }
        const dt = Date.now() - start
        console.log(`Loop "${name}" took ${dt}ms`)
        recordTelemetry(`loop_${slug(name, '_')}`, point => {
          // record the configured interval, not the loop name
          point.intField('interval_ms', interval)
          point.intField('duration_ms', dt)
        })
        if (dt < interval) {
          await timers.setTimeout(interval - dt)
        }
      }
    }

    // lib/deal-observer-loop.js
    const dealObserverLoop = (pgPool, rpcClient, opts) => async () => {
      // logic goes here
    }

    // lib/deal-submitter-loop.js
    const dealSubmitLoop = (pgPool, opts) => async () => {
      // logic goes here
    }

    // bin/deal-observer-backend.js
    await Promise.all([
      loop({
        name: 'dealObserver',
        interval: 1000,
        fn: dealObserverLoop(pgPool, rpcClient, opts),
        recordTelemetry: console.log,
        captureException: console.error,
      }),
      loop({
        name: 'dealSubmit',
        interval: 1000,
        fn: dealSubmitLoop(pgPool, opts),
        recordTelemetry: console.log,
        captureException: console.error,
      })
    ])

I feel that something like this could help us to keep code more DRY, while also keeping the loop function itself clean as we don't have to pass so many arguments around (sentry, influx, and other dependencies).

This is also a valid approach, given that all loops follow the same overall structure outside of await fn().

@bajtos @juliangruber I'd appreciate your input on this as I'd like to apply changes to #33 as well

As discussed, #33 can be landed without the loop refactor; the refactor can be done as a follow-up.
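To make the proposed shape concrete, here is a small self-contained sketch of a generic loop that is stopped through an AbortController. It is a simplified variant of the snippet above, not the code that landed: telemetry is left out, `runLoop` is a hypothetical name, and the global setTimeout is used so the block needs no imports.

```javascript
// Hypothetical simplified loop: run `fn` every `interval` ms until aborted
async function runLoop ({ signal, interval, fn, captureException = console.error }) {
  while (!signal?.aborted) {
    const start = Date.now()
    try {
      await fn()
    } catch (err) {
      // A failing iteration is reported but does not stop the loop
      captureException(err)
    }
    const dt = Date.now() - start
    if (dt < interval) {
      await new Promise(resolve => setTimeout(resolve, interval - dt))
    }
  }
}

// Usage: the second iteration fails, the third one aborts the loop
const controller = new AbortController()
let iterations = 0
const errors = []
const done = runLoop({
  signal: controller.signal,
  interval: 5,
  captureException: (err) => errors.push(err),
  fn: async () => {
    iterations++
    if (iterations === 2) throw new Error('boom')
    if (iterations === 3) controller.abort()
  }
})
```

The loop body only knows about `fn`, so the observer and submitter logic can stay free of scheduling and error-reporting concerns.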

Nikolas Haimerl added 3 commits January 28, 2025 15:55

extracted loop

c606615

add test for observer loop

c7c07c9

add new files

d3306ef

NikolasHaimerl requested a review from bajtos as a code owner January 28, 2025 15:22

fmt

c2b9982

NikolasHaimerl requested a review from juliangruber January 28, 2025 15:24

Merge branch 'main' into nhaimerl-extract-deal-observer-loop

5e61abf

bajtos requested changes Jan 28, 2025

bajtos mentioned this pull request Jan 28, 2025

Deal submissions #33

Merged

Nikolas Haimerl added 4 commits January 29, 2025 07:28

add abort signal

19a5fcc

renamed loop name

3ef6e23

renamed loop name

955a44b

merged with main

fc63850

NikolasHaimerl requested a review from bajtos January 29, 2025 06:32

juliangruber requested changes Jan 29, 2025

bajtos requested changes Jan 29, 2025

pyropy reviewed Jan 29, 2025

Nikolas Haimerl added 5 commits January 29, 2025 14:23

promise wait all

7fe1dee

promise wait all

776b8e7

remove utils

0eacf82

throw error on unexpected method

2b6d959

formatting

97e1739
