care partner alerts #715
base: master
Conversation
To use this in QA, it must be paired with tidepool-org/hydrophone#145 and tidepool-org/go-common#71
Looks good overall, but the retry mechanism implemented here doesn't satisfy the latency requirements. The current implementation is OK for internal usage, but it isn't production-ready. This could be handled in a separate PR if that makes the development and QA process easier.
data/events/events.go (outdated)

}
handler := asyncevents.NewSaramaConsumerGroupHandler(&asyncevents.NTimesRetryingConsumer{
	Consumer: r.Config.MessageConsumer,
	Delay:    CappedExponentialBinaryDelay(AlertsEventRetryDelayMaximum),
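For reference, CappedExponentialBinaryDelay comes from go-common's asyncevents package (tidepool-org/go-common#71). A minimal sketch of what such a delay function might look like (an assumed shape, not the actual go-common implementation):

import (
	"math"
	"time"
)

// CappedExponentialBinaryDelay (assumed shape) returns a delay function that
// doubles the wait after each failed consumption attempt, up to a maximum.
func CappedExponentialBinaryDelay(maximum time.Duration) func(tries int) time.Duration {
	return func(tries int) time.Duration {
		delay := time.Duration(math.Pow(2, float64(tries))) * time.Second
		if delay > maximum {
			return maximum
		}
		return delay
	}
}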
I don't think this is a suitable retry strategy given the latency requirements for this service. Kafka's consumer group concurrency is limited to the number of partitions in the topic, and that number can't be very high because Kafka's memory consumption grows linearly with the partition count. It follows that there will be far fewer partitions than users, so data from multiple users will end up in the same partition. Because messages within a single partition are processed serially, a failure to evaluate one user's alerts for a minute (the maximum currently set by CappedExponentialBinaryDelay) introduces at least a minute of delay for every user sharing that partition.
Alert notifications should be near real-time; up to 10 seconds of latency is acceptable. I think the solution proposed in this design document is how this should be handled. Other solutions that satisfy the requirements are welcome.
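For illustration only: the partition-blocking problem described above can be avoided by republishing a failed message to a retry topic and letting the main consumer move on immediately, which is essentially the multi-topic approach referenced in the design document. A sketch using sarama (the topic naming, header key, and time format are assumptions, not this PR's actual code):

import (
	"time"

	"github.com/Shopify/sarama" // import path assumed
)

// moveToRetryTopic republishes a failed message to a retry topic, stamping a
// "not before" header so the retry consumer knows when it may process it.
func moveToRetryTopic(producer sarama.SyncProducer, msg *sarama.ConsumerMessage, retryTopic string, delay time.Duration) error {
	out := &sarama.ProducerMessage{
		Topic: retryTopic, // e.g. "alerts-retry-30s" (hypothetical naming)
		Key:   sarama.ByteEncoder(msg.Key),
		Value: sarama.ByteEncoder(msg.Value),
		Headers: []sarama.RecordHeader{
			{Key: []byte("x-tidepool-not-before"), Value: []byte(time.Now().Add(delay).Format(time.RFC3339))},
		},
	}
	_, _, err := producer.SendMessage(out)
	return err
}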
This will require some more in-depth thought on my part... Will do.
Yeah, I think you're right, let's get this review merged, and I'll work on getting a multiple topic solution set up. Given the flexibility we have now, it shouldn't be too bad.
I have the multi-tier retry in the eric-alerts-multi-topic-retry branch.
This should be implemented in this branch now.
@toddkazakov I just removed two config env vars. I believe we talked about that before, but it slipped my mind until I was reviewing the helm chart changes today, where they came up again. So the re-review here is just around the config parsing in the most recent commit of the PR; nothing else has changed. [UPDATE] This comment is outdated.
This functionality will be used by care partner processes to retrieve device tokens in order to send mobile device push notifications in response to care partner alerts being triggered. BACK-2554
This was missed when moving device tokens from the data service to the auth service in commit a0f5a84. BACK-2554
Basic steps are taken to allow for other push notification services to be easily added in the future. BACK-2554
So that sarama log messages better follow our standards and are emitted as JSON when log.Logger is configured for that. Before this change, sarama logs were printed as plain text without any of the benefits of the platform log.Logger. BACK-2554
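A sketch of what such an adapter might look like (the platform log.Logger method names here are assumptions):

import (
	"fmt"

	"github.com/Shopify/sarama" // import path assumed
	"github.com/tidepool-org/platform/log"
)

// saramaLogger adapts the platform logger to sarama's StdLogger interface so
// that sarama's own log lines flow through our structured (JSON) logging.
type saramaLogger struct{ logger log.Logger }

func (l saramaLogger) Print(v ...interface{})                 { l.logger.Info(fmt.Sprint(v...)) }
func (l saramaLogger) Printf(format string, v ...interface{}) { l.logger.Infof(format, v...) }
func (l saramaLogger) Println(v ...interface{})               { l.logger.Info(fmt.Sprintln(v...)) }

// Usage (assumed): sarama.Logger = saramaLogger{logger: platformLogger}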
The existing FaultTolerantConsumer isn't used because its retry semantics are hard-wired and aren't compatible with care partner alerting's needs. Note: a proper implementation of AlertsEventsConsumer to consume events is yet to be written; it will follow shortly. BACK-2554
The upload id is necessary to ensure that only the proper device data uploads are evaluated for care partner alert conditions. BACK-2554
If the necessary configuration isn't found, then push notifications will instead be logged. BACK-2554
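A minimal sketch of that fallback: when push configuration is absent, use a pusher that only logs. The Pusher and Note shapes here are assumptions based on the commit message, not this PR's actual types:

import (
	"context"
	"log/slog"
)

// Note stands in for the alerts notification payload.
type Note struct {
	Title string
	Body  string
}

// Pusher is assumed to be the push-notification abstraction this PR introduces.
type Pusher interface {
	Push(ctx context.Context, deviceToken string, note *Note) error
}

// logPusher satisfies Pusher by logging the notification instead of sending it.
type logPusher struct{ logger *slog.Logger }

func (p *logPusher) Push(ctx context.Context, deviceToken string, note *Note) error {
	p.logger.InfoContext(ctx, "push notifications not configured; logging instead",
		"title", note.Title, "body", note.Body)
	return nil
}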
These methods return Note objects that can be sent as push notifications. NotLooping evaluation will be handled in a later commit. BACK-2554
It uses the new asyncevents from go-common, as alerts processing requires different retry semantics than the existing solution. The Pusher interface is moved out of data/service into data/events to avoid a circular dependency. BACK-2554
No longer needed
In response to request during code review.
As caught by Todd in code review. BACK-2554
When a care partner alert encounters an error, the message is moved to a separate topic that will cause it to be retried after a delay. Any number of these topics can be configured. BACK-2499
Instead of a static delay, uses a "not before" time found in a Kafka message header. Consumption of the message will not be attempted until the time has passed. This allows for more accurate delays, as the time required to process an earlier message doesn't further delay the current message's processing. BACK-2449
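A sketch of the consumer-side check described in that commit (the header key and time format are assumptions):

import (
	"context"
	"time"

	"github.com/Shopify/sarama" // import path assumed
)

// waitUntilNotBefore blocks until the message's "not before" header time has
// passed, or the context is canceled. Messages without the header are
// processed immediately.
func waitUntilNotBefore(ctx context.Context, msg *sarama.ConsumerMessage) error {
	for _, h := range msg.Headers {
		if string(h.Key) != "x-tidepool-not-before" { // header key assumed
			continue
		}
		notBefore, err := time.Parse(time.RFC3339, string(h.Value))
		if err != nil {
			return err
		}
		if wait := time.Until(notBefore); wait > 0 {
			select {
			case <-time.After(wait):
			case <-ctx.Done():
				return ctx.Err()
			}
		}
	}
	return nil
}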
These won't be changing at runtime, so there's no need to complicate the initialization by making these configurable. The topic's prefix is configurable, and that's the part that will change from environment to environment at runtime. BACK-2554
A rebase has picked up work performed by Darin, which removes the need for this token injection. \o/ Yay!
These tests, and the functionality they cover, were moved into alerts/client.go in a previous commit.
BACK-2449
- UsersWithoutCommunication endpoint added to data service
- UsersWithoutCommunication endpoint added to alerts client
- implementing no communication alerts via the task service
- evaluation of alerts conditions re-worked
  - The new system recognizes that some alerts are generated by events (so-called "Data Alerts") while others are polled (no communication).
  - The new evaluator lives in the alerts package (was data/events)
- implemented tracking of sent notifications
- Recording repo is implemented to record/index the time of the last received data from a user (see the sketch below)

BACK-2558
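For illustration, the recording piece can be a simple per-user upsert of the last time device data was received; a sketch using the mongo-driver (collection and field names are assumptions):

import (
	"context"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// recordLastCommunication upserts the time data was last received from a user,
// so the no-communication poller can later find users who have gone quiet.
func recordLastCommunication(ctx context.Context, coll *mongo.Collection, userID string, at time.Time) error {
	_, err := coll.UpdateOne(ctx,
		bson.M{"userId": userID},
		bson.M{"$set": bson.M{"lastReceivedDeviceData": at}},
		options.Update().SetUpsert(true),
	)
	return err
}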
I've identified that there are some big changes that will need to happen in order to manage marking things sent and resolved. Those will come in a future commit. BACK-2559
The above turned out to be a lot more complicated than I had imagined they'd be. BACK-2559
Ready for review once again.
As requested in code review. tidepool-org/terraform-modules#72 (review) BACK-2559
BACK-2559 BACK-2499
@@ -66,3 +71,13 @@ func (s *Store) NewAlertsRepository() alerts.Repository {
	r := alertsRepo(*s.Store.GetRepository("alerts"))
	return &r
}

func (s *Store) NewRecorderRepository() alerts.RecordsRepository {
	r := recorderRepo(*s.Store.GetRepository("records"))
It's not obvious what is meant by record. Is there a better name that we can use?
now := time.Now()
if nextDesiredRun.Before(now) {
	r.logger.Info("care partner is bumping nextDesiredRun")
	// nextDesiredRun, when added to time.Now in tsk.RepeatAvailableAfter, must
I suggest changing the queue code so that it doesn't fail the task when its available time is in the past. That check looks like a safeguard against starvation in case tasks are run in an endless loop because the available time is set incorrectly, but now we have a good reason to remove it. That way we can also drop the workaround you came up with, which looks fragile and will terminate the task if it ever goes into a failed state.
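A tiny sketch of the suggested behavior (names are hypothetical; it only illustrates clamping instead of failing):

// clampAvailableTime makes a task whose available time is already in the past
// runnable immediately instead of failing it.
func clampAvailableTime(available, now time.Time) time.Time {
	if available.Before(now) {
		return now
	}
	return available
}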
const CarePartnerType = "org.tidepool.carepartner"

func NewCarePartnerTaskCreate() *TaskCreate {
You should move this to the alerts package.
. "github.com/onsi/gomega" | ||
) | ||
|
||
var _ = Describe("NewCarePartnerTaskCreate", func() { |
You should move this to the alerts package.
	},
}

retryDelays := []time.Duration{0, 1 * time.Second}
Retry delays should be configurable with an env variable in case we want to adjust them, instead of having to modify the code.
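One possible shape for that, as a sketch (the env var name and comma-separated duration format are assumptions):

import (
	"fmt"
	"os"
	"strings"
	"time"
)

// retryDelaysFromEnv parses a comma-separated list of durations such as
// "0s,1s,5s" from the named environment variable, falling back to defaults.
func retryDelaysFromEnv(key string, defaults []time.Duration) ([]time.Duration, error) {
	raw := os.Getenv(key)
	if raw == "" {
		return defaults, nil
	}
	var delays []time.Duration
	for _, part := range strings.Split(raw, ",") {
		d, err := time.ParseDuration(strings.TrimSpace(part))
		if err != nil {
			return nil, fmt.Errorf("parsing retry delay %q: %w", part, err)
		}
		delays = append(delays, d)
	}
	return delays, nil
}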
	}
}

func (d *recorderRepo) UsersWithoutCommunication(ctx context.Context) ([]alerts.LastCommunication, error) {
Doesn't return "users"; consider renaming this.
	structuredmongo "github.com/tidepool-org/platform/store/structured/mongo"
)

// recorderRepo implements RecorderRepository, writing data to a MongoDB collection.
Rename this to something more descriptive
ctx = log.NewContextWithLogger(ctx, lgr)
nc := c.Alerts.NoCommunication.Evaluate(ctx, last)
needsUpsert := c.Activity.NoCommunication.Update(nc.OutOfRange)
// TODO check re-eval? I don't think so
Open a ticket if this needs to be addressed and remove the TODO comment
"time" | ||
|
||
"github.com/tidepool-org/platform/data" | ||
"github.com/tidepool-org/platform/data/blood/glucose" | ||
nontypesglucose "github.com/tidepool-org/platform/data/blood/glucose" |
Rename this (and other occurrences) to dataBloodGlucose for consistency. I don't think nontypesglucose has any benefit over the existing alias used throughout the repository.
}

// RecordsRepository encapsulates queries of the records collection for use with alerts.
type RecordsRepository interface {
I think it's important to capture what type of "communications" this repository stores and that those are used solely for alerting purposes. This applies to the struct, interface and collection. Unfortunately, I can't come up with a good suggestion myself.
This used to be a series of PRs, but that didn't really work out. They're all collapsed into this one.
Shouldn't be merged until tidepool-org/go-common#71 is merged; then this should have its go-common bumped.