CORE-18981 Add configuration to disable session heartbeats #5384

Merged on Jan 17, 2024 (28 commits)

@charlieR3 charlieR3 commented Jan 5, 2024

This PR introduces a configuration switch to enable or disable heartbeats. While heartbeats are enabled, session health is determined by heartbeats; while they are disabled, it is determined by the time between sending a message and receiving its acknowledgement. Since heartbeats are no longer the only method of determining session health, the heartbeat manager has been renamed to the session health manager, and within that manager class the configuration switches between two implementations of the session health check.

Changes worth noting:

  • Heartbeat manager repurposed as a more general session health manager
  • Heartbeat config added to session manager config so that existing sessions can be cleared when the config changes
  • Heartbeat config added to the session health manager (previously heartbeat manager) config so that the session health check method can be switched on a config change.
  • Extracted the heartbeating logic into a heartbeat session health monitor class and added a similar message ack session health monitor class, which monitors session health based on whether message acks are received in a timely manner.
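
To illustrate the split described above, here is a minimal sketch of a manager that switches between two health-check implementations when the config changes. It is not the code from this PR: all class, method, and parameter names (`HeartbeatMonitor`, `MessageAckMonitor`, `SessionHealthManager`, `heartbeats_enabled`, `timeout`) are hypothetical, and timeouts are plain seconds passed in by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Hypothetical: a session is healthy while any ack arrived within `timeout`."""
    timeout: float
    last_ack: dict = field(default_factory=dict)  # session_id -> time of last ack

    def message_sent(self, session_id, message_id, now):
        # Start the clock for a session the first time something is sent on it.
        self.last_ack.setdefault(session_id, now)

    def ack_received(self, session_id, message_id, now):
        self.last_ack[session_id] = now

    def is_healthy(self, session_id, now):
        last = self.last_ack.get(session_id)
        return last is None or now - last <= self.timeout


@dataclass
class MessageAckMonitor:
    """Hypothetical: a session is healthy while no sent message waits longer
    than `timeout` for its ack."""
    timeout: float
    pending: dict = field(default_factory=dict)  # session_id -> {message_id: sent_at}

    def message_sent(self, session_id, message_id, now):
        self.pending.setdefault(session_id, {})[message_id] = now

    def ack_received(self, session_id, message_id, now):
        self.pending.get(session_id, {}).pop(message_id, None)

    def is_healthy(self, session_id, now):
        sent_times = self.pending.get(session_id, {}).values()
        return all(now - sent_at <= self.timeout for sent_at in sent_times)


class SessionHealthManager:
    """Hypothetical manager that picks the monitor from a config flag."""

    def __init__(self, heartbeats_enabled, timeout):
        self.timeout = timeout
        self.monitor = self._build(heartbeats_enabled)

    def _build(self, heartbeats_enabled):
        if heartbeats_enabled:
            return HeartbeatMonitor(self.timeout)
        return MessageAckMonitor(self.timeout)

    def on_config_change(self, heartbeats_enabled):
        # Assumption: state tracked under the old mode is dropped on a switch.
        self.monitor = self._build(heartbeats_enabled)


manager = SessionHealthManager(heartbeats_enabled=False, timeout=10.0)
manager.monitor.message_sent("s1", "m1", now=0.0)
print(manager.monitor.is_healthy("s1", now=11.0))  # m1 is still unacked after the timeout
manager.on_config_change(heartbeats_enabled=True)
print(type(manager.monitor).__name__)
```

Building a fresh monitor on every config change mirrors the idea that the health check method is switched as a whole rather than mixed per session; whether the real implementation carries any state across the switch is not stated in this PR description.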

Large Network Test Metrics

Run 1

  • Heartbeats disabled for full run
  • 5 members per cluster
  • 2 clusters
  • bilateral settlement app
  • 30 minute flow execution
  • 5 flows per second
  • 60 warm up flows

Cluster 1
Screenshot 2024-01-16 at 11 09 54

Cluster 2
Screenshot 2024-01-16 at 13 46 52

Run 2

  • Heartbeats enabled for full run
  • 5 members per cluster
  • 2 clusters
  • bilateral settlement app
  • 30 minute flow execution
  • 5 flows per second
  • 60 warm up flows

Cluster 1
Screenshot 2024-01-16 at 11 48 45

Cluster 2
Screenshot 2024-01-16 at 13 44 51

Run 3

In this run, members were onboarded with heartbeats enabled, then heartbeats were turned off and 60 flows were run sequentially. Heartbeats were then enabled, followed by 60 flows; disabled, followed by 60 flows; and finally enabled once more, followed by 60 flows. No flows were running while the config was being switched.

Cluster 1
Screenshot 2024-01-16 at 13 30 56

Cluster 2
Screenshot 2024-01-16 at 13 31 25

corda-jenkins-ci02 bot commented Jan 8, 2024

Jenkins build for PR 5384 build 35

Build Successful:
Jar artifact version produced by this PR: 5.2.0.0-alpha-1705486877873
Helm chart version produced by this PR: 5.2.0-alpha.1705486877873
Helm chart pushed to: oci://corda-os-docker-dev.software.r3.com/helm-charts/pr-5384/corda

@charlieR3 charlieR3 marked this pull request as ready for review January 16, 2024 15:17
@charlieR3 charlieR3 requested a review from a team as a code owner January 16, 2024 15:17
yift-r3
yift-r3 previously approved these changes Jan 16, 2024
@yift-r3 yift-r3 left a comment

LGTM

dimosr
dimosr previously approved these changes Jan 17, 2024
@charlieR3 charlieR3 dismissed stale reviews from dimosr and yift-r3 via 701ccf0 January 17, 2024 09:46

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

@charlieR3 charlieR3 merged commit 425223e into release/os/5.2 Jan 17, 2024
5 checks passed
@charlieR3 charlieR3 deleted the charlie/CORE-18981 branch January 17, 2024 10:50