Context
Further investigate, within a timebox of at most 2 days, the Spacewalk Kubernetes issue in which the connection to the Stellar overlay network gets stuck.
A local Kubernetes deployment of both the runner and the standalone vault binary works without issues.
Requirement
Compare configuration JSON between EC2 and EKS deployments
Compare vault images/versions between EC2 and EKS deployments
Update tokio library to the latest version
Findings
Configuration JSON files and Docker images used are the same between EC2 and EKS deployments, but the issue is still present.
Even after upgrading tokio library to the latest version (1.35), the issue was still present.
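A comparison like the one behind the first finding can be scripted so that it is easy to repeat. The following is a minimal sketch, not the procedure actually used in this investigation: the helper names (`flatten`, `diff_configs`, `diff_files`) and any file paths passed to them are hypothetical, and it assumes both configurations are plain JSON documents.

```python
import json
from typing import Any

# Placeholder shown for a key that exists in only one of the two configs.
MISSING = "<missing>"


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into {"a.b[0]": leaf_value} pairs."""
    if isinstance(value, (dict, list)) and not value:
        # Keep empty containers visible instead of dropping them.
        return {prefix: value}
    if isinstance(value, dict):
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat: dict[str, Any] = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat


def diff_configs(left: Any, right: Any) -> dict[str, tuple[Any, Any]]:
    """Return {path: (left_value, right_value)} for every differing path."""
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(a.keys() | b.keys())
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


def diff_files(left_path: str, right_path: str) -> dict[str, tuple[Any, Any]]:
    """Load two JSON files (paths are supplied by the caller) and diff them."""
    with open(left_path) as lf, open(right_path) as rf:
        return diff_configs(json.load(lf), json.load(rf))
```

An empty result from `diff_files` means the two files are equivalent regardless of key order or whitespace, which is a stricter check than comparing the files by eye.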
Extended testing
In addition, I tested a similar Kubernetes setup from scratch in both EKS and GKE (on free-tier accounts unrelated to our org).
Testing setup
Kubernetes spec files used in the tests are defined here. Instead of building the runner from scratch, I used the same runner binary present in our production deployments on EC2 and EKS.
Tested/Checked the following:
Increasing/removing resource limits
Allowing all network traffic
Creating multiple Docker images for the runner with different dependencies and different distros as base image (ubuntu:focal, ubuntu:latest, alpine:latest)
Using the same Linux kernel version on the free-tier EKS and GKE clusters as in our production EC2 and EKS cluster
Results
The issue was still present in all of the tests listed above.
After code changes that replaced tokio with async-std, the issue persisted when using the runner, but everything worked fine when running the standalone vault binary in Kubernetes.
The conclusion is that the issue comes from the runner code, as described in this ticket.