Malformed errors returned when key-value store is locked #232
Comments
Could this be related at all to moby/libnetwork#1950?
Looking at the code here, I strongly suspect this is actually another instance of etcd-io/bbolt#122, which fixes moby/libnetwork#1950. This repo uses libnetwork, so it would make sense that the two things are related. libnetwork in turn uses libkv. So far:
Once moby/libnetwork#2268 is merged, someone needs to fix up <insert component, CNI - where is that repo - @PatrickLang any idea?> to move to the updated libnetwork to fix this.
@dineshgovindasamy @madhanrm Are there any other CNI fixes needed here to move to updated libnetwork as @jhowardmsft indicates?
Resolved in #247, merged into acs-engine Azure/acs-engine#3989
Is this a request for help?: No
Is this an ISSUE or FEATURE REQUEST? (choose one): Issue
Which release version?: v1.0.11
Which component (CNI/IPAM/CNM/CNS): CNI
Which Operating System (Linux/Windows): Windows Server version 1803
Which Orchestrator and version (e.g. Kubernetes, Docker): Kubernetes v1.10.6
What happened:
When I try to scale up to several pods on the same node, overlapping calls to the Azure CNI binary contend for the store lock, and some of them fail.
What you expected to happen:
Instead of seeing
E0815 21:11:45.950994 4188 cni.go:259] Error adding network: netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
after CNI failed to get the lock, I would expect a valid error. I looked at the source:
azure-container-networking/cni/network/plugin/main.go
Line 92 in 677d471
and it has return code 1 with nothing on stderr
What I would expect is a return code >100 (vendor specific), along with this json schema on stdout:
How to reproduce it (as minimally and precisely as possible):
Try to scale anything up on Windows. It will eventually succeed, but there will be errors in the process as multiple pods need addEndpoint called simultaneously.