[tests/e2e]: Remove all references of busybox in e2e tests #6000

Merged: 3 commits merged into antrea-io:main on Feb 22, 2024


shikharish
Contributor

Closes #5991

@shikharish shikharish changed the title [tests/e2e]: Remove all references of busybox [tests/e2e]: Remove all references of busybox in e2e tests Feb 18, 2024
@XinShuYang
Contributor

TestConnectivity/testOVSRestartSameNode failed in kind e2e test.

Additionally, I'm not sure if it's worth replacing all instances of busybox with toolbox in the e2e tests. AFAIK the toolbox image is much larger (169MB compared to 1.24MB) and takes longer to initialize. cc @antoninbas

@tnqn
Member

tnqn commented Feb 20, 2024

> Additionally, I'm not sure if it's worth replacing all instances of busybox with toolbox in the e2e tests. AFAIK the toolbox image is much larger (169MB compared to 1.24MB) and takes longer to initialize. cc @antoninbas

@XinShuYang The point of the change is to avoid pulling 2 different images for traffic tests. Even though busybox is smaller, it causes an extra pull if we keep using it. The linked issue #5991 explains it.

@tnqn
Member

tnqn commented Feb 20, 2024

> TestConnectivity/testOVSRestartSameNode failed in kind e2e test.

This should be fixed. It may be because the arping in busybox behaves differently from the one in toolbox. parseArpingStdout may need some updates.

@XinShuYang
Contributor

> > Additionally, I'm not sure if it's worth replacing all instances of busybox with toolbox in the e2e tests. AFAIK the toolbox image is much larger (169MB compared to 1.24MB) and takes longer to initialize. cc @antoninbas

> @XinShuYang The point of the change is to avoid pulling 2 different images for traffic tests. Even though busybox is smaller, it causes an extra pull if we keep using it. The linked issue #5991 explains it.

@tnqn Thanks for the explanation. I am OK with this change on the Linux testbed. However, since the toolbox image is currently not supported on Windows, the e2e tests also need to be refactored in this PR.

@tnqn
Member

tnqn commented Feb 20, 2024

> @tnqn Thanks for the explanation. I am OK with this change on the Linux testbed. However, since the toolbox image is currently not supported on Windows, the e2e tests also need to be refactored in this PR.

It has a Windows version: antrea-io/image-utils#27

@XinShuYang
Contributor

> > @tnqn Thanks for the explanation. I am OK with this change on the Linux testbed. However, since the toolbox image is currently not supported on Windows, the e2e tests also need to be refactored in this PR.

> It has a Windows version: antrea-io/image-utils#27

That's great! I didn't notice the recent code changes.

@XinShuYang
Contributor

/test-windows-containerd-e2e

Contributor

@antoninbas antoninbas left a comment


One small comment, otherwise LGTM.

test/e2e/wireguard_test.go (outdated, resolved)
@antoninbas
Contributor

@shikharish from the test logs:

connectivity_test.go:384: Arping test failed: Unexpected arping output

As @tnqn pointed out, parseArpingStdout will probably need some minor update.
If you create a Kind cluster, you can easily run the failing test locally before updating the PR, using something like go test -v -timeout=20m antrea.io/antrea/test/e2e -run=TestConnectivity -provider=kind.
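
As an illustration of the kind of update discussed here, the following is a minimal sketch of a summary parser that tolerates small wording differences between arping builds. It is not Antrea's parseArpingStdout: the helper name parseArpingSummary, the two regular expressions, and the sample summaries in main are all assumptions made for the example, and the exact text printed by the busybox and toolbox arping binaries should be checked against real output.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Both patterns match only the stable prefix of each summary line, so
// variations such as "probe(s)" vs "probes", or an optional trailing
// "(N request(s), N broadcast(s))", are accepted.
var (
	sentRe     = regexp.MustCompile(`Sent\s+(\d+)\s+probe`)
	receivedRe = regexp.MustCompile(`Received\s+(\d+)\s+response`)
)

// parseArpingSummary is a hypothetical helper (not the Antrea function).
// It extracts the number of probes sent and responses received.
func parseArpingSummary(out string) (sent, received uint32, err error) {
	sentMatch := sentRe.FindStringSubmatch(out)
	recvMatch := receivedRe.FindStringSubmatch(out)
	if sentMatch == nil || recvMatch == nil {
		return 0, 0, fmt.Errorf("unexpected arping output: %q", out)
	}
	s, err := strconv.ParseUint(sentMatch[1], 10, 32)
	if err != nil {
		return 0, 0, err
	}
	r, err := strconv.ParseUint(recvMatch[1], 10, 32)
	if err != nil {
		return 0, 0, err
	}
	return uint32(s), uint32(r), nil
}

func main() {
	// Illustrative summaries only; the real wording of each arping
	// implementation is an assumption here.
	samples := []string{
		"Sent 3 probe(s) (3 broadcast(s))\nReceived 3 response(s) (0 request(s), 0 broadcast(s))\n",
		"Sent 3 probes (1 broadcast(s))\nReceived 2 response(s)\n",
	}
	for _, out := range samples {
		sent, received, err := parseArpingSummary(out)
		fmt.Println(sent, received, err)
	}
}
```

Matching the two summary lines independently, rather than with one pattern spanning both lines, keeps the parser from depending on whatever follows the probe count on the first line.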

@shikharish
Contributor Author

shikharish commented Feb 21, 2024

@antoninbas How can I run the e2e tests on an already running Kind cluster? Simply running the above command gives me an error.

❯ go test -v -timeout=20m antrea.io/antrea/test/e2e -run=TestConnectivity -provider=kind
2024/02/21 12:31:33 Test logs (if any) will be exported under the '/tmp/antrea-test-1409433260' directory
2024/02/21 12:31:33 Creating K8s ClientSet
2024/02/21 12:31:33 Collecting information about K8s cluster
2024/02/21 12:31:33 Error when collecting information about K8s cluster: Get "https://127.0.0.1:39581/version": dial tcp 127.0.0.1:39581: connect: connection refused
FAIL	antrea.io/antrea/test/e2e	0.085s
FAIL

I have a Kind cluster running with a custom config.

@tnqn
Member

tnqn commented Feb 21, 2024

Does .kube/config point to the kind cluster? Currently the test code uses that config file. It may be possible to discover the config via an env var, like kubectl does, but that is not implemented yet.
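
To make the env var idea concrete, here is a minimal sketch of how a kubeconfig path could be resolved. It is not part of the Antrea test code: the helper name resolveKubeconfig is hypothetical, and the sketch simplifies kubectl's behavior by taking only the first entry of KUBECONFIG, whereas kubectl merges every file in the list.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveKubeconfig is a hypothetical helper. It returns the first
// non-empty path listed in envValue (the content of KUBECONFIG), and
// falls back to <homeDir>/.kube/config when no path is set.
// Simplification: kubectl merges all listed files; this sketch does not.
func resolveKubeconfig(envValue, homeDir string) string {
	for _, p := range filepath.SplitList(envValue) {
		if p != "" {
			return p
		}
	}
	return filepath.Join(homeDir, ".kube", "config")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	fmt.Println(resolveKubeconfig(os.Getenv("KUBECONFIG"), home))
}
```

Passing the environment value and home directory as arguments keeps the helper free of global state, which makes it easy to exercise with fixed inputs.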

@shikharish
Contributor Author

Got it to work. Thank you!
Fixed the arping output regex (and tested it locally).

@tnqn
Member

tnqn commented Feb 21, 2024

"E2e tests on a Kind cluster on Linux for Flow Visibility" failed:

=== RUN   TestFlowAggregator/IPv6/ToExternalEgressOnSourceNode
    flowaggregator_test.go:780: Egress test-egresswrx5f is realized with Egress IP fc00:f853:ccd:e793::2
    flowaggregator_test.go:1156: 
        	Error Trace:	/home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:1156
        	            				/home/runner/work/antrea/antrea/test/e2e/flowaggregator_test.go:788
        	Error:      	Received unexpected error:
        	            	command terminated with exit code 1
        	Test:       	TestFlowAggregator/IPv6/ToExternalEgressOnSourceNode
        	Messages:   	Error when running wget command, stdout: , stderr: ftp://[fc00/f853:ccd:e793::4]:80: Invalid IPv6 numeric address.

It should be related to the wget in toolbox. Perhaps it needs an explicit http:// scheme when the destination is an IPv6 address.

@shikharish
Copy link
Contributor Author

@tnqn Didn't know that it skipped those tests by default.
I am trying to test them locally but getting a timeout error:

=== RUN   TestFlowAggregatorSecureConnection
2024/02/22 00:45:41 Applying Antrea YAML
2024/02/22 00:45:42 Waiting for all Antrea DaemonSet Pods
2024/02/22 00:45:43 Checking CoreDNS deployment
    fixtures.go:260: Creating 'testflowaggregatorsecureconnection-l9r6tbz3' K8s Namespace
    fixtures.go:281: Error when waiting to get ipfix collector Pod IP: timed out waiting for the condition, Pod.Status: &PodStatus{Phase:Failed,Conditions:[]PodCondition{PodCondition{Type:PodReadyToStartContainers,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:45 +0530 IST,Reason:,Message:,},PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:,Message:,},PodCondition{Type:Ready,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:ContainersReady,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:PodScheduled,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:,Message:,},},Message:,Reason:,HostIP:172.18.0.3,PodIP:172.18.0.3,StartTime:2024-02-22 00:45:43 +0530 IST,ContainerStatuses:[]ContainerStatus{ContainerStatus{Name:ipfix-collector,State:ContainerState{Waiting:nil,Running:nil,Terminated:&ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:,StartedAt:2024-02-22 00:45:43 +0530 IST,FinishedAt:2024-02-22 00:45:43 +0530 IST,ContainerID:containerd://b3e428028ee4b83fc0fbbe9c73976edb33b6d3bbe0a6be8179bde97e8b6fecf1,},},LastTerminationState:ContainerState{Waiting:nil,Running:nil,Terminated:nil,},Ready:false,RestartCount:0,Image:projects.registry.vmware.com/antrea/ipfix-collector:v0.8.2,ImageID:docker.io/library/import-2024-02-21@sha256:354557b4fea5fec82aaf40afacadb507a9842d63abbca0ada18c32d237a54233,ContainerID:containerd://b3e428028ee4b83fc0fbbe9c73976edb33b6d3bbe0a6be8179bde97e8b6fecf1,Started:*false,},},QOSClass:BestEffort,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{PodIP{IP:172.18.0.3,},},EphemeralContainerStatuses:[]ContainerStatus{},}
    flowaggregator_test.go:219: Error when setting up test: timed out waiting for the condition, Pod.Status: &PodStatus{Phase:Failed,Conditions:[]PodCondition{PodCondition{Type:PodReadyToStartContainers,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:45 +0530 IST,Reason:,Message:,},PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:,Message:,},PodCondition{Type:Ready,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:ContainersReady,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:PodScheduled,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:45:43 +0530 IST,Reason:,Message:,},},Message:,Reason:,HostIP:172.18.0.3,PodIP:172.18.0.3,StartTime:2024-02-22 00:45:43 +0530 IST,ContainerStatuses:[]ContainerStatus{ContainerStatus{Name:ipfix-collector,State:ContainerState{Waiting:nil,Running:nil,Terminated:&ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:,StartedAt:2024-02-22 00:45:43 +0530 IST,FinishedAt:2024-02-22 00:45:43 +0530 IST,ContainerID:containerd://b3e428028ee4b83fc0fbbe9c73976edb33b6d3bbe0a6be8179bde97e8b6fecf1,},},LastTerminationState:ContainerState{Waiting:nil,Running:nil,Terminated:nil,},Ready:false,RestartCount:0,Image:projects.registry.vmware.com/antrea/ipfix-collector:v0.8.2,ImageID:docker.io/library/import-2024-02-21@sha256:354557b4fea5fec82aaf40afacadb507a9842d63abbca0ada18c32d237a54233,ContainerID:containerd://b3e428028ee4b83fc0fbbe9c73976edb33b6d3bbe0a6be8179bde97e8b6fecf1,Started:*false,},},QOSClass:BestEffort,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{PodIP{IP:172.18.0.3,},},EphemeralContainerStatuses:[]ContainerStatus{},}
--- FAIL: TestFlowAggregatorSecureConnection (92.16s)
=== RUN   TestFlowAggregator
2024/02/22 00:47:13 Applying Antrea YAML
2024/02/22 00:47:14 Waiting for all Antrea DaemonSet Pods
2024/02/22 00:47:15 Checking CoreDNS deployment
    fixtures.go:260: Creating 'testflowaggregator-20qjai4u' K8s Namespace
    fixtures.go:281: Error when waiting to get ipfix collector Pod IP: timed out waiting for the condition, Pod.Status: &PodStatus{Phase:Failed,Conditions:[]PodCondition{PodCondition{Type:PodReadyToStartContainers,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:17 +0530 IST,Reason:,Message:,},PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:,Message:,},PodCondition{Type:Ready,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:ContainersReady,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:PodScheduled,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:,Message:,},},Message:,Reason:,HostIP:172.18.0.3,PodIP:172.18.0.3,StartTime:2024-02-22 00:47:15 +0530 IST,ContainerStatuses:[]ContainerStatus{ContainerStatus{Name:ipfix-collector,State:ContainerState{Waiting:nil,Running:nil,Terminated:&ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:,StartedAt:2024-02-22 00:47:16 +0530 IST,FinishedAt:2024-02-22 00:47:16 +0530 IST,ContainerID:containerd://a4958932af07e62b29805e2d0f7d5e06df304dde9705fd08532628db3bda446f,},},LastTerminationState:ContainerState{Waiting:nil,Running:nil,Terminated:nil,},Ready:false,RestartCount:0,Image:projects.registry.vmware.com/antrea/ipfix-collector:v0.8.2,ImageID:docker.io/library/import-2024-02-21@sha256:354557b4fea5fec82aaf40afacadb507a9842d63abbca0ada18c32d237a54233,ContainerID:containerd://a4958932af07e62b29805e2d0f7d5e06df304dde9705fd08532628db3bda446f,Started:*false,},},QOSClass:BestEffort,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{PodIP{IP:172.18.0.3,},},EphemeralContainerStatuses:[]ContainerStatus{},}
    flowaggregator_test.go:250: Error when setting up test: timed out waiting for the condition, Pod.Status: &PodStatus{Phase:Failed,Conditions:[]PodCondition{PodCondition{Type:PodReadyToStartContainers,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:17 +0530 IST,Reason:,Message:,},PodCondition{Type:Initialized,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:,Message:,},PodCondition{Type:Ready,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:ContainersReady,Status:False,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:PodFailed,Message:,},PodCondition{Type:PodScheduled,Status:True,LastProbeTime:0001-01-01 00:00:00 +0000 UTC,LastTransitionTime:2024-02-22 00:47:15 +0530 IST,Reason:,Message:,},},Message:,Reason:,HostIP:172.18.0.3,PodIP:172.18.0.3,StartTime:2024-02-22 00:47:15 +0530 IST,ContainerStatuses:[]ContainerStatus{ContainerStatus{Name:ipfix-collector,State:ContainerState{Waiting:nil,Running:nil,Terminated:&ContainerStateTerminated{ExitCode:255,Signal:0,Reason:Error,Message:,StartedAt:2024-02-22 00:47:16 +0530 IST,FinishedAt:2024-02-22 00:47:16 +0530 IST,ContainerID:containerd://a4958932af07e62b29805e2d0f7d5e06df304dde9705fd08532628db3bda446f,},},LastTerminationState:ContainerState{Waiting:nil,Running:nil,Terminated:nil,},Ready:false,RestartCount:0,Image:projects.registry.vmware.com/antrea/ipfix-collector:v0.8.2,ImageID:docker.io/library/import-2024-02-21@sha256:354557b4fea5fec82aaf40afacadb507a9842d63abbca0ada18c32d237a54233,ContainerID:containerd://a4958932af07e62b29805e2d0f7d5e06df304dde9705fd08532628db3bda446f,Started:*false,},},QOSClass:BestEffort,InitContainerStatuses:[]ContainerStatus{},NominatedNodeName:,PodIPs:[]PodIP{PodIP{IP:172.18.0.3,},},EphemeralContainerStatuses:[]ContainerStatus{},}
--- FAIL: TestFlowAggregator (92.19s)
FAIL

The command I ran was ./ci/kind/test-e2e-kind.sh --encap-mode encap --flow-visibility --ip-family dual

@antoninbas
Contributor

@shikharish This probably won't be very helpful, but I couldn't reproduce the failure locally with the main branch when running ./ci/kind/test-e2e-kind.sh --encap-mode encap --flow-visibility --ip-family dual.
You could sleep in ci/kind/test-e2e-kind.sh before running the test (after setting up the cluster), and then run the test manually to check why the ipfix-collector Pod is not starting correctly.
But if you can't run that test locally easily, it may be easier to just update the code at

    if !isIPv6 {
        cmd = fmt.Sprintf("wget -O- %s:%d", dstIP, dstPort)
    } else {
        cmd = fmt.Sprintf("wget -O- [%s]:%d", dstIP, dstPort)
    }
and then let CI run the test.
You may just need to explicitly add http:// in front of the IP address, so that's a pretty straightforward change.
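
A minimal sketch of that change, assuming the command is built from an IP string and a port: the helper name wgetCommand is hypothetical and this is not the code that was merged. It uses net.JoinHostPort from the Go standard library, which adds the square brackets for IPv6 literals, so a single code path covers both address families.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// wgetCommand is a hypothetical helper that builds a wget command line
// with an explicit http:// scheme. net.JoinHostPort wraps the host in
// square brackets when it contains a colon, as IPv6 literals do.
func wgetCommand(dstIP string, dstPort int) string {
	hostPort := net.JoinHostPort(dstIP, strconv.Itoa(dstPort))
	return fmt.Sprintf("wget -O- http://%s", hostPort)
}

func main() {
	fmt.Println(wgetCommand("10.0.0.1", 80))
	// prints: wget -O- http://10.0.0.1:80
	fmt.Println(wgetCommand("fc00:f853:ccd:e793::4", 80))
	// prints: wget -O- http://[fc00:f853:ccd:e793::4]:80
}
```

With the scheme spelled out, wget no longer has to guess the protocol from the text before the first colon of an IPv6 address.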

Member

@tnqn tnqn left a comment


LGTM
thanks @shikharish

@tnqn
Member

tnqn commented Feb 22, 2024

/test-e2e
/test-windows-e2e
/test-vm-e2e
/test-ipv6-e2e
/test-flexible-ipam-e2e
/test-multicast-e2e
/test-multicluster-e2e
/skip-conformance
/skip-networkpolicy

@tnqn
Member

tnqn commented Feb 22, 2024

/test-windows-containerd-e2e

@antoninbas
Contributor

I will merge this PR. The Windows job failure is unrelated to this PR: it is because sleep is not available in the Windows toolbox image (there is no need to sleep anyway, pause is the default command). I'm working on addressing the issue separately.

Thanks for your contribution @shikharish

@antoninbas antoninbas merged commit cb52631 into antrea-io:main Feb 22, 2024
54 of 59 checks passed
@shikharish
Contributor Author

@antoninbas Thank you!

I would love to work on that issue too if it is possible. I am still exploring Antrea and want to contribute more to improve my understanding.

@antoninbas
Contributor

@shikharish We appreciate your enthusiasm, but the Windows issue is a bit tricky and I have already started working on it. I'd like to restore the Windows CI ASAP.
If you want to look at a new issue, you may be interested in #5635, although it may be a bit time-consuming.

@shikharish shikharish deleted the remove-busybox branch March 10, 2024 13:12
XinShuYang pushed a commit to XinShuYang/antrea that referenced this pull request Sep 29, 2024
XinShuYang pushed a commit to XinShuYang/antrea that referenced this pull request Sep 29, 2024
tnqn pushed a commit that referenced this pull request Sep 30, 2024
Development

Successfully merging this pull request may close these issues.

Remove all references to busybox in e2e tests
4 participants