Fix that Antrea L7NetworkPolicies do not handle Service traffic correctly #6941
base: main
Conversation
Force-pushed a9d187b to ef99eab
Force-pushed ef99eab to 458b1f0
Force-pushed 458b1f0 to 9ebbfe0
l7c.once.Do(func() {
	l7c.l7Reconciler.StartSuricata()
})
I don't understand how this is supposed to work. The `L7FlowExporter` should not be responsible for ensuring `StartSuricata` is only called once. `StartSuricata` may be called by other components that depend on the `L7Reconciler`. As it is, it seems that this change may cause `StartSuricata` to be called more than once. Why don't we keep the existing logic?
You just reminded me. These two features, `L7FlowExporter` and `L7NetworkPolicy`, can be enabled together. I have restored the change.
@@ -260,9 +264,15 @@ func convertProtocolTLS(tls *v1beta.TLSProtocol) string {
 	return strings.Join(keywords, " ")
 }
 
-func (r *Reconciler) StartSuricataOnce() {
+func (r *Reconciler) ensureL7FlowsAndStartSuricataOnce() {
It doesn't seem like a good idea to mix both in the same function, which is also related to my comment above. Could we have `StartSuricataOnce` and `ensureL7Flows` as 2 separate functions, each possibly with their own `sync.Once` variable if necessary?
You are right. I added another `sync.Once` to ensure the L7 flows are installed only once.
pkg/agent/openflow/client.go
	c.replayMutex.RLock()
	defer c.replayMutex.RUnlock()

	cacheKey := "l7_np_provision_flows"
I don't think "provision" adds any meaning here, maybe just "l7_np_flows"?
Agree. There is no relevant `provision` in the function.
@@ -284,49 +284,6 @@ func TestBuildPipeline(t *testing.T) {
 			},
 		},
 	},
-	{
-		name: "K8s Node, IPv4 only, with L7NetworkPolicy enabled",
I don't really recall how these tests work, but can you explain why this is being removed? (I assume this is because we are removing `f.enableL7NetworkPolicy` and that flows are only installed after the first L7NP is applied locally?)
Your assumption is totally correct. The sub-test was used to verify the flows initialized after the Antrea Agent started. Since the flows related to L7NP are only installed after the first L7NP is applied locally, there is no reason to keep the test.
@@ -2152,9 +2158,6 @@ func (f *featureNetworkPolicy) initFlows() []*openflow15.FlowMod {
 	var flows []binding.Flow
 	if f.nodeType == config.K8sNode {
 		flows = append(flows, f.ingressClassifierFlows()...)
-		if f.enableL7NetworkPolicy {
Is `f.enableL7NetworkPolicy` used anywhere anymore?
This option is used to initialize the `TrafficControlTable`, where a flow for L7 Network Policies (L7NPs) will be installed when the first L7NP is installed locally.
See antrea/pkg/agent/openflow/framework.go, line 228 in f8d2ead:

	if f.enableL7NetworkPolicy {
pkg/agent/openflow/network_policy.go
@@ -2213,37 +2216,99 @@ func (f *featureNetworkPolicy) skipPolicyRuleCheckFlows() []binding.Flow {
 func (f *featureNetworkPolicy) l7NPTrafficControlFlows() []binding.Flow {
 	cookieID := f.cookieAllocator.Request(f.category).Raw()
 	vlanMask := uint16(openflow15.OFPVID_PRESENT)
-	return []binding.Flow{
+	flows := []binding.Flow{
 		// This generates the flow to output the packets returned from an application-aware engine or a TrafficControl
The comment mentions "application-aware engine or a TrafficControl". This seems correct to me. It feels like the issue that this PR is trying to solve is not specific to L7NetworkPolicy but also applies to the TrafficControl feature. Even for the TrafficControl case, I want the TC destination to observe traffic consistently in both directions (i.e., using the Endpoint IP, not the Service IP). However, despite the comment, this function is specific to the L7NP feature (`l7NPTrafficControlFlows`). Am I missing something?
I leveraged the reg mark `FromTCReturnRegMark` to identify the traffic returned from Suricata. This is not a good practice right now since the flows are dedicated to L7NP. How about adding a new reg mark `FromL7NPReturnRegMark` to identify the returned traffic of L7NP?
The point I was trying to confirm is whether the same issue exists for the TrafficControl feature. It feels to me that the answer is yes, and that we can have the same kind of traffic asymmetry. So unless I am mistaken, should we make sure that this patch addresses both cases (L7NP / TC)? Hopefully, we can use the "same" mechanism / flows to address the TC case, based on your current design. cc @tnqn
expectedFlows := []string{
	"cookie=0x1020000000000, table=Classifier, priority=200,in_port=11,vlan_tci=0x1000/0x1000 actions=pop_vlan,set_field:0x6/0xf->reg0,goto_table:UnSNAT",
	"cookie=0x1020000000000, table=ConntrackZone, priority=212,ip,reg0=0x0/0x800000 actions=set_field:0x800000/0x800000->reg0,ct(table=ConntrackZone,zone=65520)",
I know you said this would affect all packets, causing a performance degradation, which is why we only install the flow if at least one L7NP is applied locally on the Node. It feels like we could go further, and only apply this flow if the source / destination IP matches a Pod to which a L7NP is actually applied? I am not saying that this is necessarily a good idea, as it would make the implementation more complex, but I wanted to check my understanding.
That's a great idea! If so, we can reduce the performance degradation to only the packets from/to the Pods enforced by the L7NP, rather than all packets. This would make the implementation a little more complex; at least, I think we need a conjunction here. Could we enhance the implementation in another PR? @antoninbas cc @tnqn
Let's see what Quan thinks. I think we can consider it as a possible future enhancement.
I discussed this with Hongliang offline. The proposal makes sense to me, and there seems to be a way to do it that is not too complex. Addressing it in a separate PR sounds good to me.
After an offline discussion with @tnqn, I made a draft design:
- Add a new table `PacketClassifier` (while this table is only used by L7NetworkPolicy, a general name is chosen to accommodate potential use by other features in the future). By default, there is only one default flow:

  table=PacketClassifier, priority=0 actions=goto_table:Classifier

- Add flows to `PacketClassifier` when an L7NetworkPolicy is applied to Pods. The flows match the packets to/from the Pods enforced by L7NetworkPolicies. For example, if the IPs are 10.10.0.2 and 10.10.0.3, the flows added to `PacketClassifier` would look like below. The action `set_field:0xa00000/0xa00000->reg0` is what identifies the packets.

  table=PacketClassifier, priority=200, ip, nw_dst=10.10.0.3 actions=set_field:0xa00000/0xa00000->reg0,goto_table:Classifier
  table=PacketClassifier, priority=200, ip, nw_src=10.10.0.3 actions=set_field:0xa00000/0xa00000->reg0,goto_table:Classifier
  table=PacketClassifier, priority=200, ip, nw_dst=10.10.0.2 actions=set_field:0xa00000/0xa00000->reg0,goto_table:Classifier
  table=PacketClassifier, priority=200, ip, nw_src=10.10.0.2 actions=set_field:0xa00000/0xa00000->reg0,goto_table:Classifier
  table=PacketClassifier, priority=0 actions=goto_table:Classifier

- The flows added for L7NetworkPolicy in ConntrackZone should only match the packets with `reg0=0xa00000/0xa00000` loaded by the flows above. The flows would look like this:

  "table=ConntrackZone, priority=212,ip,reg0=0x0/0x800000,reg0=0xa00000/0xa00000 actions=set_field:0x800000/0x800000->reg0,ct(table=ConntrackZone,zone=65520)",
  "table=ConntrackZone, priority=211,ct_state=+rpl+trk,ip,reg0=0x7/0xf,reg0=0xa00000/0xa00000 actions=ct(table=L3Forwarding,zone=65520,nat)",
  "table=ConntrackZone, priority=211,ct_state=-rpl+trk,ip,reg0=0x7/0xf,reg0=0xa00000/0xa00000 actions=goto_table:L3Forwarding",
  "table=ConntrackZone, priority=210,ct_state=+rpl+trk,ct_mark=0x80/0x80,ip,reg0=0xa00000/0xa00000 actions=goto_table:Output",
  "table=ConntrackZone, priority=210,ct_state=-rpl+trk,ct_mark=0x80/0x80,ip,reg0=0xa00000/0xa00000 actions=ct(table=ConntrackState,zone=65520,nat)",
The reason why I put the new table before `Classifier` is that the target actions of the flows in `Classifier` are not all the same. See the flows below: the packets to/from the Pods enforced by L7NetworkPolicies can come from a container port, the tunnel, or antrea-gw0, and their target tables differ.
cookie=0x13010000000000, table=Classifier, priority=210,ip,in_port="antrea-gw0",nw_src=10.10.0.1 actions=load:0x2->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[28],resubmit(,SpoofGuard)
cookie=0x13010000000000, table=Classifier, priority=200,in_port="antrea-gw0" actions=load:0x2->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[27],resubmit(,SpoofGuard)
cookie=0x13010000000000, table=Classifier, priority=200,in_port="antrea-tun0" actions=load:0x1->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG0[9],resubmit(,UnSNAT)
cookie=0x13010000000000, table=Classifier, priority=190,in_port="coredns--4b0fc8" actions=load:0x3->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[28],resubmit(,SpoofGuard)
cookie=0x13010000000000, table=Classifier, priority=190,in_port="coredns--41da64" actions=load:0x3->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[28],resubmit(,SpoofGuard)
cookie=0x13010000000000, table=Classifier, priority=190,in_port="nginx-de-25537f" actions=load:0x3->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[28],resubmit(,SpoofGuard)
cookie=0x13010000000000, table=Classifier, priority=190,in_port="agnhost--dccaf7" actions=load:0x3->NXM_NX_REG0[0..3],load:0x1->NXM_NX_REG4[28],resubmit(,SpoofGuard)
cookie=0x13020000000000, table=Classifier, priority=200,in_port="antrea-l7-tap1",vlan_tci=0x1000/0x1000 actions=strip_vlan,load:0x6->NXM_NX_REG0[0..3],resubmit(,UnSNAT)
cookie=0x13000000000000, table=Classifier, priority=0 actions=drop
Force-pushed cb28778 to 2e30d9c
Signed-off-by: Hongliang Liu <[email protected]>
Force-pushed 2e30d9c to d65d44d
Fixes #6854.

Will open another PR later to update the pipeline documentation with the flow changes.