-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AUTH-541: OIDC structured auth config #713
base: master
Are you sure you want to change the base?
Conversation
@liouk: This pull request references AUTH-541 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
35b2d3d
to
7e8ad90
Compare
7e8ad90
to
31e7cc5
Compare
f066dae
to
c4f822c
Compare
8ca7a87
to
36db406
Compare
@liouk: This pull request references AUTH-541 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
81af1fc
to
46663cd
Compare
@liouk: This pull request references AUTH-541 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
bca8702
to
5457a34
Compare
New changes are detected. LGTM label has been removed. |
5457a34
to
a11dfa2
Compare
} | ||
|
||
if !featureGates.Enabled(features.FeatureGateExternalOIDC) { | ||
return nil, nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to set up a child context with cancellation so that the feature gate accessor that is launched above can terminate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want to keep the accessor running so that it can os.Exit if the feature gates change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I saw that nothing else was using it and did not make the connection that the comment at the top was saying that this all depends on the exit-on-feature-change behavior.
if err != nil { | ||
return fmt.Errorf("could not marshal auth config into JSON: %v", err) | ||
} | ||
authConfigJSON := strings.TrimSpace(string(encoded)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know the JSON serializer appends a trailing newline, but what problem was that causing?
|
||
var ( | ||
cfgScheme = runtime.NewScheme() | ||
codecs = serializer.NewCodecFactory(cfgScheme, serializer.EnableStrict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The EnableStrict option isn't really doing anything for you since you are only building an encoder from this.
var ( | ||
cfgScheme = runtime.NewScheme() | ||
codecs = serializer.NewCodecFactory(cfgScheme, serializer.EnableStrict) | ||
serializerInfo, _ = runtime.SerializerInfoForMediaType(codecs.SupportedMediaTypes(), runtime.ContentTypeJSON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Best to check the second return value and panic on false
in case this should somehow break. That would be a clearer failure than a panic later inside of runtime.Encode
.
return err | ||
} | ||
|
||
encoded, err := runtime.Encode(codecs.EncoderForVersion(serializerInfo.Serializer, apiserverv1beta1.ConfigSchemeGroupVersion), authConfig) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine, but I'm also unsure what benefit setting up and using a CodecFactory is providing. We're not converting or defaulting anything and we always want it to use JSON. May as well use https://pkg.go.dev/k8s.io/apimachinery/pkg/util/json#Marshal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this might be my fault. I mentioned that I am not sure if using the api machinery is better / more idiomatic than json.Marshal.
@liouk's original solution was based on json.Marshal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main benefit was to serialize the data in a more structured way, e.g. without having to define the type meta manually. But it gets more complicated than what I thought, so I will revert back to using json.Marshal.
} | ||
|
||
cm := corev1ac.ConfigMap(targetAuthConfigCMName, managedNamespace).WithData(map[string]string{authConfigDataKey: authConfigJSON}) | ||
if _, err := c.configMaps.ConfigMaps(managedNamespace).Apply(ctx, cm, metav1.ApplyOptions{FieldManager: c.name}); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is another field manager going to be writing to the same configmap? If not, it is probably reasonable to set Force: true
to allow stomping conflicts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the CAO must be the only one -- great point.
return fmt.Errorf("auth config validation failed: %v", errList) | ||
} | ||
|
||
cm := corev1ac.ConfigMap(targetAuthConfigCMName, managedNamespace).WithData(map[string]string{authConfigDataKey: authConfigJSON}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been trying to reduce no-op Apply requests by extracting the current apply configuration from the local informer's cache with https://pkg.go.dev/k8s.io/client-go/applyconfigurations/core/v1#ExtractConfigMap and doing a apiequality.Semantic.DeepEqual
between them first. I haven't heard of any issues arising from that approach yet, does it make sense to do it here to avoid making a write on every resync?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the intention of this check: https://github.com/openshift/cluster-authentication-operator/pull/713/files#diff-3c99f304cc2949488aa2fa2b8aea2d7e8ddb0c8baa42e66f128eacd7cdbda11aR135-R137
if existingCM != nil && existingCM.Data[authConfigDataKey] == authConfigJSON {
return nil
}
Do you think the DeepEqual()
is preferable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the DeepEqual
would be preferable because it would make updates in the future easier.
As an example, if in the future we needed to add new data to the ConfigMap, we would need to add a new check to prevent the no-op apply and add a new entry in the applyconfiguration we are creating.
If we had a single place to update our desired apply configuration (maybe a function?) we could simply do the semantic equality check between our desired state and the extracted apply configuration to determine if we need to do the apply operation.
jwt.Issuer.CertificateAuthority = caData | ||
} | ||
|
||
switch provider.ClaimMappings.Username.PrefixPolicy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if a new valid option is added in the future? Can there be a period of time during an upgrade from N to N+1 where there is a cluster-authentication-operator at N and a CRD at N+1? I would include a default case to avoid all doubt.
// TODO currently validations from k8s.io/apiserver/pkg/apis/apiserver/validation cannot be used here | ||
// since they aren't defined for the beta type; once the feature goes out of beta, we should replace | ||
// this func with the upstream validations (but keep CA cert validation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're defined for the internal type, would it be easier to convert to internal and use those? This seems to be how it is done for unserved APIs like the apiserver configuration files, I don't see any external-versioned validations there.
Also, what happens if there is a bug in the validation used by cluster-authentication-operator that causes the validation to be overly strict? Even if this changes to use the validations from k8s.io/apiserver, this can be out of sync with whatever a particular kube-apiserver was compiled against. This is a form of client-side validation, which we have been trying to move away from. Revisioned kube-apiserver rollouts should mitigate the risk of writing an invalid config here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I see the point in keeping validations at the server-side, especially with revisioned rollouts being in place.
However, the KAS pods do not really validate the CA cert, if specified. If the CA cert is not the correct one, the KAS pods will log an error but will not crash, so the rollout will be completed correctly. Therefore I'm considering keeping the CA cert validation at the CAO side, but dropping the rest. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a fixup 20a1f72 that demonstrates what I described above. Placing a hold until this gets squashed or dropped.
/hold
a11dfa2
to
20a1f72
Compare
20a1f72
to
08155bf
Compare
From test result perspective, based on good pre-merge test results in https://issues.redhat.com/browse/OCPBUGS-44592?focusedId=26134688&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-26134688 , adding below label: |
@liouk: This pull request references AUTH-541 which is a valid jira issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/retest |
…or OIDC Changes as per review from everettraven.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ibihim, liouk The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…or OIDC Changes as per review from everettraven.
…or OIDC Changes as per review from everettraven.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Have a handful of new comments
} | ||
|
||
return nil | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: unnecessary newline
case configv1.NoPrefix: | ||
jwt.ClaimMappings.Username.Prefix = ptr.To("-") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the prefix actually be set to "-"
in this case? Looking at https://github.com/kubernetes/apiserver/blob/c09fadd4305dde0b8861d8a7595690800a4a0db0/pkg/apis/apiserver/v1beta1/types.go#L332-L333 it seems like using the flags, "-"
signalled no prefix but for equivalent mapping in the structured authentication config it should be ""
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a great find. You're right -- we must translate NoPrefix
to ""
. But that's not the only issue; we also need to translate NoOpinion
to a special prefix logic.
From here:
--oidc-username-prefix=""
and--oidc-username-claim != "email"
, prefix was"<value of --oidc-issuer-url>#"
. For the same behavior using authentication config, setusername.prefix="<value of issuer.url>#"
Therefore, NoOpinion
means:
- set
username.prefix = ""
, ifusername.claim == "email"
- set
username.prefix = "<issuer.URL>#"
otherwise
This has been accounted for in our authentication config API (and here).
This seems to be an issue in hypershift as well; will look further into that.
Pushed a fix for this in 5fecaf6.
}, | ||
} | ||
|
||
if len(provider.Issuer.Audiences) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this if block necessary?
It looks like the jwt.Issuer.Audiences
is required , and the provider.Issuer.Audiences
is required with a minimum items of 1 so it should be safe to avoid this conditional and just run the logic that is in the conditional block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there's a for loop that appends the strings, it's not needed anyways. Will remove!
if err := validateAuthConfig(*authConfig); err != nil { | ||
return fmt.Errorf("auth config validation failed: %v", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be done prior to generating the ApplyConfiguration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed offline, we decided to keep this sequence due to the fact that the validation performs a network call to the provider to grab the server certificates, which we can avoid if the two configs are equivalent.
existingCM, err := c.configMapLister.ConfigMaps(managedNamespace).Get(targetAuthConfigCMName) | ||
if err != nil && !apierrors.IsNotFound(err) { | ||
return nil, fmt.Errorf("could not retrieve auth configmap %s/%s to check data before sync: %v", managedNamespace, targetAuthConfigCMName, err) | ||
} | ||
|
||
if existingCM != nil { | ||
existingCMApplyConfig, err := corev1ac.ExtractConfigMap(existingCM, c.name) | ||
if err != nil { | ||
return nil, fmt.Errorf("could not extract ConfigMap apply configuration: %v", err) | ||
} | ||
|
||
if equality.Semantic.DeepEqual(existingCMApplyConfig.Data, expectedCMApplyConfig.Data) { | ||
return nil, nil | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Including the fetching of the existing ConfigMap and the equality check in getExpectedApplyConfig
feels like it overloads this method a bit.
What do you think of separating the fetching of the existing ApplyConfiguration into a separate method and putting the equality check in the sync
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in e8dfce7
…or OIDC Changes as per review from everettraven.
…or OIDC Changes as per review from everettraven.
|
||
case configv1.NoOpinion: | ||
prefix := "" | ||
if provider.ClaimMappings.Username.Claim == "email" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood #713 (comment) correctly, this should be set to provider.Issuer.URL
+ #
if the username claim is not email
:
if provider.ClaimMappings.Username.Claim == "email" { | |
if provider.ClaimMappings.Username.Claim != "email" { |
@liouk: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR adds a controller behind the
ExternalOIDC
feature gate that tracks the auth CR, and when auth type is configured to be OIDC, it:openshift-config-managed
, where it will be picked up by the KAS-o and synced into a static file and passed on to the KAS podsKAS-o functionality PR: openshift/cluster-kube-apiserver-operator#1760
Enhancement: openshift/enhancements#1632