AUTH-541: OIDC structured auth config #713

liouk · 2024-10-08T12:50:56Z

This PR adds a controller behind the ExternalOIDC feature gate that tracks the auth CR, and when auth type is configured to be OIDC, it:

creates a structured auth config object based on the auth CR and validates it
serializes it into JSON and stores it in a configmap
syncs that configmap into openshift-config-managed, where the KAS-o picks it up, syncs it into a static file, and passes it on to the KAS pods

KAS-o functionality PR: openshift/cluster-kube-apiserver-operator#1760

Enhancement: openshift/enhancements#1632

openshift-ci-robot · 2024-10-08T12:51:00Z

@liouk: This pull request references AUTH-541 which is a valid jira issue.

In response to this:

This PR adds a controller behind the ExternalOIDC feature gate that tracks the auth CR, and when auth type is configured to be OIDC, it:

creates a structured auth config object based on the auth CR

serializes it into a configmap

syncs that configmap into openshift-config, where it will be picked up by the KAS-o and synced into a static file (not yet implemented)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2024-11-19T17:09:37Z

New changes are detected. LGTM label has been removed.

benluddy · 2024-11-20T15:20:01Z

pkg/operator/starter.go

+	}
+
+	if !featureGates.Enabled(features.FeatureGateExternalOIDC) {
+		return nil, nil, nil


Do you want to set up a child context with cancellation so that the feature gate accessor that is launched above can terminate?

Don't we want to keep the accessor running so that it can os.Exit if the feature gates change?

You're right, I saw that nothing else was using it and did not make the connection that the comment at the top was saying that this all depends on the exit-on-feature-change behavior.

benluddy · 2024-11-20T15:22:40Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	if err != nil {
+		return fmt.Errorf("could not marshal auth config into JSON: %v", err)
+	}
+	authConfigJSON := strings.TrimSpace(string(encoded))


I know the JSON serializer appends a trailing newline, but what problem was that causing?
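For context on the trailing newline, here is a minimal sketch that uses only Go's standard library rather than the apimachinery serializer from the PR. It shows the same distinction: `json.Marshal` returns the bare document, while `json.Encoder.Encode` appends a newline, so a stream-style encoder needs a trim if the stored value is later compared byte for byte. The `authConfig` type and the helper names are hypothetical stand-ins.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// authConfig is a hypothetical stand-in for the structured auth config type.
type authConfig struct {
	Kind string `json:"kind"`
}

// marshalDirect encodes v with json.Marshal, which adds no trailing newline.
func marshalDirect(v any) (string, error) {
	b, err := json.Marshal(v)
	return string(b), err
}

// marshalStream encodes v with json.Encoder, which writes "\n" after the value.
func marshalStream(v any) (string, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	cfg := authConfig{Kind: "AuthenticationConfiguration"}
	direct, _ := marshalDirect(cfg)
	stream, _ := marshalStream(cfg)
	// %q makes the trailing newline of the stream variant visible.
	fmt.Printf("%q\n%q\n", direct, stream)
	fmt.Println(direct == strings.TrimSpace(stream))
}
```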

benluddy · 2024-11-20T15:24:28Z

pkg/controllers/externaloidc/externaloidc_controller.go

+
+var (
+	cfgScheme         = runtime.NewScheme()
+	codecs            = serializer.NewCodecFactory(cfgScheme, serializer.EnableStrict)


Nit: The EnableStrict option isn't really doing anything for you since you are only building an encoder from this.

benluddy · 2024-11-20T15:26:57Z

pkg/controllers/externaloidc/externaloidc_controller.go

+var (
+	cfgScheme         = runtime.NewScheme()
+	codecs            = serializer.NewCodecFactory(cfgScheme, serializer.EnableStrict)
+	serializerInfo, _ = runtime.SerializerInfoForMediaType(codecs.SupportedMediaTypes(), runtime.ContentTypeJSON)


Nit: Best to check the second return value and panic on false in case this should somehow break. That would be a clearer failure than a panic later inside of runtime.Encode.

benluddy · 2024-11-20T15:39:45Z

pkg/controllers/externaloidc/externaloidc_controller.go

+		return err
+	}
+
+	encoded, err := runtime.Encode(codecs.EncoderForVersion(serializerInfo.Serializer, apiserverv1beta1.ConfigSchemeGroupVersion), authConfig)


I think this is fine, but I'm also unsure what benefit setting up and using a CodecFactory is providing. We're not converting or defaulting anything and we always want it to use JSON. May as well use https://pkg.go.dev/k8s.io/apimachinery/pkg/util/json#Marshal?

Yes, this might be my fault. I mentioned that I am not sure if using the api machinery is better / more idiomatic than json.Marshal.

@liouk's original solution was based on json.Marshal.

The main benefit was to serialize the data in a more structured way, e.g. without having to define the type meta manually. But it gets more complicated than what I thought, so I will revert back to using json.Marshal.
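A rough sketch of what the `json.Marshal` route implies: with no scheme involved, the `apiVersion` and `kind` have to be filled in by hand. The types below are trimmed-down, hypothetical stand-ins and not the real `apiserverv1beta1` types; the `apiVersion` and `kind` strings are assumed to follow the upstream structured authentication config format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TypeMeta mirrors the two fields that a Kubernetes type meta serializes.
type TypeMeta struct {
	Kind       string `json:"kind,omitempty"`
	APIVersion string `json:"apiVersion,omitempty"`
}

// Issuer and JWTAuthenticator are simplified stand-ins for the real types.
type Issuer struct {
	URL       string   `json:"url"`
	Audiences []string `json:"audiences"`
}

type JWTAuthenticator struct {
	Issuer Issuer `json:"issuer"`
}

// AuthenticationConfiguration embeds TypeMeta so its fields are flattened
// into the top-level JSON object.
type AuthenticationConfiguration struct {
	TypeMeta
	JWT []JWTAuthenticator `json:"jwt"`
}

// renderAuthConfig sets the type meta manually and marshals the config.
func renderAuthConfig(issuerURL string, audiences []string) (string, error) {
	cfg := AuthenticationConfiguration{
		TypeMeta: TypeMeta{
			Kind:       "AuthenticationConfiguration",
			APIVersion: "apiserver.config.k8s.io/v1beta1",
		},
		JWT: []JWTAuthenticator{{Issuer: Issuer{URL: issuerURL, Audiences: audiences}}},
	}
	b, err := json.Marshal(cfg)
	if err != nil {
		return "", fmt.Errorf("could not marshal auth config into JSON: %w", err)
	}
	return string(b), nil
}

func main() {
	out, err := renderAuthConfig("https://issuer.example.com", []string{"my-client"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```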

benluddy · 2024-11-20T15:42:37Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	}
+
+	cm := corev1ac.ConfigMap(targetAuthConfigCMName, managedNamespace).WithData(map[string]string{authConfigDataKey: authConfigJSON})
+	if _, err := c.configMaps.ConfigMaps(managedNamespace).Apply(ctx, cm, metav1.ApplyOptions{FieldManager: c.name}); err != nil {


Is another field manager going to be writing to the same configmap? If not, it is probably reasonable to set Force: true to allow stomping conflicts.

No, the CAO must be the only one -- great point.

benluddy · 2024-11-20T15:44:29Z

pkg/controllers/externaloidc/externaloidc_controller.go

+		return fmt.Errorf("auth config validation failed: %v", errList)
+	}
+
+	cm := corev1ac.ConfigMap(targetAuthConfigCMName, managedNamespace).WithData(map[string]string{authConfigDataKey: authConfigJSON})


We've been trying to reduce no-op Apply requests by extracting the current apply configuration from the local informer's cache with https://pkg.go.dev/k8s.io/client-go/applyconfigurations/core/v1#ExtractConfigMap and doing an apiequality.Semantic.DeepEqual between them first. I haven't heard of any issues arising from that approach yet. Does it make sense to do it here to avoid making a write on every resync?

That's the intention of this check: https://github.com/openshift/cluster-authentication-operator/pull/713/files#diff-3c99f304cc2949488aa2fa2b8aea2d7e8ddb0c8baa42e66f128eacd7cdbda11aR135-R137

if existingCM != nil && existingCM.Data[authConfigDataKey] == authConfigJSON { return nil }

Do you think the DeepEqual() is preferable?

I think the DeepEqual would be preferable because it would make updates in the future easier.

As an example, if in the future we needed to add new data to the ConfigMap, we would need to add a new check to prevent the no-op apply and add a new entry in the applyconfiguration we are creating.

If we had a single place to update our desired apply configuration (maybe a function?) we could simply do the semantic equality check between our desired state and the extracted apply configuration to determine if we need to do the apply operation.
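The idea in this thread can be sketched as follows, under simplifying assumptions: `configMapApply` is a hypothetical stand-in for the generated ConfigMap apply configuration, `reflect.DeepEqual` is swapped in for `apiequality.Semantic.DeepEqual`, and the ConfigMap name and data key are made up for illustration. The point is that the desired state is built in exactly one function, so adding a field later needs no extra equality check.

```go
package main

import (
	"fmt"
	"reflect"
)

// configMapApply is a hypothetical, reduced stand-in for an apply configuration.
type configMapApply struct {
	Name      string
	Namespace string
	Data      map[string]string
}

// desiredConfigMap is the single place where the desired state is defined.
// The name and data key are illustrative, not the values used in the PR.
func desiredConfigMap(authConfigJSON string) *configMapApply {
	return &configMapApply{
		Name:      "auth-config",
		Namespace: "openshift-config-managed",
		Data:      map[string]string{"auth-config.json": authConfigJSON},
	}
}

// needsApply reports whether an Apply request would change anything.
// A nil existing value models a ConfigMap that does not exist yet.
func needsApply(existing, desired *configMapApply) bool {
	if existing == nil {
		return true
	}
	return !reflect.DeepEqual(existing, desired)
}

func main() {
	desired := desiredConfigMap(`{"jwt":[]}`)
	fmt.Println(needsApply(nil, desired))
	fmt.Println(needsApply(desiredConfigMap(`{"jwt":[]}`), desired))
}
```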

benluddy · 2024-11-20T15:47:41Z

pkg/controllers/externaloidc/externaloidc_controller.go

+			jwt.Issuer.CertificateAuthority = caData
+		}
+
+		switch provider.ClaimMappings.Username.PrefixPolicy {


What if a new valid option is added in the future? Can there be a period of time during an upgrade from N to N+1 where there is a cluster-authentication-operator at N and a CRD at N+1? I would include a default case to avoid all doubt.
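A minimal sketch of the suggested default case. The `prefixPolicy` type and its constants are local stand-ins that are assumed to mirror the `configv1` prefix policy values; the function name is hypothetical. An operator at version N that meets a value introduced at N+1 then fails loudly instead of silently rendering a config with no prefix handling.

```go
package main

import "fmt"

// prefixPolicy is a local stand-in for the username prefix policy enum.
type prefixPolicy string

const (
	policyNoOpinion prefixPolicy = ""
	policyNoPrefix  prefixPolicy = "NoPrefix"
	policyPrefix    prefixPolicy = "Prefix"
)

// checkPrefixPolicy accepts the known values and rejects everything else.
func checkPrefixPolicy(p prefixPolicy) error {
	switch p {
	case policyNoOpinion, policyNoPrefix, policyPrefix:
		return nil
	default:
		// Reached when the CRD allows a value this binary does not know yet.
		return fmt.Errorf("unsupported username prefix policy %q", p)
	}
}

func main() {
	fmt.Println(checkPrefixPolicy(policyPrefix))
	fmt.Println(checkPrefixPolicy(prefixPolicy("SomeFuturePolicy")))
}
```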

benluddy · 2024-11-20T15:51:48Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	// TODO currently validations from k8s.io/apiserver/pkg/apis/apiserver/validation cannot be used here
+	// since they aren't defined for the beta type; once the feature goes out of beta, we should replace
+	// this func with the upstream validations (but keep CA cert validation)


They're defined for the internal type, would it be easier to convert to internal and use those? This seems to be how it is done for unserved APIs like the apiserver configuration files, I don't see any external-versioned validations there.

Also, what happens if there is a bug in the validation used by cluster-authentication-operator that causes the validation to be overly strict? Even if this changes to use the validations from k8s.io/apiserver, this can be out of sync with whatever a particular kube-apiserver was compiled against. This is a form of client-side validation, which we have been trying to move away from. Revisioned kube-apiserver rollouts should mitigate the risk of writing an invalid config here, right?

Alright, I see the point in keeping validations at the server-side, especially with revisioned rollouts being in place.

However, the KAS pods do not really validate the CA cert, if specified. If the CA cert is not the correct one, the KAS pods will log an error but will not crash, so the rollout will be completed correctly. Therefore I'm considering keeping the CA cert validation at the CAO side, but dropping the rest. What do you think?

Pushed a fixup 20a1f72 that demonstrates what I described above. Placing a hold until this gets squashed or dropped.

/hold
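To illustrate the narrower check discussed here, this is a minimal sketch of a CA bundle sanity check with the standard library. It only verifies that the bundle contains at least one parsable PEM certificate; it does not contact the provider or confirm that the CA actually signed the issuer's serving certificate, which the validation in the PR may do differently. The function name is hypothetical.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
)

// validateCABundle returns an error if pemData is empty or holds no
// certificate that x509 can parse.
func validateCABundle(pemData []byte) error {
	if len(pemData) == 0 {
		return errors.New("CA bundle is empty")
	}
	pool := x509.NewCertPool()
	// AppendCertsFromPEM reports whether any certificate was parsed.
	if !pool.AppendCertsFromPEM(pemData) {
		return errors.New("CA bundle contains no parsable PEM certificates")
	}
	return nil
}

func main() {
	fmt.Println(validateCABundle([]byte("not a certificate")))
}
```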

xingxingxia · 2024-11-22T09:49:37Z

From a test-result perspective, based on the good pre-merge test results in https://issues.redhat.com/browse/OCPBUGS-44592?focusedId=26134688&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-26134688 , adding the label below:
/label qe-approved

liouk · 2024-11-22T11:04:53Z

/retest

liouk · 2025-01-28T12:33:53Z

Note to self: commit 57c8bbc needs to be split, and controller functionality to be merged back on f8f5973.

openshift-ci · 2025-01-28T12:35:02Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ibihim, liouk
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

everettraven

Overall looks good. Have a handful of new comments

everettraven · 2025-01-30T15:21:40Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	}
+
+	return nil
+


Nit: unnecessary newline

everettraven · 2025-01-30T15:32:50Z

pkg/controllers/externaloidc/externaloidc_controller.go

+		case configv1.NoPrefix:
+			jwt.ClaimMappings.Username.Prefix = ptr.To("-")


Should the prefix actually be set to "-" in this case? Looking at https://github.com/kubernetes/apiserver/blob/c09fadd4305dde0b8861d8a7595690800a4a0db0/pkg/apis/apiserver/v1beta1/types.go#L332-L333, it seems that "-" signalled no prefix only when using the flags, and the equivalent mapping in the structured authentication config should be ""?

This is actually a great find. You're right -- we must translate NoPrefix to "". But that's not the only issue; NoOpinion also needs its own prefix logic.

From here:

--oidc-username-prefix="" and --oidc-username-claim != "email", prefix was "<value of --oidc-issuer-url>#". For the same behavior using authentication config, set username.prefix="<value of issuer.url>#"

Therefore, NoOpinion means:

set username.prefix = "", if username.claim == "email"

set username.prefix = "<issuer.URL>#" otherwise

This has been accounted for in our authentication config API (and here).

This seems to be an issue in hypershift as well; will look further into that.

Pushed a fix for this in 5fecaf6.
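The mapping described in this thread can be sketched as below. This is an illustration of the rules quoted above, not the code from the fix commit: the `prefixPolicy` type, its constants, and `usernamePrefix` are hypothetical local stand-ins for the `configv1` values and the controller logic.

```go
package main

import "fmt"

// prefixPolicy is a local stand-in for the username prefix policy enum.
type prefixPolicy string

const (
	policyNoOpinion prefixPolicy = ""
	policyNoPrefix  prefixPolicy = "NoPrefix"
	policyPrefix    prefixPolicy = "Prefix"
)

// usernamePrefix returns the prefix to put into the structured auth config.
func usernamePrefix(policy prefixPolicy, claim, issuerURL, prefixString string) (string, error) {
	switch policy {
	case policyNoPrefix:
		// Structured config uses "" for no prefix; "-" was the flag convention.
		return "", nil
	case policyPrefix:
		return prefixString, nil
	case policyNoOpinion:
		// Matches the legacy flag default: no prefix for the email claim,
		// "<issuer URL>#" for any other claim.
		if claim == "email" {
			return "", nil
		}
		return issuerURL + "#", nil
	default:
		return "", fmt.Errorf("unsupported username prefix policy %q", policy)
	}
}

func main() {
	p, _ := usernamePrefix(policyNoOpinion, "sub", "https://issuer.example.com", "")
	fmt.Printf("%q\n", p)
}
```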

everettraven · 2025-01-30T15:42:07Z

pkg/controllers/externaloidc/externaloidc_controller.go

+			},
+		}
+
+		if len(provider.Issuer.Audiences) > 0 {


Is this if block necessary?

It looks like jwt.Issuer.Audiences is required, and provider.Issuer.Audiences is required with a minimum of 1 item, so it should be safe to drop this conditional and just run the logic that is inside the conditional block.

Since there's a for loop that appends the strings, it's not needed anyways. Will remove!

everettraven · 2025-01-30T15:49:35Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	if err := validateAuthConfig(*authConfig); err != nil {
+		return fmt.Errorf("auth config validation failed: %v", err)
+	}


Should this be done prior to generating the ApplyConfiguration?

As discussed offline, we decided to keep this sequence because the validation performs a network call to the provider to fetch the server certificates, which we can skip if the two configs are equivalent.
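A minimal sketch of that ordering, with hypothetical names throughout: the cheap equality check runs first, and the expensive validation (standing in for the network call) runs only when the desired config differs from what is already stored.

```go
package main

import "fmt"

// syncer is a hypothetical, in-memory stand-in for the controller state.
type syncer struct {
	stored     string                 // config currently stored in the ConfigMap
	hasStored  bool                   // false until the first successful apply
	validate   func(cfg string) error // expensive check, e.g. one that dials the provider
	applyCalls int                    // number of writes performed
}

// sync skips both validation and the write when nothing changed.
func (s *syncer) sync(desired string) error {
	if s.hasStored && s.stored == desired {
		return nil
	}
	if err := s.validate(desired); err != nil {
		return fmt.Errorf("auth config validation failed: %w", err)
	}
	s.stored, s.hasStored = desired, true
	s.applyCalls++
	return nil
}

func main() {
	validations := 0
	s := &syncer{validate: func(string) error { validations++; return nil }}
	// The second "v1" resync is a no-op and triggers no validation.
	for _, cfg := range []string{"v1", "v1", "v2"} {
		if err := s.sync(cfg); err != nil {
			fmt.Println(err)
		}
	}
	fmt.Println(validations, s.applyCalls)
}
```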

everettraven · 2025-01-30T15:54:49Z

pkg/controllers/externaloidc/externaloidc_controller.go

+	existingCM, err := c.configMapLister.ConfigMaps(managedNamespace).Get(targetAuthConfigCMName)
+	if err != nil && !apierrors.IsNotFound(err) {
+		return nil, fmt.Errorf("could not retrieve auth configmap %s/%s to check data before sync: %v", managedNamespace, targetAuthConfigCMName, err)
+	}
+
+	if existingCM != nil {
+		existingCMApplyConfig, err := corev1ac.ExtractConfigMap(existingCM, c.name)
+		if err != nil {
+			return nil, fmt.Errorf("could not extract ConfigMap apply configuration: %v", err)
+		}
+
+		if equality.Semantic.DeepEqual(existingCMApplyConfig.Data, expectedCMApplyConfig.Data) {
+			return nil, nil
+		}
+	}


Including the fetching of the existing ConfigMap and the equality check in getExpectedApplyConfig feels like it overloads this method a bit.

What do you think of separating the fetching of the existing ApplyConfiguration into a separate method and putting the equality check in the sync method?

Done in e8dfce7

everettraven · 2025-01-31T13:53:33Z

pkg/controllers/externaloidc/externaloidc_controller.go

+
+		case configv1.NoOpinion:
+			prefix := ""
+			if provider.ClaimMappings.Username.Claim == "email" {


If I understood #713 (comment) correctly, this should be set to provider.Issuer.URL + "#" if the username claim is not email:

Suggested change

-			if provider.ClaimMappings.Username.Claim == "email" {
+			if provider.ClaimMappings.Username.Claim != "email" {

openshift-ci · 2025-01-31T17:00:01Z

@liouk: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/test-operator-integration	`5fecaf6`	link	false	`/test test-operator-integration`
ci/prow/unit	`5fecaf6`	link	true	`/test unit`
ci/prow/okd-scos-e2e-aws-ovn	`5fecaf6`	link	false	`/test okd-scos-e2e-aws-ovn`
ci/prow/verify	`5fecaf6`	link	true	`/test verify`


openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 8, 2024

openshift-ci bot requested review from deads2k and ibihim October 8, 2024 12:52

liouk changed the title ~~AUTH-541: OIDC structured auth config~~ WIP: AUTH-541: OIDC structured auth config Oct 8, 2024

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 8, 2024

liouk force-pushed the oidc-config-structured-auth branch from 35b2d3d to 7e8ad90 Compare October 8, 2024 13:31

liouk changed the title ~~WIP: AUTH-541: OIDC structured auth config~~ AUTH-541: OIDC structured auth config Oct 8, 2024

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 8, 2024

liouk force-pushed the oidc-config-structured-auth branch from 7e8ad90 to 31e7cc5 Compare October 10, 2024 13:13

liouk changed the title ~~AUTH-541: OIDC structured auth config~~ WIP: AUTH-541: OIDC structured auth config Oct 10, 2024

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 10, 2024

liouk force-pushed the oidc-config-structured-auth branch 5 times, most recently from f066dae to c4f822c Compare October 17, 2024 13:00

openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 18, 2024

liouk force-pushed the oidc-config-structured-auth branch 2 times, most recently from 8ca7a87 to 36db406 Compare October 22, 2024 10:24

openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 22, 2024

liouk force-pushed the oidc-config-structured-auth branch 5 times, most recently from 81af1fc to 46663cd Compare October 28, 2024 09:06

liouk changed the title ~~WIP: AUTH-541: OIDC structured auth config~~ AUTH-541: OIDC structured auth config Oct 28, 2024

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 28, 2024

liouk force-pushed the oidc-config-structured-auth branch from bca8702 to 5457a34 Compare November 19, 2024 17:09

openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 19, 2024

liouk force-pushed the oidc-config-structured-auth branch from 5457a34 to a11dfa2 Compare November 19, 2024 17:55

benluddy reviewed Nov 20, 2024

operator: start externaloidc controller behind a featuregates accessor

57c8bbc

liouk force-pushed the oidc-config-structured-auth branch from a11dfa2 to 20a1f72 Compare November 21, 2024 14:08

openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 21, 2024

fixup! controllers: add controller to render structured auth config f…

08155bf

…or OIDC

liouk force-pushed the oidc-config-structured-auth branch from 20a1f72 to 08155bf Compare November 21, 2024 14:16

openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Nov 22, 2024

liouk mentioned this pull request Nov 22, 2024

WIP: AUTH-543: OIDC/OAuth resource configuration #740

Open

everettraven reviewed Jan 27, 2025

fixup! controllers: add controller to render structured auth config f…

ad9dd27

…or OIDC Changes as per review from everettraven.

liouk added 2 commits January 29, 2025 10:39

fixup! controllers: add controller to render structured auth config f…

d51fcd3

…or OIDC Changes as per review from everettraven.

fixup! controllers: add controller to render structured auth config f…

4a8b44d

…or OIDC Changes as per review from everettraven.

everettraven reviewed Jan 30, 2025

liouk added 2 commits January 31, 2025 13:37

fixup! controllers: add controller to render structured auth config f…

e8dfce7

…or OIDC Changes as per review from everettraven.

fixup! controllers: add controller to render structured auth config f…

5fecaf6

…or OIDC Changes as per review from everettraven.

everettraven reviewed Jan 31, 2025
