
Populate the status for direct connections and ensure Kafka and SR proxies use credentials in headers #169

Merged
merged 2 commits into main from direct-connection-status
Nov 19, 2024

Conversation

Contributor

@rhauch rhauch commented Nov 15, 2024

Resolves #123

Summary of Changes

Completes the functionality of direct connections, though we will add more integration tests as follow-ons. For example, the existing ITs use LocalTestEnvironment that runs containers without authN, and the recently-added ITs for direct connections use no credentials. We'll need another ConfluentPlatformTestEnvironment that runs CP clusters with several authN mechanisms enabled, and then we'll run additional tests against that.

Any additional details or context that should be provided?

This PR modifies the DirectConnectionState class to properly populate the status for direct connections, after verifying the ability to connect to Kafka using an AdminClient and to SR using an SR client.

It also fixes a few issues with how we compute the authentication-related headers using the credentials. These headers are used in the Kafka REST proxy and SR proxy implementations.

It adds some debug log messages to the KafkaProducerClients, KafkaConsumerFactory, SchemaRegistryClients and AdminClients beans, including logging at debug level the (redacted) configuration properties for these clients. To make it easier to log the redacted configurations, the ClientConfigurator class was changed to return a Configuration object rather than a Map<String, Object>. The Configuration object has one method to get the configuration properties as a map, while the toString() and other methods ensure that only the redacted form of the configuration is otherwise exposed.
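The Configuration wrapper itself isn't shown in this description, so here is a rough, hypothetical sketch of the idea; the class name RedactingConfiguration and every detail below are illustrative assumptions, not the actual ide-sidecar code. The point is a holder whose toString() only ever exposes the redacted form, while asMap() returns the real values for the clients.

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: configuration holder whose toString() masks secrets,
// while asMap() returns the non-redacted properties for the actual client.
public final class RedactingConfiguration {
  private static final String MASK = "********";

  private final Map<String, Object> properties;
  private final Set<String> secretKeys;

  public RedactingConfiguration(Map<String, Object> properties, Set<String> secretKeys) {
    this.properties = Map.copyOf(properties);
    this.secretKeys = Set.copyOf(secretKeys);
  }

  /** The non-redacted properties, suitable for passing to a client factory. */
  public Map<String, Object> asMap() {
    return properties;
  }

  /** Only the redacted form is exposed here, so logging the object is safe. */
  @Override
  public String toString() {
    return properties.entrySet().stream()
        .map(e -> e.getKey() + "=" + (secretKeys.contains(e.getKey()) ? MASK : e.getValue()))
        .sorted()
        .collect(Collectors.joining(", ", "{", "}"));
  }

  public static void main(String[] args) {
    var config = new RedactingConfiguration(
        Map.of("bootstrap.servers", "localhost:9092", "sasl.jaas.config", "fake-jaas-secret"),
        Set.of("sasl.jaas.config"));
    System.out.println(config);                              // secrets masked
    System.out.println(config.asMap().get("bootstrap.servers"));
  }
}
```

With this shape, passing the object straight to a debug log call is safe by construction, while client creation uses asMap().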

Finally, fixed a bug in BasicCredentials and ApiKeyAndSecret classes in the httpClientHeaders() implementations. This method returns the authorization-related HTTP headers, and the base64-encoded value was not being created correctly. A few unit tests were added, using obviously-fake secret values.

Pull request checklist

Please check if your PR fulfills the following (if applicable):

  • Tests:
    • Added new
    • Updated existing
    • Deleted existing
  • Have you validated this change locally against a running instance of the Quarkus dev server?
    make quarkus-dev
  • Have you validated this change against a locally running native executable?
    make mvn-package-native && ./target/ide-sidecar-*-runner

@rhauch rhauch requested a review from a team as a code owner November 15, 2024 23:30
@rhauch rhauch requested a review from a team November 15, 2024 23:30
@rhauch rhauch self-assigned this Nov 15, 2024
Contributor Author

@rhauch rhauch left a comment

A few comments to help reviewers.

Comment on lines +94 to +105
ScramServerCallbackHandler.class,
// Schema Registry client classes that are not registered in
// https://github.com/quarkusio/quarkus/blob/3.16.3/extensions/schema-registry/confluent/common/deployment/src/main/java/io/quarkus/confluent/registry/common/ConfluentRegistryClientProcessor.java
Mode.class,
ExtendedSchema.class,
Rule.class,
RuleKind.class,
RuleMode.class,
RuleSet.class,
SchemaEntity.class,
SchemaTags.class,
SchemaRegistryServerVersion.class,
Contributor Author

The Quarkus library for Confluent SR does not register all of the classes we will (or might) need at runtime, so we do this here. Most of these are related to data contracts, which were a more recent addition to SR.

Comment on lines +38 to +46
var config = configurator.getAdminClientConfig(connectionId, clusterId);
Log.debugf(
    "Creating admin client for connection %s and cluster %s with configuration:\n  %s",
    connectionId,
    clusterId,
    config
);
// Create the admin client
- return AdminClient.create(config);
+ return AdminClient.create(config.asMap());
Contributor Author

Changing the ClientConfigurator.get*(...) methods to return a Configuration object makes it much easier to get the configuration once, use the redacted form in log messages, and pass the non-redacted form to the client.

String connectionId,
MultiMap headers,
Contributor Author

The places where this method is used know whether they're using the REST client for Kafka or for SR. Since each type of cluster needs different credentials and therefore different authN-related headers, it's easier for the caller to compute the appropriate headers and simply pass them into this method.

Comment on lines -62 to +74
- return switch (connectionType) {
-   case CCLOUD -> clusterType == ClusterType.KAFKA
-       ? confluentCloudKafkaClusterStrategy : confluentCloudSchemaRegistryClusterStrategy;
-   case LOCAL -> clusterType == ClusterType.KAFKA
-       ? confluentLocalKafkaClusterStrategy : confluentLocalSchemaRegistryClusterStrategy;
-   case DIRECT -> clusterType == ClusterType.KAFKA
-       ? directKafkaClusterStrategy : directSchemaRegistryClusterStrategy;
-   case PLATFORM -> null;
+ return switch (clusterType) {
+   case KAFKA -> switch (connectionType) {
+     case CCLOUD -> confluentCloudKafkaClusterStrategy;
+     case LOCAL -> confluentLocalKafkaClusterStrategy;
+     case DIRECT -> directKafkaClusterStrategy;
+     case PLATFORM -> null;
+   };
+   case SCHEMA_REGISTRY -> switch (connectionType) {
+     case CCLOUD -> confluentCloudSchemaRegistryClusterStrategy;
+     case LOCAL -> confluentLocalSchemaRegistryClusterStrategy;
+     case DIRECT -> directSchemaRegistryClusterStrategy;
+     case PLATFORM -> null;
+   };
Contributor Author

This just refactors the logic to be a bit easier to read and to debug (and set breakpoints in).

Comment on lines +16 to +37
/**
 * Constructs the headers for the proxied request, adding the authentication headers from the
 * credentials and the `target-sr-cluster` header set to the connection's SR cluster ID.
* @param context the context of the proxy request
* @return the headers to be used in the proxy request to the Schema Registry
*/
@Override
public MultiMap constructProxyHeaders(ClusterProxyContext context) {
var headers = super.constructProxyHeaders(context);
if (context.getConnectionState() instanceof DirectConnectionState directConnectionState) {
var srConfig = directConnectionState.getSpec().schemaRegistryConfig();
if (srConfig != null) {
var credentials = srConfig.credentials();
if (credentials != null) {
credentials.httpClientHeaders().ifPresent(map -> map.forEach(headers::add));
}
}
}
headers.add(TARGET_SR_CLUSTER_HEADER, context.getClusterId());

return headers;
}
Contributor Author

This is pretty much the same as CCloudSchemaRegistryClusterStrategy but the credentials are accessed differently.

- var value = "%s:%s".formatted(key, secret.asCharArray());
+ var value = "%s:%s".formatted(key, secret.asString());
Contributor Author

This bug is super subtle: the toString() of the char array is not the same as a String created from the char array. Added a unit test for this.
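The subtlety can be reproduced in a few lines. This is an illustrative standalone demo with obviously-fake values, not the PR's actual unit test: formatting a char[] with %s invokes Object.toString(), which yields an array identity string like "[C@1b6d3586" instead of the secret's characters, so the base64-encoded header was built from the wrong value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Demo of the bug: a char[] formatted with %s does NOT produce its characters.
public class CharArrayFormattingDemo {
  public static void main(String[] args) {
    var key = "fake-api-key";
    char[] secret = "fake-secret".toCharArray();

    var buggy = "%s:%s".formatted(key, secret);              // wrong: array identity string
    var fixed = "%s:%s".formatted(key, new String(secret));  // right: the actual characters

    System.out.println(buggy.equals(fixed));  // false
    System.out.println("Basic " + Base64.getEncoder()
        .encodeToString(fixed.getBytes(StandardCharsets.UTF_8)));
  }
}
```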

- return new String(this.asCharArray());
+ return new String(raw);
Contributor Author

The call to this.asCharArray() makes a copy of the raw array, and then the String constructor makes another copy. This change avoids the extra copy.

Contributor

@flippingbits flippingbits left a comment

Thanks, @rhauch! I have one question about the handling of timeouts when retrieving the health status of Direct Connections and a few very minor suggestions/questions. Otherwise, your PR looks good to me. 🎉

.flatMap(creds -> creds.kafkaClientProperties(options))
.ifPresent(props::putAll);

// Add any auth properties for Schema Registry to the Kafka client config,
// with the "schema.registry." prefix (unless the property already starts with that)
if (sr != null) {
var additional = getSchemaRegistryClientConfig(connection, sr, redact);
if (srUri != null) {
Contributor

Should we also verify that srId is not null?

Contributor Author

srId can be null at this point. I've clarified the JavaDoc of the method called below, and removed it from the two ConnectionState methods.

.state(ConnectedState.SUCCESS)
.build()
);
} catch (Exception e) {
Contributor

Would it make sense to log the exception?

Contributor Author

@rhauch rhauch Nov 18, 2024

I was thinking no, because the message is passed to the status error, and because it is an expected situation when invalid credentials are supplied. (We don't want users thinking that, because the sidecar logged an exception and stack trace, there is a bug within the sidecar causing the failure to connect.)

I could add a debug log message, though users would never see that, and it's not clear it would help us when trying to debug an auth failure. So I'm inclined to not add it.

Contributor

I thought having access to the exception name and stack trace could be useful for debugging connectivity/auth issues. If the exception message is sufficient, we can keep it as it is. If we decide to log the exception, we should use at least the INFO level so that it shows up in the log, as you pointed out.

.builder()
.state(ConnectedState.FAILED)
.errors(
new AuthErrors().withSignIn(
Contributor

I'd suggest putting this error under authStatusCheck because it occurred when checking the status of the direct connection. WDYT?

Suggested change:
- new AuthErrors().withSignIn(
+ new AuthErrors().withAuthStatusCheck(

Contributor Author

The whole point of #162 is to deprecate and move away from the status.authentication object, since it only applies to CCloud. The CCloud-related status is available under a nested status.ccloud object (structurally identical to status.authentication), while direct connection errors (including auth-related errors) are recorded under status.kafka_cluster or status.schema_registry (depending upon where the error occurred).

// and describing the cluster.
try (var adminClient = createAdminClient(kafkaConfig)) {
var clusterDesc = adminClient.describeCluster();
var actualClusterId = clusterDesc.clusterId().get(5, TimeUnit.SECONDS);
Contributor

Should we make such timeouts configurable via the application.yml or env vars? It would allow the extension to overwrite/change them.

Contributor Author

I don't anticipate the extension needing to change this, but I can add a configuration. How about a ide-sidecar.connections.direct.timeout-seconds property?

}
// There is a Schema Registry configuration, so validate the connection by creating a
// SchemaRegistryClient and getting the mode.
try (var srClient = createSchemaRegistryClient(schemaRegistryConfig)) {
Contributor

@flippingbits flippingbits Nov 18, 2024

Do you know if createSchemaRegistryClient (or the constructor of CachedSchemaRegistryClient) uses a timeout when creating the SR client? If not, we'd probably wait a long time when running into network issues, so we want to set a timeout.

Alternatively, we could add timeouts to the futures in lines 84 and 85, and let them fail if they exceed the timeout. If we go down that road, we'd have to build the FAILED state in the method getConnectionStatus().

I'm leaning towards the latter option. What's your take on it?
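The latter option, failing the futures themselves after a deadline, can be sketched with CompletableFuture.orTimeout. Everything below (the class, the helper name checkWithTimeout, and the stand-in futures) is a hypothetical illustration rather than the sidecar's actual code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: bound a connectivity-check future with a deadline so a hung network
// call fails fast and can be turned into a FAILED connection status.
public class FutureTimeoutDemo {

  static String checkWithTimeout(CompletableFuture<String> check, long millis) {
    try {
      return check.orTimeout(millis, TimeUnit.MILLISECONDS).get();
    } catch (ExecutionException e) {
      // This is the spot where a FAILED state would be built from the cause
      return e.getCause() instanceof TimeoutException
          ? "FAILED: timed out"
          : "FAILED: " + e.getCause();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return "FAILED: interrupted";
    }
  }

  public static void main(String[] args) {
    // A future that never completes stands in for a hung SR "get mode" call
    System.out.println(checkWithTimeout(new CompletableFuture<>(), 200));
    System.out.println(checkWithTimeout(CompletableFuture.completedFuture("READWRITE"), 200));
  }
}
```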

Contributor Author

The default timeout for all REST API calls made by the CachedSchemaRegistryClient is 30 seconds. It's configurable, so I can just reuse the same timeout value constant discussed above.

credentials.httpClientHeaders().ifPresent(map -> map.forEach(headers::add));
}
}
}
Contributor

Should we throw an exception or log an error if context.getConnectionState() is not a DirectConnectionState?

Contributor Author

We're not doing that in ConfluentCloudSchemaRegistryClusterStrategy, and I expect that's the case because ClusterStrategyProcessor is always using it correctly and therefore it should never occur at runtime.

That means that throwing an exception will only help during development, but maybe that's reason enough.

Contributor Author

Actually, if we want to add that, I think we should do that in a followup, since we'd also want to add it in multiple other places.

Contributor

Doing it in a follow-up sounds good to me. I wanted to make sure that we don't swallow any errors.

Contributor Author

rhauch commented Nov 18, 2024

Thanks for the review, @flippingbits. I've incorporated your feedback -- would you please take a look?

@flippingbits flippingbits self-requested a review November 18, 2024 21:13
Contributor

@flippingbits flippingbits left a comment

Thanks for the update, @rhauch. Your PR looks good to me.

@rhauch rhauch merged commit 1cdbd92 into main Nov 19, 2024
1 check passed
@rhauch rhauch deleted the direct-connection-status branch November 19, 2024 00:13

Successfully merging this pull request may close these issues.

Update connection status to work for local, CCloud and direct connections