[ML] Adding dynamic filtering for EIS configuration #120235

jonathan-buttner · 2025-01-15T20:15:38Z

WIP

This PR adds the ability to determine, at node startup, which models and task types the cluster will support.

This is my suggestion for the format of the response from the gateway:

GET /allowed-models
{
  "allowed-models": [
    {
      "model-name": "model-a",
      "task-types": ["text_embedding", "chat_completion"]
    },
    ...
  ]
}

My reasoning for using a list instead of a single entry is that OpenAI's GPT-4o supports completions and image generation, which I'm guessing would be two separate task types for us in the future. So it's best to allow multiple entries here.
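To make the proposed shape concrete, here is a minimal sketch of how the response could be modeled once parsed. Everything in it is an assumption for illustration: `AllowedModelsSketch`, `AllowedModel`, `AllowedModelsResponse`, and `SketchTaskType` are hypothetical names, not classes from this PR, and the real `TaskType` enum has more values.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class AllowedModelsSketch {
    // Hypothetical stand-in for the real TaskType enum.
    enum SketchTaskType { TEXT_EMBEDDING, COMPLETION, CHAT_COMPLETION }

    // One entry of the "allowed-models" array: a model name plus the task types it supports.
    record AllowedModel(String modelName, Set<SketchTaskType> taskTypes) {}

    // The whole response body.
    record AllowedModelsResponse(List<AllowedModel> allowedModels) {
        // Union of the task types across all allowed models.
        EnumSet<SketchTaskType> enabledTaskTypes() {
            EnumSet<SketchTaskType> enabled = EnumSet.noneOf(SketchTaskType.class);
            for (AllowedModel model : allowedModels) {
                enabled.addAll(model.taskTypes());
            }
            return enabled;
        }
    }
}
```

Keeping `task-types` as a collection per model means a model that gains a second task type later only needs a longer array, not a schema change.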

jonathan-buttner · 2025-01-15T20:16:06Z

server/src/main/java/org/elasticsearch/inference/InferenceService.java

@@ -78,8 +78,8 @@ default void init(Client client) {}
     * Whether this service should be hidden from the API. Should be used for services
     * that are not ready to be used.
     */
-    default Boolean hideFromConfigurationApi() {
-        return Boolean.FALSE;
+    default boolean hideFromConfigurationApi() {


Some refactoring: I think we can use a primitive here, since I don't believe we'll ever want to return null.

jonathan-buttner · 2025-01-15T20:17:27Z

server/src/main/java/org/elasticsearch/inference/InferenceServiceConfiguration.java

-            List<String> taskTypes = (ArrayList<String>) args[2];
-            return new InferenceServiceConfiguration.Builder().setService((String) args[0])
-                .setName((String) args[1])
-                .setTaskTypes(EnumSet.copyOf(taskTypes.stream().map(TaskType::fromString).collect(Collectors.toList())))


EnumSet.copyOf throws an IllegalArgumentException if it receives an empty collection, so I modified this to allow an empty set via the builder. An empty set should be unlikely in production, because we shouldn't be getting the configuration at all if no task types are supported, but it helps the tests.
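For reference, a small sketch of the behavior described above and one way around it. `EnumSetCopySketch` and `SketchTaskType` are hypothetical names used only for illustration; the actual change in this PR goes through the builder.

```java
import java.util.Collection;
import java.util.EnumSet;

final class EnumSetCopySketch {
    // Hypothetical stand-in for the real TaskType enum.
    enum SketchTaskType { TEXT_EMBEDDING, COMPLETION }

    // Safe for empty input: start from an empty EnumSet and add everything to it.
    static EnumSet<SketchTaskType> toEnumSet(Collection<SketchTaskType> values) {
        EnumSet<SketchTaskType> result = EnumSet.noneOf(SketchTaskType.class);
        result.addAll(values);
        return result;
    }

    // Returns true if EnumSet.copyOf rejects the given collection.
    static boolean copyOfThrows(Collection<SketchTaskType> values) {
        try {
            EnumSet.copyOf(values);
            return false;
        } catch (IllegalArgumentException e) {
            // Thrown when the collection is empty and is not itself an EnumSet,
            // because the element type cannot be inferred.
            return true;
        }
    }
}
```

With an empty `List`, `copyOfThrows` returns true while `toEnumSet` returns an empty set, which is the case the tests need.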

jonathan-buttner · 2025-01-15T20:18:23Z

x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferencePlugin.java

@@ -274,7 +277,12 @@ public Collection<?> createComponents(PluginServices services) {

            ElasticInferenceServiceSettings inferenceServiceSettings = new ElasticInferenceServiceSettings(settings);
            String elasticInferenceUrl = this.getElasticInferenceServiceUrl(inferenceServiceSettings);
-            elasticInferenceServiceComponents.set(new ElasticInferenceServiceComponents(elasticInferenceUrl));
+            elasticInferenceServiceComponents.set(


@brendan-jugan-elastic this is where we'll need the logic to retrieve the actual enabled models and task types from the EIS gateway.

jonathan-buttner · 2025-01-15T20:18:52Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

+        var enabledStreamingTaskTypes = EnumSet.of(TaskType.COMPLETION);
+        enabledStreamingTaskTypes.retainAll(enabledTaskTypes);
+
+        if (enabledStreamingTaskTypes.isEmpty() == false) {


If there are no enabled task types, we won't add any, since we don't want to support anything.
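The intersection in the diff above can be sketched in isolation like this. The class, the enum, and the `STREAMING_CAPABLE` constant are hypothetical and only mirror the `EnumSet.of(TaskType.COMPLETION)` / `retainAll` pattern from the hunk.

```java
import java.util.EnumSet;

final class StreamingTaskTypesSketch {
    // Hypothetical stand-in for the real TaskType enum.
    enum SketchTaskType { SPARSE_EMBEDDING, COMPLETION, CHAT_COMPLETION }

    // Task types that could be streamed if they were enabled (assumed for this sketch).
    private static final EnumSet<SketchTaskType> STREAMING_CAPABLE = EnumSet.of(SketchTaskType.COMPLETION);

    // Intersects the streaming-capable task types with the enabled ones.
    static EnumSet<SketchTaskType> enabledStreaming(EnumSet<SketchTaskType> enabledTaskTypes) {
        // Copy first so the shared constant is never mutated by retainAll.
        EnumSet<SketchTaskType> result = EnumSet.copyOf(STREAMING_CAPABLE);
        result.retainAll(enabledTaskTypes);
        return result;
    }
}
```

An empty result means nothing should be registered for streaming, which matches the `isEmpty() == false` guard in the diff.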

jonathan-buttner · 2025-01-15T20:19:43Z

...rc/main/java/org/elasticsearch/xpack/inference/services/elastic/ElasticInferenceService.java

        }

-        private static final LazyInitializable<InferenceServiceConfiguration, RuntimeException> configuration = new LazyInitializable<>(
-            () -> {
+        private LazyInitializable<InferenceServiceConfiguration, RuntimeException> initConfiguration() {


Removing static here because this depends on a field initialized in the constructor.
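A rough sketch of why the lazy value has to be per instance once it reads constructor state. This does not use the real `LazyInitializable`; `PerInstanceLazySketch`, `memoize`, and the string "configuration" are hypothetical placeholders.

```java
import java.util.EnumSet;
import java.util.function.Supplier;

final class PerInstanceLazySketch {
    // Hypothetical stand-in for the real TaskType enum.
    enum SketchTaskType { SPARSE_EMBEDDING, COMPLETION }

    private final EnumSet<SketchTaskType> enabledTaskTypes;
    // Not static: the computed value depends on enabledTaskTypes, which is set per instance.
    private final Supplier<String> configuration;

    PerInstanceLazySketch(EnumSet<SketchTaskType> enabledTaskTypes) {
        this.enabledTaskTypes = EnumSet.copyOf(enabledTaskTypes);
        this.configuration = memoize(() -> "task_types=" + this.enabledTaskTypes);
    }

    String configuration() {
        return configuration.get();
    }

    // Computes the value on first access and caches it afterwards.
    private static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;

            @Override
            public synchronized T get() {
                if (computed == false) {
                    value = delegate.get();
                    computed = true;
                }
                return value;
            }
        };
    }
}
```

A static lazy field would be initialized once for the class and could not see the constructor argument, so two services with different enabled task types would end up sharing one configuration.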

elasticsearchmachine · 2025-01-16T01:08:57Z

Pinging @elastic/ml-core (Team:ML)

jonathan-buttner · 2025-01-17T22:49:32Z

...erence/src/main/java/org/elasticsearch/xpack/inference/external/http/sender/RequestTask.java

@@ -38,44 +30,13 @@ class RequestTask implements RejectableTask {
        ActionListener<InferenceServiceResults> listener
    ) {
        this.requestCreator = Objects.requireNonNull(requestCreator);
-        this.listener = getListener(Objects.requireNonNull(listener), timeout, Objects.requireNonNull(threadPool));
+        this.timedListener = new TimedListener<>(timeout, listener, threadPool);


I moved this into TimedListener so we could access it in the new send method of HttpRequestSender.

jonathan-buttner · 2025-01-17T22:51:06Z

...arch/xpack/inference/external/response/elastic/ElasticInferenceServiceAclResponseEntity.java

+        private static EnumSet<TaskType> toTaskTypes(List<String> stringTaskTypes) {
+            var taskTypes = EnumSet.noneOf(TaskType.class);
+            for (String taskType : stringTaskTypes) {
+                taskTypes.add(TaskType.fromStringOrStatusException(taskType));


TODO: If a task type is invalid we should ignore it. That could result in an empty task_types array; if that happens, we should remove the model entry.
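One possible shape for that TODO, as a sketch only. It assumes the parsed response can be viewed as a map from model name to raw task type strings; `LenientAclParsingSketch`, `SketchTaskType`, and both method names are hypothetical, and the real code would use `TaskType` rather than `valueOf` on a stand-in enum.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LenientAclParsingSketch {
    // Hypothetical stand-in for the real TaskType enum.
    enum SketchTaskType { TEXT_EMBEDDING, COMPLETION, CHAT_COMPLETION }

    // Converts names to task types, silently skipping any name that is not recognized.
    static EnumSet<SketchTaskType> toTaskTypesLenient(List<String> names) {
        EnumSet<SketchTaskType> result = EnumSet.noneOf(SketchTaskType.class);
        for (String name : names) {
            try {
                result.add(SketchTaskType.valueOf(name.toUpperCase(Locale.ROOT)));
            } catch (IllegalArgumentException e) {
                // Unknown task type: ignore it instead of failing the whole response.
            }
        }
        return result;
    }

    // Drops model entries whose task types all turned out to be invalid.
    static Map<String, EnumSet<SketchTaskType>> filterModels(Map<String, List<String>> rawModels) {
        Map<String, EnumSet<SketchTaskType>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : rawModels.entrySet()) {
            EnumSet<SketchTaskType> taskTypes = toTaskTypesLenient(entry.getValue());
            if (taskTypes.isEmpty() == false) {
                result.put(entry.getKey(), taskTypes);
            }
        }
        return result;
    }
}
```

This keeps a gateway that knows about newer task types from breaking older nodes, at the cost of hiding typos in the response, so logging the skipped names may be worth considering.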

Functionality for filtering task types based on acl info for EIS

1ca8b32

jonathan-buttner added the >refactoring, :ml, Team:ML, auto-backport, v9.0.0, and v8.18.0 labels Jan 15, 2025

jonathan-buttner added 3 commits January 15, 2025 15:25

Merge branch 'main' of github.com:elastic/elasticsearch into inference-eis-acl

5c0b35d

Fixing compile and test errors

f7d978f

updating with chat_completion

bbf6693

jonathan-buttner marked this pull request as ready for review January 16, 2025 01:08

jonathan-buttner and others added 4 commits January 17, 2025 15:05

Adding acl call

5bc83f2

[CI] Auto commit changes from spotless

8945b93

working run

75bbf96

Merge branch 'inference-eis-acl' of github.com:elastic/elasticsearch into inference-eis-acl

72ac2b0
