[ML] Inference Service Refactoring results format #102429

jonathan-buttner · 2023-11-21T18:26:54Z

Specifies the results format for the inference plugin's services. This should match the results proposed here: elastic/elasticsearch-specification#2329

Notable changes

Created a new interface InferenceServiceResults that the formats implement to give us some more freedom (rather than using the ml plugin's InferenceResults)
Created a LegacyTextEmbeddingResults to represent the format of the OpenAI response when it used the InferenceResults
Created a TextEmbeddingResults that adheres to the new InferenceServiceResults interface
Created a SparseEmbeddingResults that adheres to the new InferenceServiceResults interface
The ELSER and Hugging Face services use SparseEmbeddingResults; OpenAI uses TextEmbeddingResults

LegacyTextEmbeddingResults

The LegacyTextEmbeddingResults is functionally identical to the new TextEmbeddingResults. The legacy one implements InferenceResults and the new one implements InferenceServiceResults. I wasn't sure whether it made sense to separate them or simply have one class, say TextEmbeddingResults, that implements both InferenceResults and InferenceServiceResults. I'm open to either approach. I figured it would be clearer that we're moving away from InferenceResults if the legacy class were marked as deprecated.
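To make the trade-off concrete, here is a minimal sketch of the two-class option. The two interfaces are empty, hypothetical stand-ins (the real ones declare methods that are not shown in this description), embeddings are simplified to nested lists of floats, and toLegacy is an illustrative helper rather than something taken from the PR.

```java
import java.util.List;

// Hypothetical stand-ins; the real interfaces declare methods omitted here.
interface InferenceResults {}
interface InferenceServiceResults {}

// Old format: kept only for callers that still expect InferenceResults.
@Deprecated
record LegacyTextEmbeddingResults(List<List<Float>> embeddings) implements InferenceResults {}

// New format: same data, new interface.
record TextEmbeddingResults(List<List<Float>> embeddings) implements InferenceServiceResults {
    // Illustrative helper (an assumption, not from the PR): convert for older callers.
    @SuppressWarnings("deprecation")
    LegacyTextEmbeddingResults toLegacy() {
        return new LegacyTextEmbeddingResults(embeddings);
    }
}
```

The single-class alternative would instead declare one record that implements both interfaces, at the cost of tying the new type to whatever InferenceResults requires.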

Format

TextEmbeddingResults

{
  "text_embedding": [
    {
      "embedding": [
        0.1
      ]
    },
    {
      "embedding": [
        0.2
      ]
    }
  ]
}
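A rough sketch of how a results object could render this shape. It builds the JSON string by hand to stay self-contained; the real class presumably writes XContent instead, and the Embedding record here is a simplified assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Simplified stand-in for a single embedding vector.
record Embedding(List<Float> values) {
    String toJson() {
        String joined = values.stream().map(String::valueOf).collect(Collectors.joining(","));
        return "{\"embedding\":[" + joined + "]}";
    }
}

// One entry per input text, wrapped under the "text_embedding" key.
record TextEmbeddingResults(List<Embedding> embeddings) {
    String toJson() {
        String items = embeddings.stream().map(Embedding::toJson).collect(Collectors.joining(","));
        return "{\"text_embedding\":[" + items + "]}";
    }
}

class TextEmbeddingExample {
    public static void main(String[] args) {
        var results = new TextEmbeddingResults(
            List.of(new Embedding(List.of(0.1f)), new Embedding(List.of(0.2f)))
        );
        // Prints {"text_embedding":[{"embedding":[0.1]},{"embedding":[0.2]}]}
        System.out.println(results.toJson());
    }
}
```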

SparseEmbeddingResults

{
  "sparse_embedding": {
    "is_truncated": false,
    "embedding": [
      {
        "token": 0.1
      }
    ]
  }
}
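The same idea for the sparse format, again as a hand-rolled sketch rather than the actual implementation. Token weights are modelled as a map for brevity (the real record appears to hold a list of embedding objects), and keys are not JSON-escaped.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Simplified assumption: a single is_truncated flag next to the token weights.
record SparseEmbeddingResults(Map<String, Float> tokenWeights, boolean isTruncated) {
    String toJson() {
        // Each token becomes its own single-key object, as in the format above.
        String tokens = tokenWeights.entrySet().stream()
            .map(e -> "{\"" + e.getKey() + "\":" + e.getValue() + "}")
            .collect(Collectors.joining(","));
        return "{\"sparse_embedding\":{\"is_truncated\":" + isTruncated
            + ",\"embedding\":[" + tokens + "]}}";
    }
}
```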


jonathan-buttner · 2023-11-21T18:39:54Z

...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/InferenceAction.java

@@ -183,45 +188,99 @@ public Request build() {

    public static class Response extends ActionResponse implements ToXContentObject {

-        private final List<? extends InferenceResults> results;
+        private final InferenceServiceResults results;


The changes in this class need a thorough review 😬

jonathan-buttner · 2023-11-21T18:43:32Z

...ence/src/main/java/org/elasticsearch/xpack/inference/results/LegacyTextEmbeddingResults.java

+ * @deprecated use {@link TextEmbeddingResults} instead
+ */
+@Deprecated
+public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults {


I think we could fold this into the TextEmbeddingResults class by simply making it implement InferenceResults as well. I wasn't sure whether that would be less clear, how it would affect adding optional fields in the future, or how it would affect us if the InferenceResults interface changes. I'm open to other ideas though.


jonathan-buttner · 2023-11-21T18:45:15Z

...ence/src/main/java/org/elasticsearch/xpack/inference/results/LegacyTextEmbeddingResults.java

+ */
+@Deprecated
+public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults {
+    public static final String NAME = "text_embedding_results";


This is the name that was used in the openai PR here

jonathan-buttner · 2023-11-21T18:45:59Z

...nference/src/main/java/org/elasticsearch/xpack/inference/results/SparseEmbeddingResults.java

+
+        for (InferenceResults result : results) {
+            if (result instanceof TextExpansionResults expansionResults) {
+                isTruncated |= expansionResults.isTruncated();


If for some reason only 1 of the results is truncated we'll mark them all as truncated.
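The effect of the |= accumulation in the diff can be seen in isolation with a small sketch. ExpansionResult is a hypothetical stand-in for TextExpansionResults, reduced to the one accessor that matters here.

```java
import java.util.List;

// Hypothetical stand-in for TextExpansionResults.
record ExpansionResult(boolean isTruncated) {}

class TruncationFlag {
    // Returns true as soon as any single result was truncated;
    // the per-result information is lost in the combined flag.
    static boolean anyTruncated(List<ExpansionResult> results) {
        boolean isTruncated = false;
        for (ExpansionResult result : results) {
            isTruncated |= result.isTruncated();
        }
        return isTruncated;
    }
}
```

With inputs of (false, true, false) the combined flag is true, which is the "mark them all as truncated" behaviour described above.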


jonathan-buttner · 2023-11-21T18:46:33Z

.../inference/src/main/java/org/elasticsearch/xpack/inference/results/TextEmbeddingResults.java

-    public static final String NAME = "text_embedding_results";
+public record TextEmbeddingResults(List<Embedding> embeddings) implements InferenceServiceResults {
+    // TODO: what should the name be here?
+    public static final String NAME = "text_embedding_results_v2";


Thoughts on what to use for the name?


jonathan-buttner · 2023-11-21T18:49:05Z

...nce/src/test/java/org/elasticsearch/xpack/inference/action/InferenceActionResponseTests.java

@@ -35,7 +36,12 @@ protected Writeable.Reader<InferenceAction.Response> instanceReader() {

    @Override
    protected InferenceAction.Response createTestInstance() {
-        return new InferenceAction.Response(List.of(TextExpansionResultsTests.createRandomResults()));
+        var result = switch (randomIntBetween(0, 1)) {


Are there other tests that we should have to get coverage over that if-block in InferenceAction.Response?


elasticsearchmachine · 2023-11-21T19:31:22Z

Pinging @elastic/ml-core (Team:ML)

davidkyle · 2023-11-22T10:31:06Z

@elasticmachine test this please

davidkyle · 2023-11-22T10:38:12Z

@elasticmachine test this please

davidkyle · 2023-11-22T11:00:07Z

...nference/src/main/java/org/elasticsearch/xpack/inference/results/SparseEmbeddingResults.java

+
+        for (InferenceResults result : results) {
+            if (result instanceof TextExpansionResults expansionResults) {
+                isTruncated |= expansionResults.isTruncated();


I thought the structure would be a list of objects each with is_truncated and embedding properties.

{
  "sparse_embedding": [
    {
      "is_truncated": false,
      "embedding": [
        { "token": 0.1 },
        ...
      ]
    },
    {
      "is_truncated": true,
      "embedding": [
        { "token": 2.0 },
        ...
      ]
    }
  ]
}

This matches the text_embedding result structure
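A sketch of what the per-item structure could look like in code, assuming one object per input text that carries its own truncation flag. The record names and the map-based token weights are illustrative assumptions, not the PR's classes.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// One entry per input text; each keeps its own is_truncated flag.
record SparseEmbedding(Map<String, Float> tokens, boolean isTruncated) {
    String toJson() {
        String body = tokens.entrySet().stream()
            .map(e -> "{\"" + e.getKey() + "\":" + e.getValue() + "}")
            .collect(Collectors.joining(","));
        return "{\"is_truncated\":" + isTruncated + ",\"embedding\":[" + body + "]}";
    }
}

// The top-level value is a list, mirroring the text_embedding layout.
record SparseEmbeddingList(List<SparseEmbedding> embeddings) {
    String toJson() {
        String items = embeddings.stream().map(SparseEmbedding::toJson).collect(Collectors.joining(","));
        return "{\"sparse_embedding\":[" + items + "]}";
    }
}
```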

davidkyle · 2023-11-22T11:14:57Z

.../inference/src/main/java/org/elasticsearch/xpack/inference/results/TextEmbeddingResults.java

-    public static final String NAME = "text_embedding_results";
+public record TextEmbeddingResults(List<Embedding> embeddings) implements InferenceServiceResults {
+    // TODO: what should the name be here?
+    public static final String NAME = "text_embedding_results_v2";


Suggested change

-    public static final String NAME = "text_embedding_results_v2";
+    public static final String NAME = "text_embedding_service_results";

🤷

davidkyle · 2023-11-22T11:21:04Z

...nference/src/main/java/org/elasticsearch/xpack/inference/results/SparseEmbeddingResults.java

+        this(in.readCollectionAsList(Embedding::new), in.readBoolean());
+    }
+
+    public static SparseEmbeddingResults create(List<? extends InferenceResults> results) {


Suggested change

-    public static SparseEmbeddingResults create(List<? extends InferenceResults> results) {
+    public static SparseEmbeddingResults of(List<? extends InferenceResults> results) {

of is more idiomatic in Java
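For comparison, the JDK's own static factories use this naming (List.of, Set.of, Optional.of). A minimal sketch of the same convention on a made-up record, Scores, which is purely illustrative:

```java
import java.util.List;

record Scores(List<Float> values) {
    // Static factory named like List.of / Optional.of; copies the input defensively.
    static Scores of(List<Float> values) {
        return new Scores(List.copyOf(values));
    }
}
```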

davidkyle · 2023-11-22T11:29:18Z

x-pack/plugin/inference/src/test/java/org/elasticsearch/xpack/inference/results/TestUtils.java

+
+public class TestUtils {
+
+    public static String toJsonString(ToXContentFragment entity) throws IOException {


Strings.toString handles ToXContentFragment and pretty printing; is there a reason not to use that?

https://github.com/elastic/elasticsearch/blob/main/server/src/main/java/org/elasticsearch/common/Strings.java#L803


davidkyle · 2023-11-22T12:02:08Z

...nce/src/test/java/org/elasticsearch/xpack/inference/action/InferenceActionResponseTests.java

@@ -35,7 +36,12 @@ protected Writeable.Reader<InferenceAction.Response> instanceReader() {

    @Override
    protected InferenceAction.Response createTestInstance() {
-        return new InferenceAction.Response(List.of(TextExpansionResultsTests.createRandomResults()));
+        var result = switch (randomIntBetween(0, 1)) {


The if-block can be tested explicitly if you use AbstractBWCWireSerializationTestCase as the test base class.

Implement the method mutateInstanceForVersion and you can simulate mixed version transport comms and assert on the expected output.

AbstractBWCWireSerializationTestCase is in xpack core and should be accessible here
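The idea behind that kind of test can be sketched without the Elasticsearch test framework. Everything below is a simplified assumption: plain java.io streams and an int version stand in for the real stream and transport version types, and Response is a toy record rather than InferenceAction.Response. The local mutateInstanceForVersion only mirrors the role of the method named above: it describes what an instance should look like after travelling to a node on an older version.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Toy response whose name field only exists on the wire from a given version on.
record Response(String resultsName, float value) {
    static final int NEW_FORMAT_VERSION = 2; // hypothetical version number
    static final String LEGACY_NAME = "legacy";

    void writeTo(DataOutputStream out, int version) throws IOException {
        if (version >= NEW_FORMAT_VERSION) {
            out.writeUTF(resultsName); // older nodes never see this field
        }
        out.writeFloat(value);
    }

    static Response readFrom(DataInputStream in, int version) throws IOException {
        String name = version >= NEW_FORMAT_VERSION ? in.readUTF() : LEGACY_NAME;
        return new Response(name, in.readFloat());
    }
}

class BwcSketch {
    // Expected shape of the instance after a round trip at the given version.
    static Response mutateInstanceForVersion(Response instance, int version) {
        return version >= Response.NEW_FORMAT_VERSION
            ? instance
            : new Response(Response.LEGACY_NAME, instance.value());
    }

    // Serialize and deserialize at one wire version, as a mixed-version cluster would.
    static Response roundTrip(Response instance, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            instance.writeTo(new DataOutputStream(bytes), version);
            return Response.readFrom(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())), version);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A test then asserts, for each version of interest, that roundTrip(instance, version) equals mutateInstanceForVersion(instance, version), which exercises both branches of the version check.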


davidkyle

LGTM

davidkyle · 2023-11-28T11:16:17Z

...ence/src/main/java/org/elasticsearch/xpack/inference/results/LegacyTextEmbeddingResults.java

+ * @deprecated use {@link TextEmbeddingResults} instead
+ */
+@Deprecated
+public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults {


It's good to keep this as a separate class rather than cluttering up TextEmbeddingResults

davidkyle · 2023-11-28T11:21:49Z

...nference/src/main/java/org/elasticsearch/xpack/inference/results/SparseEmbeddingResults.java

+        // Map<String, Object> sparseEmbeddingMap = new LinkedHashMap<>();
+        // sparseEmbeddingMap.put(EMBEDDING, embeddingList);


Suggested change

-        // Map<String, Object> sparseEmbeddingMap = new LinkedHashMap<>();
-        // sparseEmbeddingMap.put(EMBEDDING, embeddingList);

Oops thanks.

davidkyle · 2023-11-28T11:23:08Z

@elasticmachine update branch


* Adding results
* Fixing merge issues
* Understanding the complexity
* Making progress on tests
* Tests working
* Some comments
* More comments
* Addressing pr feedback
* Fixing test
* Fixing test
* Fixing up comments and dead code

---------

Co-authored-by: Elastic Machine <[email protected]>

jonathan-buttner added 7 commits November 15, 2023 15:21

Adding results

1baa027

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer-results-format

61082f2

Fixing merge issues

a1daeed

Understanding the complexity

04dff50

Making progress on tests

eb9b46f

Tests working

9682658

Some comments

27a8150

jonathan-buttner added the >non-issue, :ml (Machine learning), Team:ML (Meta label for the ML team), and v8.12.0 labels Nov 21, 2023

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer-results-format

efc61e4


More comments

aff330d

jonathan-buttner marked this pull request as ready for review November 21, 2023 19:30

droberts195 changed the title ~~[M] Inference Service Refactoring results format~~ [ML] Inference Service Refactoring results format Nov 21, 2023

jonathan-buttner requested a review from davidkyle November 21, 2023 19:38

davidkyle reviewed Nov 22, 2023


jonathan-buttner added 4 commits November 27, 2023 15:24

Addressing pr feedback

34f1504

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer-results-format

ec6e4d7

Fixing test

6bd7caa

Fixing test

566eeca

davidkyle approved these changes Nov 28, 2023


elasticmachine and others added 3 commits November 28, 2023 22:23

Merge branch 'main' into ml-infer-results-format

35c055a

Fixing up comments and dead code

cf8cc71

Merge branch 'main' of github.com:elastic/elasticsearch into ml-infer-results-format

36ed27f

Merge branch 'ml-infer-results-format' of github.com:jonathan-buttner/elasticsearch into ml-infer-results-format

92246dd

jonathan-buttner merged commit a022483 into elastic:main Nov 28, 2023

jonathan-buttner deleted the ml-infer-results-format branch November 28, 2023 14:39
