-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Inference Service Refactoring results format #102429
[ML] Inference Service Refactoring results format #102429
Conversation
@@ -183,45 +188,99 @@ public Request build() { | |||
|
|||
public static class Response extends ActionResponse implements ToXContentObject { | |||
|
|||
private final List<? extends InferenceResults> results; | |||
private final InferenceServiceResults results; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this class need a thorough review 😬
* @deprecated use {@link TextEmbeddingResults} instead | ||
*/ | ||
@Deprecated | ||
public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could fold this into the TextEmbeddingResults
class by simply making it implement InferenceResults
as well. I wasn't sure if that was less clear though or how that'd impact adding optional fields in the future. Or how it'd impact us if the InferenceResults
interface changes in the future. I'm open to other ideas though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to keep this as a separate class rather than cluttering up TextEmbeddingResults
*/ | ||
@Deprecated | ||
public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults { | ||
public static final String NAME = "text_embedding_results"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the name that was used in the openai PR here
|
||
for (InferenceResults result : results) { | ||
if (result instanceof TextExpansionResults expansionResults) { | ||
isTruncated |= expansionResults.isTruncated(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If for some reason only 1 of the results is truncated we'll mark them all as truncated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the structure would be a list of objects each with is_truncated
and embedding
properties.
{
"sparse_embedding": [
{
"is_truncated": false,
"embedding": [
{
"token": 0.1
},
...
]
},
{
"is_truncated": true,
"embedding": [
{
"token": 2.0
},
...
]
}
]
}
The matches the text_embedding
result structure
public static final String NAME = "text_embedding_results"; | ||
public record TextEmbeddingResults(List<Embedding> embeddings) implements InferenceServiceResults { | ||
// TODO: what should the name be here? | ||
public static final String NAME = "text_embedding_results_v2"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thoughts on what to use for the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static final String NAME = "text_embedding_results_v2"; | |
public static final String NAME = "text_embedding_service_results"; |
🤷
@@ -35,7 +36,12 @@ protected Writeable.Reader<InferenceAction.Response> instanceReader() { | |||
|
|||
@Override | |||
protected InferenceAction.Response createTestInstance() { | |||
return new InferenceAction.Response(List.of(TextExpansionResultsTests.createRandomResults())); | |||
var result = switch (randomIntBetween(0, 1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there other tests that we should have to get coverage over that if-block in InferenceAction.Response
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if-block can be tested explicitly if you use AbstractBWCWireSerializationTestCase
as the test base class.
Implement the method mutateInstanceForVersion
and you can simulate mixed version transport comms and assert on the expected output.
AbstractBWCWireSerializationTestCase
is in xpack core and should be accessible here
Pinging @elastic/ml-core (Team:ML) |
@elasticmachine test this please |
1 similar comment
@elasticmachine test this please |
|
||
for (InferenceResults result : results) { | ||
if (result instanceof TextExpansionResults expansionResults) { | ||
isTruncated |= expansionResults.isTruncated(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the structure would be a list of objects each with is_truncated
and embedding
properties.
{
"sparse_embedding": [
{
"is_truncated": false,
"embedding": [
{
"token": 0.1
},
...
]
},
{
"is_truncated": true,
"embedding": [
{
"token": 2.0
},
...
]
}
]
}
The matches the text_embedding
result structure
public static final String NAME = "text_embedding_results"; | ||
public record TextEmbeddingResults(List<Embedding> embeddings) implements InferenceServiceResults { | ||
// TODO: what should the name be here? | ||
public static final String NAME = "text_embedding_results_v2"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static final String NAME = "text_embedding_results_v2"; | |
public static final String NAME = "text_embedding_service_results"; |
🤷
this(in.readCollectionAsList(Embedding::new), in.readBoolean()); | ||
} | ||
|
||
public static SparseEmbeddingResults create(List<? extends InferenceResults> results) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static SparseEmbeddingResults create(List<? extends InferenceResults> results) { | |
public static SparseEmbeddingResults of(List<? extends InferenceResults> results) { |
of
is more idiomatic of Java
|
||
public class TestUtils { | ||
|
||
public static String toJsonString(ToXContentFragment entity) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strings.toString
handles ToXContentFragment
and pretty printing, is there a reason not to use that?
...plugin/inference/src/main/java/org/elasticsearch/xpack/inference/action/InferenceAction.java
Show resolved
Hide resolved
@@ -35,7 +36,12 @@ protected Writeable.Reader<InferenceAction.Response> instanceReader() { | |||
|
|||
@Override | |||
protected InferenceAction.Response createTestInstance() { | |||
return new InferenceAction.Response(List.of(TextExpansionResultsTests.createRandomResults())); | |||
var result = switch (randomIntBetween(0, 1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The if-block can be tested explicitly if you use AbstractBWCWireSerializationTestCase
as the test base class.
Implement the method mutateInstanceForVersion
and you can simulate mixed version transport comms and assert on the expected output.
AbstractBWCWireSerializationTestCase
is in xpack core and should be accessible here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* @deprecated use {@link TextEmbeddingResults} instead | ||
*/ | ||
@Deprecated | ||
public record LegacyTextEmbeddingResults(List<Embedding> embeddings) implements InferenceResults { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to keep this as a separate class rather than cluttering up TextEmbeddingResults
// Map<String, Object> sparseEmbeddingMap = new LinkedHashMap<>(); | ||
// sparseEmbeddingMap.put(EMBEDDING, embeddingList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Map<String, Object> sparseEmbeddingMap = new LinkedHashMap<>(); | |
// sparseEmbeddingMap.put(EMBEDDING, embeddingList); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops thanks.
@elasticmachine update branch |
…/elasticsearch into ml-infer-results-format
* Adding results * Fixing merge issues * Understanding the complexity * Making progress on tests * Tests working * Some comments * More comments * Addressing pr feedback * Fixing test * Fixing test * Fixing up comments and dead code --------- Co-authored-by: Elastic Machine <[email protected]>
Specifies the results format for the inference plugin's services. This should match the results proposed here: elastic/elasticsearch-specification#2329
Notable changes
InferenceServiceResults
that the formats implement to give us some more freedom (rather than using the ml plugin'sInferenceResults
LegacyTextEmbeddingResults
to represent the format of the openai response when it used theInferenceResults
TextEmbeddingResults
that adheres to the newInferenceServiceResults
interfaceSparseEmbeddingResults
that adheres to the newInferenceServiceResults
interfaceSparseEmbeddingResults
, openai usesTextEmbeddingResults
LegacyTextEmbeddingResults
The
LegacyTextEmbeddingResults
is functionally identical to the newTextEmbeddingResults
. The legacy one implementsInferenceResults
and the new one implementsInferenceServiceResults
. I wasn't sure if it made sense to separate them or simply have one class, sayTextEmbeddingResults
that implemented bothInferenceResults
andInferenceServiceResults
. I'm open to either approach. I figured it might be more clear that we're moving away fromInferenceResults
if I had that class marked as deprecated.Format
TextEmbeddingResults
SparseEmbeddingResults