Commit

fix top_hits query size
The `top_hits` aggregation accepts a maximum size of `100` by default. This PR caps the requested result size at 100 when size > 100. The `top_hits` aggregator is used primarily for queries with variables, which the engine uses primarily for remote relationships.

Tests for this change are not a part of this PR. They will be added in a later PR.
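The capping rule described above can be sketched as a small standalone helper. This is a minimal sketch and not the connector's actual code: `clampTopHitsSize` and `topHitsMaxBucketResultSize` are hypothetical names, and the type switch is an added assumption. The diff asserts `size.(int)` directly, and a bare type assertion panics if the stored value has another dynamic type (for example `float64`, which `encoding/json` produces when decoding into `interface{}`). Whether that can happen depends on how `body` is built, which this commit does not show.

```go
package main

import "fmt"

// topHitsMaxBucketResultSize mirrors the value of the constant added in this
// commit; the camelCase name is this sketch's own choice.
const topHitsMaxBucketResultSize = 100

// clampTopHitsSize is a hypothetical helper: it returns the requested size,
// capped at max. Values that are not a recognized numeric type fall back to
// max, which is an assumption of this sketch.
func clampTopHitsSize(raw interface{}, max int) int {
	var size int
	switch v := raw.(type) {
	case int:
		size = v
	case int64:
		size = int(v)
	case float64:
		size = int(v)
	default:
		return max
	}
	if size > max {
		return max
	}
	return size
}

func main() {
	// A request body asking for more rows than top_hits allows per bucket.
	body := map[string]interface{}{"size": float64(250)}

	topHits := map[string]interface{}{"size": topHitsMaxBucketResultSize}
	if raw, ok := body["size"]; ok {
		topHits["size"] = clampTopHitsSize(raw, topHitsMaxBucketResultSize)
	}
	fmt.Println(topHits["size"]) // prints 100
}
```

Keeping the rule in one function would also make it straightforward to cover with the table-driven tests that this PR defers.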
m-Bilal authored Dec 20, 2024
2 parents 8562692 + 6ff4c1f commit a6ae4c3
Showing 2 changed files with 15 additions and 2 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -43,6 +43,9 @@ Below, you'll find a matrix of all supported features for the Elasticsearch conn
| Nested Sorting || |
| Nested Relationships || |

+> [!Note]
+> Remote Relationships are currently implemented via `top_hits` operator. That operator has a default maximum result size limit of 100 rows. This is what the connector operates on. If you give the connector a higher limit, it will change that to 100 for compliance with the database. Also, since the returned result will contain only 100 rows per bucket, it may not represent the whole result.
## Before you get Started

1. Create a [Hasura Cloud account](https://console.hasura.io)
14 changes: 12 additions & 2 deletions connector/variables.go
@@ -14,6 +14,12 @@ func executeQueryWithVariables(variableSets []schema.QueryRequestVariablesElem,
// do not to return any documents in the search results while performing aggregations
variableQuery["size"] = 0

+// 100 is the default max result size limit (per bucket) for top_hits aggregation
+// This limit can be set by changing the [index.max_inner_result_window] index level setting.
+// TODO: we should read this setting and set the limit accordingly
+// A `bucket` here refers to a group of documents that match a certain clause/predicate, and the top_hits aggregation can have multiple clauses/predicates
+const TOP_HITS_MAX_BUCKET_RESULT_SIZE = 100

var filters []interface{}
if filter, ok := body["query"]; ok {
for _, variableSet := range variableSets {
@@ -28,9 +34,13 @@ func executeQueryWithVariables(variableSets []schema.QueryRequestVariablesElem,

topHits := make(map[string]interface{})
topHits["_source"] = body["_source"]
-topHits["size"] = 10
+topHits["size"] = TOP_HITS_MAX_BUCKET_RESULT_SIZE
if size, ok := body["size"]; ok {
-topHits["size"] = size
+if (size.(int)) > TOP_HITS_MAX_BUCKET_RESULT_SIZE {
+topHits["size"] = TOP_HITS_MAX_BUCKET_RESULT_SIZE
+} else {
+topHits["size"] = size
+}
}
if limit, ok := body["limit"]; ok {
topHits["from"] = limit
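The TODO in `connector/variables.go` mentions reading `index.max_inner_result_window` instead of hard-coding 100. One possible shape for that is sketched below, under several assumptions: the function receives the raw JSON body of an index settings response that has already been fetched, the body is keyed by index name with the value nested under `settings.index`, and the value may arrive as either a string or a number. `maxInnerResultWindow`, `defaultTopHitsMax`, and the `products` index are hypothetical names, and no Elasticsearch client call is shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// defaultTopHitsMax is the fallback used when the setting is absent or
// unreadable; 100 matches the default stated in this commit.
const defaultTopHitsMax = 100

// indexSettings models only the part of the assumed settings response that
// this sketch reads.
type indexSettings struct {
	Settings struct {
		Index struct {
			MaxInnerResultWindow interface{} `json:"max_inner_result_window"`
		} `json:"index"`
	} `json:"settings"`
}

// maxInnerResultWindow is a hypothetical helper: it extracts the limit for
// the given index from a settings response body, falling back to the default
// on any parse problem or missing value.
func maxInnerResultWindow(settingsJSON []byte, index string) int {
	var resp map[string]indexSettings
	if err := json.Unmarshal(settingsJSON, &resp); err != nil {
		return defaultTopHitsMax
	}
	entry, ok := resp[index]
	if !ok {
		return defaultTopHitsMax
	}
	switch v := entry.Settings.Index.MaxInnerResultWindow.(type) {
	case string:
		if n, err := strconv.Atoi(v); err == nil && n > 0 {
			return n
		}
	case float64:
		if v > 0 {
			return int(v)
		}
	}
	return defaultTopHitsMax
}

func main() {
	sample := []byte(`{"products":{"settings":{"index":{"max_inner_result_window":"250"}}}}`)
	fmt.Println(maxInnerResultWindow(sample, "products")) // prints 250
}
```

Falling back to 100 whenever the setting cannot be read keeps the behavior identical to this commit for indices that never override the default.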
