feat(`mango`): add `keys_examined` for `execution_stats` (part 1) #4554

pgj · 2023-04-27T19:20:10Z

Add another field to the shard-level Mango execution statistics to keep track of the count of keys that were examined for the query. Note that this required to change the way how stats are stored -- an approach similar to that of the view callback arguments was chosen, which features a map (#4394).

At the same time, the changes in this pull request only add support for the receiver (coordinator) side to be able to parse messages of the updated format. It will have to be deployed first so that we could start sending such messages and avoid crashes for mixed-version clusters.

This is related to the covering indexes work and takes inspiration from earlier works of @mikerhodes , see #4410 and #4413 for the details.

Note that slight increase in unit test coverage for mango_execution_stats due to the implicit testing via mango_cursor_view. The only missing part is incr_docs_examined/1 but that is not used from there, only from mango_cursor_text and mango_cursor_nouveau .

$ make eunit apps=mango

before:

[..]
Code Coverage:
mango_app             :   0%
mango_crud            :   0%
mango_cursor          :  20%
mango_cursor_nouveau  :   0%
mango_cursor_special  :   0%
mango_cursor_text     :   0%
mango_cursor_view     : 100%
mango_doc             :   4%
mango_epi             : 100%
mango_error           :   0%
mango_execution_stats :  63%
mango_fields          :  57%
mango_httpd           :   0%
mango_httpd_handlers  :   0%
mango_idx             :  20%
mango_idx_nouveau     :  29%
mango_idx_special     :   0%
mango_idx_text        :  63%
mango_idx_view        :  40%
mango_json            :  12%
mango_json_bookmark   :  25%
mango_native_proc     :   4%
mango_opts            :  27%
mango_selector        :  66%
mango_selector_text   :  86%
mango_sort            :   5%
mango_sup             :   0%
mango_util            :  38%

Total                 : 46%

after:

[..]
Code Coverage:
mango_app             :   0%
mango_crud            :   0%
mango_cursor          :  20%
mango_cursor_nouveau  :   0%
mango_cursor_special  :   0%
mango_cursor_text     :   0%
mango_cursor_view     : 100%
mango_doc             :   4%
mango_epi             : 100%
mango_error           :   0%
mango_execution_stats :  94%
mango_fields          :  57%
mango_httpd           :   0%
mango_httpd_handlers  :   0%
mango_idx             :  20%
mango_idx_nouveau     :  29%
mango_idx_special     :   0%
mango_idx_text        :  63%
mango_idx_view        :  40%
mango_json            :  12%
mango_json_bookmark   :  25%
mango_native_proc     :   4%
mango_opts            :  27%
mango_selector        :  66%
mango_selector_text   :  86%
mango_sort            :   5%
mango_sup             :   0%
mango_util            :  38%

Total                 : 47%

Checklist

Code is written and works correctly
Changes are covered by tests

Add another field to the shard-level Mango execution statistics to keep track of the count of keys that were examined for the query. Note that this required to change the way how stats are stored -- an approach similar to that of the view callback arguments was chosen, which features a map. At the same time, this commit only adds support for the receiver (coordinator) side to be able to parse messages of the updated format. It will have to be deployed first so that we could start sending such messages and avoid crashes for mixed-version clusters.

nickva

+1 but see the small style nit first

nickva · 2023-05-01T15:13:53Z

src/mango/src/mango_cursor_view.erl

-    Cursor1 = Cursor#cursor{
+% TODO: remove clause couchdb 4 -- mixed-version upgrade support
+handle_message(
+    {execution_stats, {docs_examined, DocsExamined}}, #cursor{execution_stats = Stats} = Cursor0


Tiny nit: It might look better to pick out Stats in a separate line after the function head, since with the {execution_stats, {docs_examined, DocsExamined}} the head is getting rather long.

Also, in the second clause it might be better to explicitly assert we're matching on a map.

handle_message({execution_stats, {docs_examined, DocsExamined}}, #cursor{} = Cursor0) -> #cursor{execution_stats = Stats} = Cursor0, ... handle_message({execution_stats, #{} = ShardStats}, #cursor{} = Cursor0) -> #cursor{execution_stats = Stats} = Cursor0, ...

Thanks Nick for the comments! I have had the reasons for these choices, let me add them for clarification:

The {execution_stats, {docs_examined, DocsExamined}} pattern was embedded into the function head to avoid the potential overlap with other clause. My understanding is that heads are checked top-down and the body of the first matching one is going to be evaluated. If there is no direct pattern to match on the contents of the argument of execution_stats (i.e. the second element in the tuple) in the head, the clause will be skipped and missed.

For the second clause on execution_stats, I avoided to match on the type of ShardStats to keep it opaque. Ideally, mango_cursor_view:shard_stats_get/2 will handle the map check eventually. I am aware that might be too late, but there are no other alternatives present to be confused with. At the same time, changes to the format of shard stats is encapsulated into the implementation of the shard_stats/2 and shard_stats_get/2 functions.

The {execution_stats, {docs_examined, DocsExamined}} pattern was embedded into the function head to avoid the potential overlap with other clause.

That's a good way to do it. The heads should go from the most specific one to the more general. The first comment wasn't about the order of clauses but about not having the line get too long. In the first head we're doing at least two unrelated things: match on {execution_stats, {docs_examined, DocsExamined}} and picking out Stats out of the #cursor. The idea of the cleanup is to pick out the Stats variable in a separate line.

I avoided to match on the type of ShardStats to keep it opaque.

I had noticed but it would still look better have an extra match since we're juggling two different formats. We're not breaking the opaqueness too much as we're not picking out individual keys from the map.

The idea of the cleanup is to pick out the Stats variable in a separate line.

Oh, you are right! Yes, that is a good idea.

We're not breaking the opaqueness too much as we're not picking out individual keys from the map.

That one still forces the map type on the container. That is true that checking only if that is map would not interfere with the actual keys, but what if we wanted to move away from the maps for a different representation?

It's a trade-off: preserve complete opaqueness or provide immediate visual and runtime feedback that the type is wrong in that clause. If we switch away from a map, we'd get an immediate failure in testing. In a strongly-typed language it might be better not to be "leaky", but in Erlang matching on things we expect as early as possible is more common. But that's fine, if you feel strongly about it. It's +1 either, way. If you rebase again, I'll merge (to get a clean rebase in main).

I do not have a strong feeling about it. As you have also put it, this might be kind of a habit that I acquired from strongly typed languages and I accept that the same may apply to Erlang. I can follow this idiom.

But please do not merge this PR for now. In response to this approach, I was suggested to handle the question of transition through configuration options and add support for both the sender and the receiver sides in a single change. I will file another PR for that alternative, and it is very likely that will be the winner.

Thanks for the heads-up, let me know we'll take another look at it later.

pgj · 2023-05-11T02:03:13Z

This change was replaced for #4569 hence I am now closing it.

nickva approved these changes May 1, 2023

View reviewed changes

pgj closed this May 11, 2023

pgj deleted the feat/mango-execution-stats-keys-examined-part-1 branch May 11, 2023 02:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(`mango`): add `keys_examined` for `execution_stats` (part 1) #4554

feat(`mango`): add `keys_examined` for `execution_stats` (part 1) #4554

pgj commented Apr 27, 2023 •

edited

Loading

nickva left a comment

nickva May 1, 2023 •

edited

Loading

pgj May 2, 2023

nickva May 2, 2023

pgj May 2, 2023

nickva May 2, 2023 •

edited

Loading

pgj May 2, 2023

nickva May 2, 2023

pgj commented May 11, 2023

feat(mango): add keys_examined for execution_stats (part 1) #4554

feat(mango): add keys_examined for execution_stats (part 1) #4554

Conversation

pgj commented Apr 27, 2023 • edited Loading

Checklist

nickva left a comment

Choose a reason for hiding this comment

nickva May 1, 2023 • edited Loading

Choose a reason for hiding this comment

pgj May 2, 2023

Choose a reason for hiding this comment

nickva May 2, 2023

Choose a reason for hiding this comment

pgj May 2, 2023

Choose a reason for hiding this comment

nickva May 2, 2023 • edited Loading

Choose a reason for hiding this comment

pgj May 2, 2023

Choose a reason for hiding this comment

nickva May 2, 2023

Choose a reason for hiding this comment

pgj commented May 11, 2023

feat(`mango`): add `keys_examined` for `execution_stats` (part 1) #4554

feat(`mango`): add `keys_examined` for `execution_stats` (part 1) #4554

pgj commented Apr 27, 2023 •

edited

Loading

nickva May 1, 2023 •

edited

Loading

nickva May 2, 2023 •

edited

Loading