Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(mango): add keys_examined for execution_stats (part 1) #4554

Closed

Conversation

pgj
Copy link
Contributor

@pgj pgj commented Apr 27, 2023

Add another field to the shard-level Mango execution statistics to keep track of the count of keys that were examined for the query. Note that this required to change the way how stats are stored -- an approach similar to that of the view callback arguments was chosen, which features a map (#4394).

At the same time, the changes in this pull request only add support for the receiver (coordinator) side to be able to parse messages of the updated format. It will have to be deployed first so that we could start sending such messages and avoid crashes for mixed-version clusters.

This is related to the covering indexes work and takes inspiration from earlier works of @mikerhodes , see #4410 and #4413 for the details.

Note that slight increase in unit test coverage for mango_execution_stats due to the implicit testing via mango_cursor_view. The only missing part is incr_docs_examined/1 but that is not used from there, only from mango_cursor_text and mango_cursor_nouveau .

$ make eunit apps=mango

before:

[..]
Code Coverage:
mango_app             :   0%
mango_crud            :   0%
mango_cursor          :  20%
mango_cursor_nouveau  :   0%
mango_cursor_special  :   0%
mango_cursor_text     :   0%
mango_cursor_view     : 100%
mango_doc             :   4%
mango_epi             : 100%
mango_error           :   0%
mango_execution_stats :  63%
mango_fields          :  57%
mango_httpd           :   0%
mango_httpd_handlers  :   0%
mango_idx             :  20%
mango_idx_nouveau     :  29%
mango_idx_special     :   0%
mango_idx_text        :  63%
mango_idx_view        :  40%
mango_json            :  12%
mango_json_bookmark   :  25%
mango_native_proc     :   4%
mango_opts            :  27%
mango_selector        :  66%
mango_selector_text   :  86%
mango_sort            :   5%
mango_sup             :   0%
mango_util            :  38%

Total                 : 46%

after:

[..]
Code Coverage:
mango_app             :   0%
mango_crud            :   0%
mango_cursor          :  20%
mango_cursor_nouveau  :   0%
mango_cursor_special  :   0%
mango_cursor_text     :   0%
mango_cursor_view     : 100%
mango_doc             :   4%
mango_epi             : 100%
mango_error           :   0%
mango_execution_stats :  94%
mango_fields          :  57%
mango_httpd           :   0%
mango_httpd_handlers  :   0%
mango_idx             :  20%
mango_idx_nouveau     :  29%
mango_idx_special     :   0%
mango_idx_text        :  63%
mango_idx_view        :  40%
mango_json            :  12%
mango_json_bookmark   :  25%
mango_native_proc     :   4%
mango_opts            :  27%
mango_selector        :  66%
mango_selector_text   :  86%
mango_sort            :   5%
mango_sup             :   0%
mango_util            :  38%

Total                 : 47%

Checklist

  • Code is written and works correctly
  • Changes are covered by tests

Add another field to the shard-level Mango execution statistics
to keep track of the count of keys that were examined for the
query.  Note that this required to change the way how stats are
stored -- an approach similar to that of the view callback
arguments was chosen, which features a map.

At the same time, this commit only adds support for the receiver
(coordinator) side to be able to parse messages of the updated
format.  It will have to be deployed first so that we could start
sending such messages and avoid crashes for mixed-version
clusters.
Copy link
Contributor

@nickva nickva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 but see the small style nit first

Cursor1 = Cursor#cursor{
% TODO: remove clause couchdb 4 -- mixed-version upgrade support
handle_message(
{execution_stats, {docs_examined, DocsExamined}}, #cursor{execution_stats = Stats} = Cursor0
Copy link
Contributor

@nickva nickva May 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: It might look better to pick out Stats in a separate line after the function head, since with the {execution_stats, {docs_examined, DocsExamined}} the head is getting rather long.

Also, in the second clause it might be better to explicitly assert we're matching on a map.

handle_message({execution_stats, {docs_examined, DocsExamined}}, #cursor{} = Cursor0) ->
    #cursor{execution_stats = Stats} = Cursor0,
    ...
handle_message({execution_stats, #{} = ShardStats}, #cursor{} = Cursor0) ->
    #cursor{execution_stats = Stats} = Cursor0,
    ...  

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Nick for the comments! I have had the reasons for these choices, let me add them for clarification:

  • The {execution_stats, {docs_examined, DocsExamined}} pattern was embedded into the function head to avoid the potential overlap with other clause. My understanding is that heads are checked top-down and the body of the first matching one is going to be evaluated. If there is no direct pattern to match on the contents of the argument of execution_stats (i.e. the second element in the tuple) in the head, the clause will be skipped and missed.

  • For the second clause on execution_stats, I avoided to match on the type of ShardStats to keep it opaque. Ideally, mango_cursor_view:shard_stats_get/2 will handle the map check eventually. I am aware that might be too late, but there are no other alternatives present to be confused with. At the same time, changes to the format of shard stats is encapsulated into the implementation of the shard_stats/2 and shard_stats_get/2 functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The {execution_stats, {docs_examined, DocsExamined}} pattern was embedded into the function head to avoid the potential overlap with other clause.

That's a good way to do it. The heads should go from the most specific one to the more general. The first comment wasn't about the order of clauses but about not having the line get too long. In the first head we're doing at least two unrelated things: match on {execution_stats, {docs_examined, DocsExamined}} and picking out Stats out of the #cursor. The idea of the cleanup is to pick out the Stats variable in a separate line.

I avoided to match on the type of ShardStats to keep it opaque.

I had noticed but it would still look better have an extra match since we're juggling two different formats. We're not breaking the opaqueness too much as we're not picking out individual keys from the map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the cleanup is to pick out the Stats variable in a separate line.

Oh, you are right! Yes, that is a good idea.

We're not breaking the opaqueness too much as we're not picking out individual keys from the map.

That one still forces the map type on the container. That is true that checking only if that is map would not interfere with the actual keys, but what if we wanted to move away from the maps for a different representation?

Copy link
Contributor

@nickva nickva May 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a trade-off: preserve complete opaqueness or provide immediate visual and runtime feedback that the type is wrong in that clause. If we switch away from a map, we'd get an immediate failure in testing. In a strongly-typed language it might be better not to be "leaky", but in Erlang matching on things we expect as early as possible is more common. But that's fine, if you feel strongly about it. It's +1 either, way. If you rebase again, I'll merge (to get a clean rebase in main).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have a strong feeling about it. As you have also put it, this might be kind of a habit that I acquired from strongly typed languages and I accept that the same may apply to Erlang. I can follow this idiom.

But please do not merge this PR for now. In response to this approach, I was suggested to handle the question of transition through configuration options and add support for both the sender and the receiver sides in a single change. I will file another PR for that alternative, and it is very likely that will be the winner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the heads-up, let me know we'll take another look at it later.

@pgj
Copy link
Contributor Author

pgj commented May 11, 2023

This change was replaced for #4569 hence I am now closing it.

@pgj pgj closed this May 11, 2023
@pgj pgj deleted the feat/mango-execution-stats-keys-examined-part-1 branch May 11, 2023 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants