-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(mango
): add keys_examined
for execution_stats
(part 1)
#4554
feat(mango
): add keys_examined
for execution_stats
(part 1)
#4554
Conversation
Add another field to the shard-level Mango execution statistics to keep track of the count of keys that were examined for the query. Note that this required to change the way how stats are stored -- an approach similar to that of the view callback arguments was chosen, which features a map. At the same time, this commit only adds support for the receiver (coordinator) side to be able to parse messages of the updated format. It will have to be deployed first so that we could start sending such messages and avoid crashes for mixed-version clusters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 but see the small style nit first
Cursor1 = Cursor#cursor{ | ||
% TODO: remove clause couchdb 4 -- mixed-version upgrade support | ||
handle_message( | ||
{execution_stats, {docs_examined, DocsExamined}}, #cursor{execution_stats = Stats} = Cursor0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny nit: It might look better to pick out Stats
in a separate line after the function head, since with the {execution_stats, {docs_examined, DocsExamined}}
the head is getting rather long.
Also, in the second clause it might be better to explicitly assert we're matching on a map.
handle_message({execution_stats, {docs_examined, DocsExamined}}, #cursor{} = Cursor0) ->
#cursor{execution_stats = Stats} = Cursor0,
...
handle_message({execution_stats, #{} = ShardStats}, #cursor{} = Cursor0) ->
#cursor{execution_stats = Stats} = Cursor0,
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Nick for the comments! I have had the reasons for these choices, let me add them for clarification:
-
The
{execution_stats, {docs_examined, DocsExamined}}
pattern was embedded into the function head to avoid the potential overlap with other clause. My understanding is that heads are checked top-down and the body of the first matching one is going to be evaluated. If there is no direct pattern to match on the contents of the argument ofexecution_stats
(i.e. the second element in the tuple) in the head, the clause will be skipped and missed. -
For the second clause on
execution_stats
, I avoided to match on the type ofShardStats
to keep it opaque. Ideally,mango_cursor_view:shard_stats_get/2
will handle the map check eventually. I am aware that might be too late, but there are no other alternatives present to be confused with. At the same time, changes to the format of shard stats is encapsulated into the implementation of theshard_stats/2
andshard_stats_get/2
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The {execution_stats, {docs_examined, DocsExamined}} pattern was embedded into the function head to avoid the potential overlap with other clause.
That's a good way to do it. The heads should go from the most specific one to the more general. The first comment wasn't about the order of clauses but about not having the line get too long. In the first head we're doing at least two unrelated things: match on {execution_stats, {docs_examined, DocsExamined}}
and picking out Stats
out of the #cursor
. The idea of the cleanup is to pick out the Stats
variable in a separate line.
I avoided to match on the type of ShardStats to keep it opaque.
I had noticed but it would still look better have an extra match since we're juggling two different formats. We're not breaking the opaqueness too much as we're not picking out individual keys from the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of the cleanup is to pick out the
Stats
variable in a separate line.
Oh, you are right! Yes, that is a good idea.
We're not breaking the opaqueness too much as we're not picking out individual keys from the map.
That one still forces the map type on the container. That is true that checking only if that is map would not interfere with the actual keys, but what if we wanted to move away from the maps for a different representation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a trade-off: preserve complete opaqueness or provide immediate visual and runtime feedback that the type is wrong in that clause. If we switch away from a map, we'd get an immediate failure in testing. In a strongly-typed language it might be better not to be "leaky", but in Erlang matching on things we expect as early as possible is more common. But that's fine, if you feel strongly about it. It's +1 either, way. If you rebase again, I'll merge (to get a clean rebase in main).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have a strong feeling about it. As you have also put it, this might be kind of a habit that I acquired from strongly typed languages and I accept that the same may apply to Erlang. I can follow this idiom.
But please do not merge this PR for now. In response to this approach, I was suggested to handle the question of transition through configuration options and add support for both the sender and the receiver sides in a single change. I will file another PR for that alternative, and it is very likely that will be the winner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the heads-up, let me know we'll take another look at it later.
This change was replaced for #4569 hence I am now closing it. |
Add another field to the shard-level Mango execution statistics to keep track of the count of keys that were examined for the query. Note that this required to change the way how stats are stored -- an approach similar to that of the view callback arguments was chosen, which features a map (#4394).
At the same time, the changes in this pull request only add support for the receiver (coordinator) side to be able to parse messages of the updated format. It will have to be deployed first so that we could start sending such messages and avoid crashes for mixed-version clusters.
This is related to the covering indexes work and takes inspiration from earlier works of @mikerhodes , see #4410 and #4413 for the details.
Note that slight increase in unit test coverage for
mango_execution_stats
due to the implicit testing viamango_cursor_view
. The only missing part isincr_docs_examined/1
but that is not used from there, only frommango_cursor_text
andmango_cursor_nouveau
.$ make eunit apps=mango
before:
after:
Checklist