Mediapipe graph process time metric #2942
Conversation
@@ -196,6 +203,9 @@ class MediapipeGraphExecutor {
INCREMENT_IF_ENABLED(this->mediapipeServableMetricReporter->getGraphErrorMetric(executionContext));
}
MP_RETURN_ON_FAIL(status, "graph wait until done", mediapipeAbslToOvmsStatus(status.code()));
timer.stop(PROCESS);
double processTime = timer.template elapsed<std::chrono::microseconds>(PROCESS);
OBSERVE_IF_ENABLED(this->mediapipeServableMetricReporter->getProcessingTimeMetric(executionContext), processTime);
It is not clear to me whether we want to register processing times of failed executions or not @dtrawins @bstrzele
Either way, I don't think this is correct, because:
a) we do not register the time in all error return codepaths
b) we do register the time in one specific error return codepath (the `if (outputPollers.size() != outputPollersWithReceivedPacket.size())` branch)
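One way to resolve the inconsistency pointed out above would be to take the measurement from a scope guard, so every return path (success or error) produces a histogram sample. The sketch below is a hypothetical illustration, not the PR's implementation; `ScopeTimer` and its callback are made-up names standing in for the project's Timer and the `OBSERVE_IF_ENABLED(...)` call from the diff above.

```cpp
#include <chrono>
#include <functional>
#include <utility>

// Hypothetical sketch: records elapsed microseconds when the guard goes out of scope,
// so early error returns are measured the same way as the happy path.
class ScopeTimer {
public:
    explicit ScopeTimer(std::function<void(double)> observe)
        : observe(std::move(observe)), start(std::chrono::steady_clock::now()) {}
    ~ScopeTimer() {
        double elapsedUs = std::chrono::duration<double, std::micro>(
            std::chrono::steady_clock::now() - start).count();
        observe(elapsedUs);  // stand-in for OBSERVE_IF_ENABLED(getProcessingTimeMetric(...), elapsedUs)
    }

private:
    std::function<void(double)> observe;
    std::chrono::steady_clock::time_point start;
};

// Hypothetical usage inside an executor method with several early error returns:
// Status infer(...) {
//     ScopeTimer processTimer([&](double us) { /* observe histogram sample */ });
//     if (initializationFailed) return StatusCode::ERROR;  // still measured
//     ...
//     return StatusCode::OK;                                // measured here too
// }
```

Whether failed executions should be counted at all is the open question above; the alternative is to observe only right before the successful return, which excludes every error path consistently.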
docs/metrics.md
Outdated
@@ -241,6 +241,7 @@ For [MediaPipe Graphs](./mediapipe.md) execution there are 4 generic metrics whi
| counter | ovms_responses | Useful to track number of packets generated by MediaPipe graph. Keep in mind that single request may trigger production of multiple (or zero) packets, therefore tracking number of responses is complementary to tracking accepted requests. For example tracking streaming partial responses of LLM text generation graphs. |
| gauge | ovms_current_graphs | Number of graphs currently in-process. For unary communication it is equal to number of currently processing requests (each request initializes separate MediaPipe graph). For streaming communication it is equal to number of active client connections. Each connection is able to reuse the graph and decide when to delete it when the connection is closed. |
| counter | ovms_graph_error | Counts errors in MediaPipe graph execution phase. For example V3 LLM text generation fails in LLMCalculator due to missing prompt - calculator returns an error and graph cancels. |
| histogram | ovms_graph_processing_time_us | Time for which mediapipe graph was opened. |
Please explain more - is this for successful executions only? For failed ones as well? Which failures count?
docs/metrics.md
Outdated
@@ -241,6 +241,7 @@ For [MediaPipe Graphs](./mediapipe.md) execution there are 4 generic metrics whi
| counter | ovms_responses | Useful to track number of packets generated by MediaPipe graph. Keep in mind that single request may trigger production of multiple (or zero) packets, therefore tracking number of responses is complementary to tracking accepted requests. For example tracking streaming partial responses of LLM text generation graphs. |
| gauge | ovms_current_graphs | Number of graphs currently in-process. For unary communication it is equal to number of currently processing requests (each request initializes separate MediaPipe graph). For streaming communication it is equal to number of active client connections. Each connection is able to reuse the graph and decide when to delete it when the connection is closed. |
| counter | ovms_graph_error | Counts errors in MediaPipe graph execution phase. For example V3 LLM text generation fails in LLMCalculator due to missing prompt - calculator returns an error and graph cancels. |
| histogram | ovms_graph_processing_time_us | Time for which mediapipe graph was opened and has been successfully closed. |
I think this table should also have a column indicating which type of request each metric applies to. Right now everything is mixed: some metrics are for MP graphs only and some for individual models only. It could be adjusted in a separate PR.
Every metric in this table should only be relevant for MediaPipe.
But this one probably could use some kind of indicator.
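As a rough sketch of what such an indicator could look like (the "applies to" column and its values are hypothetical, not part of this PR), the table in docs/metrics.md might gain an extra column:

```markdown
| type      | name                          | applies to      | description |
|-----------|-------------------------------|-----------------|-------------|
| counter   | ovms_graph_error              | MediaPipe graph | Counts errors in MediaPipe graph execution phase. |
| histogram | ovms_graph_processing_time_us | MediaPipe graph | Time for which mediapipe graph was opened and has been successfully closed. |
```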
🛠 Summary
CVS-160188
Adds a Prometheus histogram metric `ovms_graph_processing_time_us` for tracking MediaPipe graph processing time.
It uses the same buckets as the single model execution metric.
The metric tracks the time the graph spends serving a single user request for unary calls, or the whole session time for streaming requests.
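A minimal sketch of the two measurement scopes described above, using only standard C++ and hypothetical function names (the actual implementation uses the project's Timer and the PROCESS enum shown in the diff):

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Helper: microseconds elapsed since `start`.
static double elapsedUs(Clock::time_point start) {
    return std::chrono::duration<double, std::micro>(Clock::now() - start).count();
}

// Unary call: a graph is created per request, so one histogram sample covers a single
// request/response cycle (hypothetical outline, not the PR's code).
double measureUnaryCallUs() {
    auto start = Clock::now();
    // initialize graph, push input packets, wait until done, serialize the response ...
    return elapsedUs(start);
}

// Streaming call: the graph lives as long as the client connection, so one sample covers
// the whole session, potentially spanning many requests and responses.
double measureStreamSessionUs() {
    auto start = Clock::now();
    // while (the client keeps the stream open) { read request, push packets, write responses }
    return elapsedUs(start);
}
```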
🧪 Checklist