
[SD-119] Implement layer execution latency measurements for Pytorch #48

Open
Wants to merge 2 commits into develop

Conversation

@osw282 (Contributor) commented Jan 15, 2025:

This PR implements a script that records the per-layer execution time of a PyTorch model during CPU-only inference.

The script outputs a JSON file containing the execution time, timestamp, and layer name for every inference cycle.
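For context, a minimal sketch of the approach described above (forward hooks timing each layer on a CPU-only pass, dumped to JSON). The helper names, output file name, and use of time.perf_counter are illustrative, not necessarily the PR's exact code:

import json
import time
from functools import partial

import torch
import torchvision

def get_layers(model):
    # Yield (name, module) pairs for leaf modules only, so timings are per layer.
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:
            yield name, module

def pre_hook(records, name, module, args):
    # Called just before the layer's forward(); remember when it started.
    records[name] = {"start_s": time.perf_counter()}

def post_hook(records, name, module, args, output):
    # Called right after forward(); store the elapsed time for this layer.
    records[name]["latency_s"] = time.perf_counter() - records[name]["start_s"]

model = torchvision.models.resnet18(weights=None).eval()
records = {}
for name, layer in get_layers(model):
    layer.register_forward_pre_hook(partial(pre_hook, records, name))
    layer.register_forward_hook(partial(post_hook, records, name))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

with open("layer_latencies.json", "w") as f:
    json.dump(records, f, indent=2)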

@osw282 self-assigned this Jan 15, 2025

dagshub bot commented Jan 15, 2025

@@ -7,4 +7,6 @@ requires-python = ">=3.11"
dependencies = [
"dvc-s3>=3.2.0",
"pandas>=2.2.3",
"pillow>=11.1.0",
Member commented:

Is this package being used anywhere?

Member replied:

It's an implicit requirement of PyTorch to run resnet18.

module: the module to register hook.
input: tuple containing the input arguments to module's forward method.
"""
layer_time_dict[layer_name] = (time.time(), datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
@dudeperf3ct (Member) commented Jan 16, 2025:

I am not sure time.time is a reliable function for this, as it depends on the system clock.

I think time.perf_counter or time.perf_counter_ns() makes more sense: these layers are going to be fast and we need more precise estimates (the resolution offered by these functions is different; see the sketch after the references below).

References:

  1. https://medium.com/@jordan.l.edmunds/how-to-time-your-code-correctly-time-monotonic-e730dce49006
  2. https://stackoverflow.com/questions/75011155/how-to-get-the-time-in-ms-between-two-calls-in-python
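For illustration, a small sketch comparing the two clocks (exact resolutions vary by platform; the timed workload here is just a stand-in for a layer's forward pass):

import time

# time.time() follows the system (wall) clock, which can be adjusted and jump;
# perf_counter()/perf_counter_ns() are monotonic and intended for interval timing.
print(time.get_clock_info("time"))
print(time.get_clock_info("perf_counter"))

start = time.perf_counter_ns()
_ = sum(range(100_000))          # stand-in for a layer's forward pass
elapsed_ms = (time.perf_counter_ns() - start) / 1e6
print(f"{elapsed_ms:.3f} ms")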

Member replied:

I think we don't need to worry too much about the CPU timing implementation, as the hardware we're using has CUDA, and CUDA events will be used in the next ticket.
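For reference, a hypothetical sketch of CUDA-event timing as it might look in SD-118 (not part of this PR; requires a CUDA device and inputs already on the GPU):

import torch

def time_layer_cuda(layer, x):
    # Record events on the CUDA stream around the layer's forward pass.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    out = layer(x)
    end.record()
    # Kernels launch asynchronously; wait until both events have completed.
    torch.cuda.synchronize()
    return out, start.elapsed_time(end)  # elapsed time in milliseconds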

layer_time_dict = {}

for layer_name, layer in get_layers(model):
    layer.register_forward_pre_hook(partial(layer_time_pre_hook, layer_time_dict, layer_name))
@dudeperf3ct (Member) commented Jan 16, 2025:

Why are hooks being used? Does the profiler API not work? I think it would provide much better results on both CPU and GPU.
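For reference, a minimal sketch of the profiler API being suggested (torch.profiler on CPU); the toy model here is illustrative:

import torch
from torch.profiler import ProfilerActivity, profile

model = torch.nn.Linear(128, 64)
x = torch.randn(32, 128)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with torch.no_grad():
        model(x)

# Results are reported per operator (aten::addmm, ...) rather than per module,
# which is the resolution concern raised in the reply below.
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))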

Member replied:

As far as I understand, the autograd profiler gives us the wrong resolution (it reports latencies by operation type, not by layer). But @osw282 can give more context, I guess.

@d-lowl (Member) left a comment:

A small nit about milliseconds, otherwise good to go I think

module: the module to register hook.
input: tuple containing the input arguments to module's forward method.
"""
layer_time_dict[layer_name] = (time.time(), datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
Member commented:

N.B. these functions will need to use CUDA Events from SD-118 in the actual benchmark script.

module: the module to register hook.
input: tuple containing the input arguments to module's forward method.
"""
layer_time_dict[layer_name] = (time.time(), datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
Member commented:

Not enough precision for the layer start time; we will need milliseconds too (layers in resnet18 take about 1.5 ms to execute). Worth fixing here, so we don't forget to fix it in the next ticket with CUDA Events.
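For example, one way to keep milliseconds in the human-readable timestamp ("%f" gives microseconds; the slice keeps three digits):

import datetime

now = datetime.datetime.now()
timestamp = now.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
print(timestamp)  # e.g. 2025-01-16 10:23:45.123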
