SD-118 Inference Update Morgan #47

Open: OCarrollM wants to merge 4 commits into develop from SD-118-InferenceUpdate-Morgan (base: develop).
+477 −144
Changes from 3 of the 4 commits:

32612cb (d-lowl): Add torch dev dependency to power_logging pyproject
b5f179e (d-lowl): Execute benchmark script locally with the commented out code
260b31f (OCarrollM): SD-118 Jetson Inference now uses PyTorch instead of TensorRT and uses…
1186418 (OCarrollM): PR Review changes. Added logging and calculations
Changes to the benchmark script:

@@ -1,4 +1,8 @@
-"""Benchmark TensorRT models."""
+"""
+Benchmark PyTorch models.
+
+Script uses PyTorch to benchmark models and will support CUDA if it is available on the system.
+"""
 
 import argparse
 import json
@@ -10,12 +14,40 @@
 import numpy as np
 import torch
 import torch.backends.cudnn as cudnn
-import torch_tensorrt
 from pydantic import BaseModel
 from tqdm import tqdm
 
 from model.lenet import LeNet
-from model.trt_utils import CustomProfiler, save_engine_info, save_layer_wise_profiling
+
+
+class CudaEvent:
+    """Wrapper around torch.cuda.Event for devices without CUDA support.
+
+    Methods:
+    - record(): Records an event if CUDA is available
+    - elapsed_time(): Calculates the elapsed time between two events
+    - synchronize(): Synchronizes the event when CUDA is available
+    """
+
+    def __init__(self, enable_timing=True):
+        if torch.cuda.is_available():
+            self.event = torch.cuda.Event(enable_timing=enable_timing)
+        else:
+            print("Warning: CUDA not available. Events will not be timed.")
+            self.event = None
+
+    def record(self):
+        if self.event:
+            self.event.record()
+
+    def elapsed_time(self, n_event):
+        if self.event and n_event.event:
+            return self.event.elapsed_time(n_event.event)
+        return 0
+
+    def synchronize(self):
+        if self.event:
+            self.event.synchronize()
+
+
 cudnn.benchmark = True

Review comment on CudaEvent.synchronize(): We don't need this here? Since we can just call …
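For context, here is a minimal usage sketch of the wrapper (it assumes torch and the CudaEvent class from the hunk above are in scope; the matrix-multiply workload is a placeholder, not code from this PR):

import torch

# Time a placeholder workload with the CudaEvent wrapper.
start = CudaEvent(enable_timing=True)
end = CudaEvent(enable_timing=True)

start.record()
_ = torch.randn(1024, 1024) @ torch.randn(1024, 1024)  # placeholder workload
end.record()

if torch.cuda.is_available():
    torch.cuda.synchronize()  # flush queued GPU work before reading the events

# Milliseconds on CUDA; the wrapper returns 0 when CUDA is unavailable.
print(f"elapsed: {start.elapsed_time(end)} ms")

Note that on a CPU-only machine every elapsed_time() call returns 0, which matters for the metrics changes further down in this diff.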
@@ -31,7 +63,7 @@ class BenchmarkMetrics(BaseModel):
     avg_throughput: float
 
 
-def load_model(model_name: str) -> Any:
+def load_model(model_name: str, model_repo: str) -> Any:
     """Load model from Pytorch Hub.
 
     Args:
@@ -47,9 +79,9 @@ def load_model(model_name: str) -> Any:
     if model_name == "lenet":
         return LeNet()
     if model_name == "fcn_resnet50":
-        return torch.hub.load("pytorch/vision", model_name, pretrained=True)
+        return torch.hub.load(model_repo, model_name, pretrained=True)
     try:
-        return torch.hub.load("pytorch/vision", model_name, weights="IMAGENET1K_V1")
+        return torch.hub.load(model_repo, model_name)
     except:
         raise ValueError(
             f"Model name: {model_name} is most likely incorrect. "
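As a quick illustration of the new signature (the "resnet18" and "pytorch/vision" values are hypothetical examples, not taken from this PR):

# Hypothetical call using the new model_repo parameter:
model = load_model("resnet18", "pytorch/vision")
# ...which now resolves to:
# torch.hub.load("pytorch/vision", "resnet18")

One side effect worth noting: since the weights="IMAGENET1K_V1" argument was dropped, hub models loaded through the generic branch now get the entrypoint's default weights setting (typically uninitialized for torchvision models) unless the caller specifies otherwise.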
@@ -66,13 +98,13 @@ def benchmark(args: argparse.Namespace) -> None:
     Args:
         args: Arguments from CLI.
     """
-    start = torch.cuda.Event(enable_timing=True)
-    end = torch.cuda.Event(enable_timing=True)
+    start = CudaEvent(enable_timing=True)
+    end = CudaEvent(enable_timing=True)
     start.record()
 
     timestamp = datetime.now().strftime("%Y%m%d-%H%M%S")
     input_data = torch.randn(args.input_shape, device=DEVICE)
-    model = load_model(args.model)
+    model = load_model(args.model, args.model_repo)
     model.eval().to(DEVICE)
 
     dtype = torch.float32
@@ -85,21 +117,6 @@ def benchmark(args: argparse.Namespace) -> None:
     model = model.to(dtype)
     print(f"Using {DEVICE=} for benchmarking")
 
-    exp_program = torch.export.export(model, tuple([input_data]))
-    model = torch_tensorrt.dynamo.compile(
-        exported_program=exp_program,
-        inputs=[input_data],
-        min_block_size=args.min_block_size,
-        optimization_level=args.optimization_level,
-        enabled_precisions={dtype},
-        # Set to True for verbose output
-        # NOTE: Performance Regression when rich library is available
-        # https://github.com/pytorch/TensorRT/issues/3215
-        debug=True,
-        # Setting it to True returns PythonTorchTensorRTModule which has different profiling approach
-        use_python_runtime=True,
-    )
-
     st = time.perf_counter()
     print("Warm up ...")
     with torch.no_grad():
@@ -108,48 +125,25 @@ def benchmark(args: argparse.Namespace) -> None:
     print(f"Warm complete in {time.perf_counter()-st:.2f} sec ...")
 
-    print("Start timing using tensorrt backend ...")
     torch.cuda.synchronize()
     # Recorded in milliseconds
     start_events = [torch.cuda.Event(enable_timing=True) for _ in range(args.runs)]
     end_events = [torch.cuda.Event(enable_timing=True) for _ in range(args.runs)]
 
     with torch.no_grad():
         for i in tqdm(range(args.runs)):
-            # Hack for enabling profiling
-            # https://github.com/pytorch/TensorRT/issues/1467
-            profiling_dir = f"{args.result_dir}/{args.model}/trt_profiling"
-            Path(profiling_dir).mkdir(exist_ok=True, parents=True)
-
-            # Records traces in milliseconds
-            # https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Core/Profiler.html#tensorrt.Profiler
-            mod = list(model.named_children())[0][1]
-            mod.enable_profiling(profiler=CustomProfiler())
-
             start_events[i].record()
             _ = model(input_data)
             end_events[i].record()
 
     end.record()
     torch.cuda.synchronize()
 
-    save_layer_wise_profiling(mod, profiling_dir)
-    save_engine_info(mod, profiling_dir)
-
     # Convert milliseconds to seconds
     timings = [s.elapsed_time(e) * 1.0e-3 for s, e in zip(start_events, end_events)]
     avg_throughput = args.input_shape[0] / np.mean(timings)
     print("Benchmarking complete ...")
     # Convert milliseconds to seconds
     total_exp_time = start.elapsed_time(end) * 1.0e-3
     print(f"Total time for experiment: {total_exp_time} sec")
 
     results = BenchmarkMetrics(
         config=vars(args),
-        total_time=total_exp_time,  # in seconds
+        total_time=0,  # total_exp_time,  # in seconds
         timestamp=timestamp,
-        latencies=timings,  # in seconds
-        avg_throughput=avg_throughput,
-        avg_latency=np.mean(timings),  # in seconds
+        latencies=[],  # timings,  # in seconds
+        avg_throughput=0,  # avg_throughput,
+        avg_latency=0,  # np.mean(timings),  # in seconds
     )
 
     model_dir = f"{args.result_dir}/{args.model}"

(Three review threads in this hunk, on the per-run event recording, the final synchronize(), and the zeroed-out metrics, were marked as resolved by d-lowl.)
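A side effect of the CudaEvent wrapper is that elapsed_time() returns 0 whenever CUDA is absent, which is presumably why the metrics above are zeroed out for now. A minimal sketch of a timer that falls back to wall-clock time on CPU (the time_run helper is an editorial assumption, not part of this PR):

import time

import torch


def time_run(fn):
    """Time one call: CUDA events on GPU, perf_counter elsewhere. Returns seconds."""
    if torch.cuda.is_available():
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        fn()
        end.record()
        torch.cuda.synchronize()  # wait for the recorded events to complete
        return start.elapsed_time(end) * 1.0e-3  # ms -> s
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

With something like timings = [time_run(lambda: model(input_data)) for _ in range(args.runs)], the latencies, avg_throughput, and avg_latency fields could stay populated on both device types.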
Review comment: I followed the "create virtual environment using uv" instructions above and got an error when running the measure_inference_power.py script. Torch doesn't seem to be installed.

Follow-up from the same reviewer: Never mind, my uv version was outdated. Updating uv works now.
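(Editorial note, not from the thread: with a standalone installation, running uv self update upgrades uv, and uv sync then recreates the project environment from the lockfile, which should pull in the torch dev dependency added in this PR.)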