Weird Results After Exporting to TensorRT FP16 #104
Comments
I ran additional tests. Among the models generated in the same training procedure, the multiple-detection issue with the FP16 TensorRT model does not occur with the models generated after the initial epochs.
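To compare models on the multiple-detection issue, it can help to put a number on it. The following is a minimal sketch, not code from this repository: it counts pairs of same-class boxes that overlap heavily within one image. The names `iou` and `count_duplicates`, the `(x1, y1, x2, y2)` box layout, and the 0.8 threshold are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_duplicates(boxes, labels, iou_threshold=0.8):
    """Count pairs of same-class boxes whose IoU is at least iou_threshold."""
    pairs = 0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            # Only boxes with the same label count as duplicates of each other.
            if labels[i] == labels[j] and iou(boxes[i], boxes[j]) >= iou_threshold:
                pairs += 1
    return pairs
```

Running `count_duplicates` on the outputs of the FP32 and FP16 engines for the same images gives one number per model, which makes it easier to see which models are affected.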
I wonder if this problem would be solved if I activate the
Yes. In that case, I wonder if this problem would be solved if I activate the
I think it does not.
Sorry for the delay.
The warnings are probably related to the problem that I am facing.
@lyuwenyu Also, if we need to preprocess the image, could you share the exact code we should run before passing it to the model?
I trained a model with a custom dataset using the PyTorch code from this repository. The training went well, and the Torch model worked as expected. After this test, I tried to export the model to ONNX. Again, everything went well, and the model worked as expected. Lastly, I tried to export the model to TensorRT. I exported two models, one using FP16 precision and the second using FP32 precision. There were no error logs during the export procedure.
When I tested the models, the FP32 one generated the same results as the ONNX model, while the FP16 one generated very different results. I noticed that the results from the FP16 model contained multiple (quite a few) bounding boxes for the same object. I find these differences strange, because I have followed the same procedure with a variety of other models, and the impact on the results was minimal. I suppose the duplicate boxes could be removed by applying non-maximum suppression (NMS), but I didn't want to do that.
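For reference, the NMS workaround mentioned above could look roughly like the sketch below. It is a plain-Python, class-agnostic greedy NMS written for illustration only. The function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions, and nothing here comes from this repository.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any higher-scoring kept box.
        if all(box_iou(boxes[i], boxes[k]) < iou_threshold for k in keep):
            keep.append(i)
    return keep
```

In practice, `torchvision.ops.nms(boxes, scores, iou_threshold)` does the same job on tensors. Either way, this only hides the duplicates in the FP16 output and does not explain why they appear.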
Does anyone know what might be causing this? Or at least how to fix it?
Some of the configurations I used:
ONNX:
Exported using opset=16
onnx==1.14.0
onnxruntime==1.15.1
onnxsim==0.4.33
torch==2.0.1
torchvision==0.15.2
TensorRT:
I used the container nvcr.io/nvidia/tensorrt:23.01-py3, which includes:
TensorRT==8.5.2.2