[GSK-2247] Add unit tests for nme implementation #6
GSK-2247 Benchmark the NME implementation
Implement a unit test that benchmarks the NME implementation against one of the numbers given in https://paperswithcode.com/sota/facial-landmark-detection-on-300w. Ideally, we would find an NME reported for a single image of the 300W dataset (instead of benchmarking against the total NME over all images). For reference on the NME definition, see section 3.1 of: https://arxiv.org/pdf/2210.07233v1.pdf
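For orientation, the NME is commonly written as below. This is the widely used form and only an assumption about what section 3.1 states; the symbols $N$ (number of landmarks), $\hat{p}_i$ (predicted landmark), $p_i$ (ground-truth landmark) and $d$ (normalization distance, e.g. an inter-ocular distance) are illustrative notation and may differ from the paper.

$$
\mathrm{NME} = \frac{1}{N} \sum_{i=1}^{N} \frac{\lVert \hat{p}_i - p_i \rVert_2}{d}
$$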
- For traceability and generalisability, could we use the 2D `face_alignment` model on one of the benchmark images from 300W to generate `TEST_MARKS` and `PREDICT_MARKS`?
- I resolved the conflicts caused by the last refactoring of the tests, but could you please also improve the coverage of the new validations we have for the metrics?
- If you think benchmarking against an NME number from the literature is tedious, we can drop it and just rely on the fact that we're testing the Euclidean distances using different methods.
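The "different methods" idea in the last point could look roughly like the sketch below. It assumes 2D landmarks stored as coordinate tuples; the function names are hypothetical and the `TEST_MARKS` / `PREDICT_MARKS` values are made up for illustration, not taken from 300W or from the `face_alignment` model.

```python
import math


def euclidean_explicit(p, q):
    """Distance from the definition: square root of the summed squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def euclidean_builtin(p, q):
    """The same distance via the standard library (math.dist, Python 3.8+)."""
    return math.dist(p, q)


# Made-up landmarks for illustration only (not real 300W annotations).
TEST_MARKS = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
PREDICT_MARKS = [(3.0, 4.0), (4.0, 1.0), (2.0, 3.0)]

explicit = [euclidean_explicit(p, g) for p, g in zip(PREDICT_MARKS, TEST_MARKS)]
builtin = [euclidean_builtin(p, g) for p, g in zip(PREDICT_MARKS, TEST_MARKS)]

# Both methods should agree point by point.
assert all(math.isclose(a, b) for a, b in zip(explicit, builtin))
```

Agreement between two independent computations does not prove the metric matches the literature, but it does catch indexing or axis mistakes in either implementation.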
Thanks for the review, @rabah-khalek!
Yes, I will try to add the fixture for the model and a test dataset with one image. The resources already exist on my side, but they are somehow ignored by my local git repo.
Doing it today ;)
I think it's very interesting. I am just wondering how to compare with the NME numbers from the literature if our calculation is based on one image.
Yes, I was thinking of using the whole 300W dataset.
…feature/gsk-2247-benchmark-the-nme-implementation
It seems that `face_alignment` returns different results in Python 3.10 (test passed) and Python 3.11 (test not passed).
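If the discrepancy turns out to be small numerical noise (which is not confirmed in this thread), one option is to compare landmarks with an absolute tolerance instead of exact equality. The sketch below is an assumption-laden illustration: `landmarks_close` is a hypothetical helper, the tolerance is arbitrary, and the coordinates are made up rather than real model output.

```python
import math


def landmarks_close(actual, expected, abs_tol=1e-3):
    """Return True if every coordinate matches within abs_tol (hypothetical helper)."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=abs_tol)
        for point_a, point_e in zip(actual, expected)
        for a, e in zip(point_a, point_e)
    )


# Made-up values standing in for predictions from two interpreter versions.
reference = [(120.0, 85.5), (130.25, 90.0)]
rerun = [(120.0004, 85.4997), (130.2502, 90.0001)]

assert landmarks_close(rerun, reference)
# A shift of a full pixel is outside the tolerance and should be flagged.
assert not landmarks_close([(121.0, 85.5), (130.25, 90.0)], reference)
```

If the two versions differ by more than such a tolerance, the difference is likely real and the test should keep failing.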
Compute the ME, the normalization distance d, and the NME from scratch, and compare the results using the first image of the 300W indoor set.
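A minimal from-scratch sketch of these three quantities follows. It reads ME as the mean Euclidean error over landmarks and takes d as the distance between two reference landmarks; both readings are assumptions, as are the function names and the toy four-point data (a real 300W image has 68 landmarks, and the indices of the reference points depend on the chosen normalization).

```python
import math


def mean_error(pred, truth):
    """ME: average Euclidean distance between predicted and ground-truth points."""
    return sum(math.dist(p, g) for p, g in zip(pred, truth)) / len(truth)


def normalization_distance(truth, left_idx, right_idx):
    """d: distance between two reference landmarks (e.g. outer eye corners)."""
    return math.dist(truth[left_idx], truth[right_idx])


def nme(pred, truth, left_idx, right_idx):
    """NME: mean error divided by the normalization distance."""
    return mean_error(pred, truth) / normalization_distance(truth, left_idx, right_idx)


# Toy data for illustration only; indices 0 and 1 play the role of the eye corners.
truth = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0), (5.0, 8.0)]
pred = [(3.0, 4.0), (10.0, 0.0), (5.0, 6.0), (5.0, 8.0)]

me = mean_error(pred, truth)             # (5 + 0 + 1 + 0) / 4
d = normalization_distance(truth, 0, 1)  # distance between points 0 and 1
score = nme(pred, truth, 0, 1)           # me / d

assert math.isclose(score, me / d)
```

In a unit test, `score` would be compared against the value returned by the implementation under test for the same landmarks.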