[GSK-2247] Add unit tests for nme implementation #6

Inokinoki · 2023-12-08T16:20:51Z

Compute the ME, the normalization distance d, and the NME from scratch, then compare the results with the implementation, using the first image of the 300W indoor set.
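
A minimal sketch of what "from scratch" could look like, assuming landmarks are stored as `(N, 2)` NumPy arrays and that d is the distance between two chosen reference landmarks of the ground truth (for example, the outer eye corners). The function names, the landmark indices, and the toy data are hypothetical and not taken from the repository.

```python
import numpy as np


def mean_euclidean_distance(pred, truth):
    """ME: mean over landmarks of the Euclidean distance between prediction and ground truth."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1)).mean())


def normalization_distance(truth, left_idx, right_idx):
    """d: distance between two reference ground-truth landmarks (an assumed choice)."""
    truth = np.asarray(truth, dtype=float)
    return float(np.linalg.norm(truth[left_idx] - truth[right_idx]))


def nme(pred, truth, left_idx, right_idx):
    """NME: the ME divided by the normalization distance d."""
    return mean_euclidean_distance(pred, truth) / normalization_distance(truth, left_idx, right_idx)


# Toy data (not from 300W): every landmark is shifted by the offset (3, 4).
truth = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 5.0]])
pred = truth + np.array([3.0, 4.0])

me = mean_euclidean_distance(pred, truth)
d = normalization_distance(truth, 0, 1)
score = nme(pred, truth, 0, 1)
```

With this toy data each landmark error is 5, d is 10, and the NME is 0.5, so the three quantities are easy to verify by hand before comparing them with the values returned by the implementation under test.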

linear · 2023-12-08T16:20:55Z

GSK-2247 Benchmark the NME implementation

Implement a unit test that benchmarks the NME implementation against one of the numbers given in https://paperswithcode.com/sota/facial-landmark-detection-on-300w. Ideally, we can find the NME calculated for only one of the images contained in the 300W dataset (instead of benchmarking against the total NME of all images).

For the NME definition, see section 3.1 of https://arxiv.org/pdf/2210.07233v1.pdf
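
For orientation, the NME is commonly written as below, where N is the number of landmarks, p_i and g_i are the predicted and ground-truth coordinates of landmark i, and d is the normalization distance (for 300W, often the inter-ocular distance). This is the widely used form, given here as an assumption; the section cited above remains the authoritative definition.

```latex
\mathrm{NME} = \frac{1}{N} \sum_{i=1}^{N} \frac{\lVert p_i - g_i \rVert_2}{d}
```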

rabah-khalek

For traceability and generalisability, could we use the 2D face_alignment model on one of the benchmark images from the 300W to generate TEST_MARKS and PREDICT_MARKS?
I resolved the conflicts caused by the last refactoring of the tests, but could you please also improve the coverage of the new validations we have for the metrics?
If you think benchmarking against an NME number from the literature is tedious, we can drop it and just rely on the fact that we're testing the Euclidean distances using different methods.
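
As an illustration of checking Euclidean distances with different methods, the sketch below computes per-landmark distances three independent ways and asserts that they agree. The landmark values are made up for the example and the variable names are hypothetical.

```python
import math

import numpy as np

# Made-up ground-truth and predicted landmarks, shape (N, 2).
truth = np.array([[30.0, 40.0], [60.0, 45.0], [45.0, 70.0]])
pred = np.array([[33.0, 44.0], [60.0, 50.0], [37.0, 64.0]])

# Method 1: vectorized norm along the coordinate axis.
d_norm = np.linalg.norm(pred - truth, axis=1)

# Method 2: explicit square root of the summed squared differences.
d_manual = np.sqrt(((pred - truth) ** 2).sum(axis=1))

# Method 3: pure-Python loop using math.hypot on each landmark pair.
d_loop = np.array(
    [math.hypot(px - tx, py - ty) for (px, py), (tx, ty) in zip(pred, truth)]
)

# All three methods should agree up to floating-point tolerance.
assert np.allclose(d_norm, d_manual)
assert np.allclose(d_norm, d_loop)
```

Agreement between independent computations does not prove the metric matches the literature, but it does guard against indexing or axis mistakes in the distance step.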

Inokinoki · 2023-12-12T09:26:16Z

Thanks for the review @rabah-khalek !

> For traceability and generalisability, could we use the 2D face_alignment model on one of the benchmark images from the 300W to generate TEST_MARKS and PREDICT_MARKS?

Yes, I will try to add a fixture for the model and a test dataset with one image. The resources already exist on my side, but they are somehow ignored by my local git repo.

> I resolved the conflicts caused by the last refactoring of the tests, but could you please also improve the coverage of the new validations we have for the metrics?

Doing it today ;)

> If you think benchmarking against an NME number from the literature is tedious, we can drop it and just rely on the fact that we're testing the Euclidean distances using different methods.

I think it's very interesting. I am just wondering how to compare with the NME numbers from the literature, if our calculation is based on one image.
Their NME should be over the given test dataset, if I didn't misunderstand. Should we predict the full dataset with the model in this case?

rabah-khalek · 2023-12-12T14:53:43Z

> Their NME should be over the given test dataset, if I didn't misunderstand. Should we predict the full dataset with the model in this case?

Yes, I was thinking of using the whole 300W dataset.

Inokinoki · 2023-12-20T13:52:46Z

It seems that face_alignment returns different results on Python 3.10 (test passes) and Python 3.11 (test fails):

=========================== short test summary info ============================
FAILED tests/test_benchmark_nmes.py::test_face_alignment_model - assert False
 +  where False = <function isclose at 0x7f6b13d32270>(0.06233510979950631, 0.06287962278002229)
 +    where <function isclose at 0x7f6b13d32270> = np.isclose
=================== 1 failed, 12 passed in 112.01s (0:01:52) ===================
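
The two values in the failing assertion differ by about 5.4e-4, roughly 0.9% in relative terms, which is far outside the default tolerances of `np.isclose` (`rtol=1e-05`, `atol=1e-08`). The sketch below only reproduces that comparison; the looser `rtol=1e-2` is an illustrative assumption, not a tolerance adopted in this PR.

```python
import numpy as np

# The two NME values reported in the failing assertion above.
a = 0.06233510979950631
b = 0.06287962278002229

abs_diff = abs(a - b)          # absolute gap between the two runs
rel_diff = abs_diff / abs(b)   # gap relative to the second value

# np.isclose checks |a - b| <= atol + rtol * |b|.
default_close = bool(np.isclose(a, b))            # defaults: rtol=1e-05, atol=1e-08
loose_close = bool(np.isclose(a, b, rtol=1e-2))   # illustrative looser tolerance

print(f"abs diff = {abs_diff:.3e}, rel diff = {rel_diff:.3%}")
print(f"default tolerance: {default_close}, rtol=1e-2: {loose_close}")
```

Whether loosening the tolerance is acceptable depends on how much run-to-run variation in the model's predictions the test is meant to absorb.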

Inokinoki added 3 commits December 7, 2023 22:16

Add unit tests for _calculate_es

6232556

Add unit test for distance of NME

b9566d5

Add unit test for NME

2b53f37

rabah-khalek added 5 commits December 9, 2023 14:34

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

fa30136

fixed conflicts with main

cd8dd5b

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

63ed985

better handle of test_calculate_es_2d

bd5602d

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

5560c0d

rabah-khalek suggested changes Dec 9, 2023

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

3c49fc1

rabah-khalek marked this pull request as draft December 9, 2023 16:12

rabah-khalek and others added 11 commits December 12, 2023 18:04

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

9ac2dd7

Use fixture to load sample dataset for testing

3956a86

Add unit tests for ME mean, std and NME mean, std

73eb359

Unify calculation in test to reduce duplications

fa00915

Add test to benchmark face alignment and opencv

842bd26

Merge branch 'main' of https://github.com/Giskard-AI/loreal-poc into feature/gsk-2247-benchmark-the-nme-implementation

831d44b

Update tests with new classes

13633a9

Extract full dataset from S3

a2af06b

Separate indoor and outdoor 300W datasets

57660a6

Fix path for decompressed datasets

b012d36

Only extract datasets if dir not exists

c2ff140

Inokinoki marked this pull request as ready for review December 18, 2023 18:10

Inokinoki added 3 commits December 19, 2023 13:09

Assert dataset nmes are not nan

181802b

Predict with batch in NMEs benchmarks

0387952

Only predict and compare the first 5 samples

d90e450

Inokinoki requested a review from rabah-khalek December 19, 2023 15:05

rabah-khalek and others added 3 commits December 20, 2023 06:06

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

5bf1d8e

Add cache for test resources

7efb5ec

Test the example images and compare to local value

6855e6a

rabah-khalek and others added 8 commits December 20, 2023 18:34

Merge branch 'main' into feature/gsk-2247-benchmark-the-nme-implementation

c912190

working on tests

898481b

Add conftest.py to include fixtures

cb53fee

refactoring

64df970

changed name of a test

bc88c98

Fix typo in test name

b26e6be

Use different cache for different python version

1c4384a

removing face_alignment from tests

e301ad4

rabah-khalek enabled auto-merge December 22, 2023 11:31

rabah-khalek approved these changes Dec 22, 2023

rabah-khalek merged commit 005239a into main Dec 22, 2023
3 checks passed

rabah-khalek deleted the feature/gsk-2247-benchmark-the-nme-implementation branch December 22, 2023 12:04
