-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor bwc test suite to re-use existing resources between tests #1171
base: main
Are you sure you want to change the base?
refactor bwc test suite to re-use existing resources between tests #1171
Conversation
Signed-off-by: will-hwang <[email protected]>
Signed-off-by: will-hwang <[email protected]>
dbb1174
to
69edfc7
Compare
Signed-off-by: will-hwang <[email protected]>
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1171 +/- ##
===========================================
Coverage 81.72% 81.72%
+ Complexity 2494 1247 -1247
===========================================
Files 186 93 -93
Lines 8426 4213 -4213
Branches 1428 714 -714
===========================================
- Hits 6886 3443 -3443
+ Misses 1000 500 -500
+ Partials 540 270 -270 ☔ View full report in Codecov by Sentry. |
Signed-off-by: will-hwang <[email protected]>
Why do we need this change? Each test should be run independently in a dedicated environment with fresh, uncontaminated data, preventing interference from other tests and ensuring the test accurately evaluates the functionality of the component being tested in isolation. |
We have faced earlier issues with this where during the test run due to shared names the pipeline either used to get deleted or wrong information has been shared amongst test. |
@@ -28,9 +28,9 @@ public void testSemanticSearch_E2EFlow() throws Exception { | |||
waitForClusterHealthGreen(NODES_BWC_CLUSTER, 90); | |||
switch (getClusterType()) { | |||
case OLD: | |||
modelId = uploadTextEmbeddingModel(); | |||
modelId = getOrUploadTextEmbeddingModel(getIngestionPipeline(PIPELINE_NAME), TEXT_EMBEDDING_PROCESSOR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case of OLD it will always upload. Then why it is changed to getOrUpload?
@@ -373,11 +373,11 @@ private static int getHitCount(final Map<String, Object> searchResponseAsMap) { | |||
} | |||
|
|||
public static String getModelId(Map<String, Object> pipeline, String processor) { | |||
assertNotNull(pipeline); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition validates that pipeline created earlier has the model Id. At line 377 we are fetching information from a pipeline. So we need to ensure that it is not null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it is a validation condition
@@ -373,11 +373,11 @@ private static int getHitCount(final Map<String, Object> searchResponseAsMap) { | |||
} | |||
|
|||
public static String getModelId(Map<String, Object> pipeline, String processor) { | |||
assertNotNull(pipeline); | |||
if (pipeline == null) return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In anycase, it will not be null.
I suggest reusing only the model. Pipeline is cheap to create. We likely need just one model per type: @vibrantvarun The reason for reusing the model is that during BWC tests, multiple tests deploy the same model simultaneously, leading to model deployment failures during node upgrades. This, in turn, causes flaky test failures. |
@heemin32 I think test resources created for any test should be just meant to run it. The root cause of the model deployment failure is not the redeployment of the model. It should be something else. |
Our goal isn’t to test model deployment—that should be covered by Model deployment is resource-intensive, and if each test deploys its own model, the test cluster might not be large enough to handle them all at once. Sharing models across tests shouldn’t affect test coverage or validity in any way. The primary objective here is to avoid flaky test failures caused by model deployment issues. We also want to ensure that problems in |
@heemin32 in bwc we do test the model deployed in older version of node still exists in new version of node. Because of that test we are always able to find critical issues with ml-commons just before the release. |
FYI: ml-commons till data does not have BWC framework in place. We as neural-search catch issues in BWC tests. |
@heemin32 I am aligned with you on reducing test flakiness but we need to find the actual root cause. Here we are trying to reduce overhead on us by doing less model deployment at the cost of removing model deployment test in bwc. Model deployment is part of each feature we release. |
Description
Currently, models and pipelines are re-created and re-deployed per test, leading to redundant model loads and pipeline creations. This change avoids this redundancy by reusing the created resource between test cases.
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.