Fix for frequency in multi-study view #11331

eugeniomazzone · 2025-01-13T15:58:16Z

Describe changes proposed in this pull request:

The function AlterationCountServiceUtil.setupAlterationGeneCountsMap now initializes profiledCases to 0.
Added code in AlterationCountServiceImpl, run after that function is called, that sets the correct number of profiled samples for each study and each gene.

After the fix, the frequency displayed in the multi-study view is correct: 23.8% as expected, instead of 100%.
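To illustrate why the profiled-case count drives the displayed frequency, here is a minimal sketch. It is not the cBioPortal code: `FrequencySketch`, `GeneCount`, `StudyResult`, and both methods are hypothetical, and the numbers are invented. It assumes that a study only returns an entry for a gene when that gene has at least one alteration, so summing profiled cases from those entries alone leaves out studies where the gene is unaltered.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not cBioPortal classes.
class FrequencySketch {

    /** Counts for one gene within one study. */
    static final class GeneCount {
        final String gene;
        final int altered;
        final int profiled;

        GeneCount(String gene, int altered, int profiled) {
            this.gene = gene;
            this.altered = altered;
            this.profiled = profiled;
        }
    }

    /** One queried study: its case count plus entries for genes that have alterations. */
    static final class StudyResult {
        final int profiledCases;
        final List<GeneCount> alteredGenes;

        StudyResult(int profiledCases, List<GeneCount> alteredGenes) {
            this.profiledCases = profiledCases;
            this.alteredGenes = alteredGenes;
        }
    }

    /** Sums profiled cases only from studies that returned an entry for the gene. */
    static double naiveFrequency(String gene, List<StudyResult> studies) {
        int altered = 0;
        int profiled = 0;
        for (StudyResult study : studies) {
            for (GeneCount count : study.alteredGenes) {
                if (count.gene.equals(gene)) {
                    altered += count.altered;
                    profiled += count.profiled;
                }
            }
        }
        return profiled == 0 ? 0.0 : 100.0 * altered / profiled;
    }

    /** Starts the profiled count at 0 and adds every queried study's case count. */
    static double perStudyFrequency(String gene, List<StudyResult> studies) {
        int altered = 0;
        int profiled = 0;
        for (StudyResult study : studies) {
            profiled += study.profiledCases; // counted even if the gene is unaltered here
            for (GeneCount count : study.alteredGenes) {
                if (count.gene.equals(gene)) {
                    altered += count.altered;
                }
            }
        }
        return profiled == 0 ? 0.0 : 100.0 * altered / profiled;
    }

    public static void main(String[] args) {
        // Invented data: study A has 4 altered of 10 cases, study B has 30 cases and none altered.
        List<StudyResult> studies = List.of(
                new StudyResult(10, List.of(new GeneCount("EGFR", 4, 10))),
                new StudyResult(30, List.of()));
        System.out.println(naiveFrequency("EGFR", studies));    // 40.0
        System.out.println(perStudyFrequency("EGFR", studies)); // 10.0
    }
}
```

With the invented data, the first method divides 4 by 10 and the second divides 4 by 40, which is the kind of gap described above. Like the change in this PR, the sketch treats every case in a study as profiled for the gene.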

Before fix:

After fix:

Rebase to original branch

zeynepkaragoz · 2025-01-13T16:17:56Z

@inodb Eugenio worked on fixing one of the open issues, could you point us to some potential reviewers please?

inodb · 2025-01-13T19:56:35Z

@haynescd @alisman this is a fix in legacy code for determining alteration frequency in multi-study query. Thoughts?

haynescd · 2025-01-17T19:33:29Z

src/main/java/org/cbioportal/service/impl/AlterationCountServiceImpl.java

+                    List<S> studyAlterationCountByGenes = dataFetcher.apply(studyMolecularProfileCaseIdentifiers);
+                    if (includeFrequency) {
+                        Long studyProfiledCasesCount = includeFrequencyFunction.apply(studyMolecularProfileCaseIdentifiers, studyAlterationCountByGenes);
+                        profiledCasesCount.updateAndGet(v -> v + studyProfiledCasesCount);
+                    }
+                    Map<String, S> studyResult = new HashMap<>();
+                    studyAlterationCountByGenes.forEach(datum -> {
+                        String key = datum.getUniqueEventKey();
+                        studyResult.put(key, datum);
+                    });
+                    List<S>  allGene= new ArrayList<>(totalResult.values());
+                    allGene.forEach(datum -> {
+                        String key = datum.getUniqueEventKey();
+                        S alterationCountByGene = totalResult.get(key);
+                        alterationCountByGene.setNumberOfProfiledCases(alterationCountByGene.getNumberOfProfiledCases() + studyMolecularProfileCaseIdentifiers.size());
+                        Set<String> matchingGenePanelIds = new HashSet<>();
+                        if (!alterationCountByGene.getMatchingGenePanelIds().isEmpty()) {
+                            matchingGenePanelIds.addAll(alterationCountByGene.getMatchingGenePanelIds());
+                        }
+                        if (!datum.getMatchingGenePanelIds().isEmpty()) {
+                            matchingGenePanelIds.addAll(datum.getMatchingGenePanelIds());
+                        }
+                        alterationCountByGene.setMatchingGenePanelIds(matchingGenePanelIds);
+                        totalResult.put(key, alterationCountByGene);
+                    });
+                });


Let's break this out into functions instead of one big anonymous function. Also, can we add some comments?

eugeniomazzone · 2025-01-20

Moved the code to a separate function, updateAlterationGeneCountsMap, mimicking setupAlterationGeneCountsMap (see the new commit at https://github.com/eugeniomazzone/cbioportal/tree/master).
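For readers without the commit at hand, a rough sketch of what such an extracted helper could look like follows. The method name comes from the comment above, but the signature, the `GeneCount` stand-in, and the parameter names are assumptions, not the code in the commit. It also assumes the intent is to merge gene panel IDs from the study's own entry for the same gene.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only; the real helper lives in AlterationCountServiceUtil and may differ.
class UpdateCountsSketch {

    /** Minimal stand-in for an alteration count entry (not the real AlterationCountBase). */
    static class GeneCount {
        int numberOfProfiledCases;
        Set<String> matchingGenePanelIds = new HashSet<>();
    }

    /**
     * Adds one study's case count to every gene already present in totalResult,
     * and merges in the gene panel IDs that the study reported for the same gene.
     *
     * @param totalResult    accumulated counts across studies, keyed by unique event key
     * @param studyResult    counts returned for the current study, keyed the same way
     * @param studyCaseCount number of cases queried in the current study
     */
    static void updateAlterationGeneCountsMap(Map<String, GeneCount> totalResult,
                                              Map<String, GeneCount> studyResult,
                                              int studyCaseCount) {
        for (Map.Entry<String, GeneCount> entry : totalResult.entrySet()) {
            GeneCount total = entry.getValue();
            // Every gene gets the study's cases, whether or not it is altered in this study.
            total.numberOfProfiledCases += studyCaseCount;

            GeneCount fromStudy = studyResult.get(entry.getKey());
            if (fromStudy != null) {
                total.matchingGenePanelIds.addAll(fromStudy.matchingGenePanelIds);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, GeneCount> total = new HashMap<>();
        total.put("EGFR", new GeneCount());
        updateAlterationGeneCountsMap(total, new HashMap<>(), 30);
        System.out.println(total.get("EGFR").numberOfProfiledCases); // 30
    }
}
```

Keeping the accumulation in a named static method that mutates the map in place mirrors the shape of setupAlterationGeneCountsMap and leaves the lambda in the service with only the per-study orchestration.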

haynescd · 2025-01-17T19:34:43Z

src/main/java/org/cbioportal/service/util/AlterationCountServiceUtil.java

@@ -148,7 +148,8 @@ public static <S extends AlterationCountBase> void setupAlterationGeneCountsMap(
                S alterationCountByGene = totalResult.get(key);
                alterationCountByGene.setTotalCount(alterationCountByGene.getTotalCount() + datum.getTotalCount());
                alterationCountByGene.setNumberOfAlteredCases(alterationCountByGene.getNumberOfAlteredCases() + datum.getNumberOfAlteredCases());
-                alterationCountByGene.setNumberOfProfiledCases(alterationCountByGene.getNumberOfProfiledCases() + datum.getNumberOfProfiledCases());
+                alterationCountByGene.setNumberOfProfiledCases(0);
+                //alterationCountByGene.setNumberOfProfiledCases(alterationCountByGene.getNumberOfProfiledCases() + datum.getNumberOfProfiledCases());


Why is it just commented out?

eugeniomazzone · 2025-01-20

The old line is now removed (see the new commit at https://github.com/eugeniomazzone/cbioportal/tree/master).

fuzhaoyuan · 2025-01-22T18:45:56Z

Tests are not passing. Could you run mvn clean package in your local environment? @eugeniomazzone

alisman · 2025-01-24T20:13:58Z

It looks like legacy is not reporting "not_profiled" status correctly. Compare the following curl with /api/ vs /api/column-store/ and you'll see the difference. The legacy implementation in master does report the not-profiled samples.

curl 'http://localhost:8082/api/column-store/mutation-data-counts/fetch?projection=SUMMARY' \
  -H 'Accept: application/json' \
  -H 'Accept-Language: en-US,en;q=0.9' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: JSESSIONID=0B4EB4ED93E52DB8CA6991CBFAAA88CB' \
  -H 'Origin: http://localhost:8082' \
  -H 'Pragma: no-cache' \
  -H 'Referer: http://localhost:8082/study/summary?id=brca_tcga_pan_can_atlas_2018' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "macOS"' \
  --data-raw '{"genomicDataFilters":[{"hugoGeneSymbol":"EGFR","profileType":"mutations"}],"studyViewFilter":{"mutationDataFilters":[{"hugoGeneSymbol":"EGFR","profileType":"mutations","values":[[{"value":"NOT_MUTATED"}]],"categorization":"MUTATED"}],"studyIds":["brca_tcga_pan_can_atlas_2018"],"alterationFilter":{"copyNumberAlterationEventTypes":{"AMP":true,"HOMDEL":true},"mutationEventTypes":{"any":true},"structuralVariants":null,"includeDriver":true,"includeVUS":true,"includeUnknownOncogenicity":true,"includeUnknownTier":true,"includeGermline":true,"includeSomatic":true,"includeUnknownStatus":true,"tiersBooleanMap":{}}}}'

eugeniomazzone · 2025-02-03T14:41:27Z

Today I looked at the tests (I hadn't done that before). I added the new function to the tests and initialized a MolecularProfileCaseIdentifier with a generic name for each sample to make them work. Finally, the tests now specify that some functions are called twice.

sonarqubecloud · 2025-02-03T15:46:03Z

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
92.5% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

eugenio added 6 commits January 13, 2025 11:48

Fix frequency issue for same genePanel

1354c03

Rebase to original branch

Add sketchy fix for frequency

0f350d2

Start fix for general freq

ba9b193

Start add logic for freq .1

368c13b

Change lines order

c4b100f

Rebase with working changes

5a157c9

eugeniomazzone marked this pull request as ready for review January 13, 2025 16:20

inodb requested review from alisman and haynescd January 13, 2025 19:55

haynescd reviewed Jan 17, 2025

eugenio added 2 commits January 20, 2025 11:54

Remove comment and made function

22078e5

Add some comments

a533e29

fuzhaoyuan self-requested a review January 22, 2025 15:37

dippindots assigned fuzhaoyuan Jan 23, 2025

fuzhaoyuan assigned eugeniomazzone Jan 23, 2025

Modify test code to test for new function/changes

94fbda8
