TASK-6347 Fix normalization issue in ClinVar #693

Merged
merged 7 commits into from
Jun 26, 2024

Conversation

jtarraga
Member

@jtarraga jtarraga commented Jun 21, 2024

TASK-6347 normalization issue in ClinVar

jtarraga added 6 commits June 4, 2024 13:05
… clinical variants, #TASK-6347

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/ClinicalIndexer.java
…nd Gwas, files to download; and update Cosmic version, #TASK-6347

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-core/src/main/resources/configuration.yml
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/CosmicIndexer.java
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/download/ClinicalDownloadManager.java
…on of the GWAS Catalog, #TASK-6347

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-core/src/main/resources/configuration.yml
…sion, #TASK-6347

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/EtlCommons.java
On branch TASK-6347
Changes to be committed:
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/ClinVarIndexer.java
…file; and fix some Sonar issues, #TASK-6347

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/GwasIndexer.java
@jtarraga jtarraga requested a review from j-coll June 21, 2024 11:47
try {
    accession = publicSet.getReferenceClinVarAssertion().getClinVarAccession().getAcc();
} catch (Exception e) {
    logger.warn("Error getting accession\n" + StringUtils.join(e.getStackTrace(), "\n"));
Member

@j-coll j-coll Jun 25, 2024

Better to let the logger print the stack trace:

logger.warn("Error getting accession", e);

You can also state what will happen to the accession:

logger.warn("Error getting accession. Ignore error and leave accession as null.", e);

This applies to all the other fields read in this method.
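
Since the same pattern would repeat for every field, one option is to factor it into a small helper. The following is only a sketch under assumptions: `SafeFieldReader` and `read` are hypothetical names that do not exist in CellBase, the value `EXAMPLE_ACC` is made up, and `java.util.logging` is used instead of the logger from the PR so that the example runs without extra dependencies. The point it illustrates is the one from the comment above: hand the exception object to the logger rather than joining the stack trace into the message.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper (not part of CellBase): reads one optional field and
// returns null if any step of the getter chain throws.
final class SafeFieldReader {
    private static final Logger LOGGER = Logger.getLogger(SafeFieldReader.class.getName());

    private SafeFieldReader() {
    }

    static <T> T read(String fieldName, Supplier<T> getter) {
        try {
            return getter.get();
        } catch (RuntimeException e) {
            // Pass the exception itself so the logging framework formats the stack trace,
            // and say in the message what happens to the field.
            LOGGER.log(Level.WARNING,
                    "Error getting " + fieldName + ". Ignore error and leave " + fieldName + " as null.", e);
            return null;
        }
    }

    public static void main(String[] args) {
        String missing = null;
        // Simulates a broken getter chain: calling a method on null throws NullPointerException.
        String failed = read("accession", () -> missing.trim());
        // A getter chain that succeeds returns its value unchanged.
        String found = read("accession", () -> "EXAMPLE_ACC");
        System.out.println(failed + " / " + found);
    }
}
```

With a helper like this, each field in the method would become a single call, and the warning text stays consistent across fields. Whether that is worth the indirection here is a judgment call for the author.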

while ((line = inputReader.readLine()) != null) {
    ++lineCounter;
    if (!line.isEmpty()) {
        processedGwasLines++;
Member

You might want to use an org.opencb.commons.ProgressLogger here:

ProgressLogger progressLogger = new ProgressLogger("Lines parsed").setBatchSize(10000);

...
if (!line.isEmpty()) {
  progressLogger.increment(1);
}

Member

This is just a suggestion; no action is required (i.e. it's optional).

Member Author

Hmm, that's useful, but I can't find a method to get the current number of parsed lines. I need that number so I can include it in the log messages when parsing fails.

Member

That is very inconvenient... there is no way of obtaining that.
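
For reference, one possible way around this limitation is to keep the count next to the progress reporting. The sketch below is not the `org.opencb.commons` class: `CountingProgress` and `getCount` are hypothetical names, the batch logging is a simplified stand-in, and the input lines are invented. It only shows the idea of exposing the running count so that parse-failure messages can include it.

```java
import java.util.logging.Logger;

// Hypothetical stand-in (not org.opencb.commons.ProgressLogger): reports progress
// in batches and also exposes the running count through a getter.
final class CountingProgress {
    private static final Logger LOGGER = Logger.getLogger(CountingProgress.class.getName());

    private final String message;
    private final long batchSize;
    private long count;

    CountingProgress(String message, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.message = message;
        this.batchSize = batchSize;
    }

    void increment(long delta) {
        long before = count;
        count += delta;
        // Log once each time a batch boundary is crossed.
        if (count / batchSize > before / batchSize) {
            LOGGER.info(message + ": " + count);
        }
    }

    long getCount() {
        return count;
    }

    public static void main(String[] args) {
        CountingProgress progress = new CountingProgress("Lines parsed", 2);
        String[] lines = {"first", "", "second", "bad", "third"}; // invented input
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            progress.increment(1);
            if (line.equals("bad")) {
                // The running count is available for the failure message.
                LOGGER.warning("Error parsing non-empty line number " + progress.getCount());
            }
        }
    }
}
```

Keeping the existing counters from the PR alongside a progress logger would achieve the same thing without a wrapper; the wrapper merely keeps the two numbers from drifting apart.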

On branch TASK-6347
Changes to be committed:
	modified:   cellbase-lib/src/main/java/org/opencb/cellbase/lib/builders/clinical/variant/ClinVarIndexer.java
@jtarraga jtarraga requested a review from j-coll June 25, 2024 14:08
@jtarraga jtarraga merged commit dcbb95c into release-5.8.x Jun 26, 2024
4 checks passed
@jtarraga jtarraga deleted the TASK-6347 branch June 26, 2024 13:48