Adding installation guide. #115

Merged
merged 7 commits into main from issue-114 on May 21, 2024

Conversation

@cmungall (Member) commented May 7, 2024

Fixes #114

@ielis self-requested a review on May 8, 2024

@ielis (Member) left a comment

Hi @cmungall

thank you very much for the PR!

If you do not mind, I changed .venv to venv (I somewhat dislike hidden folders, and we do not track venv in .gitignore), and I recommend that developers upgrade pip before the installation.

Otherwise it looks good, thanks again!

@ielis (Member) commented May 8, 2024

@pnrobinson the CI action is failing because the notebook uses a relative path to an HPO JSON file instead of the OntologyStore API of the HPO toolkit (see here).

Please use the OntologyStore every time you work with HPO. It would probably be better to have pyphetools.creation.TemplateImporter take an hpotk.MinimalOntology instead of a str.
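
For reference, loading HPO through the OntologyStore looks roughly like this (a minimal sketch of my reading of the HPO toolkit API; the release string is only an example):

import hpotk

# The store caches downloaded HPO releases locally, so repeated runs
# do not need to re-download hp.json or rely on a relative path.
store = hpotk.configure_ontology_store()

# Load a MinimalOntology for a pinned release; omit `release` to get the latest.
hpo = store.load_minimal_hpo(release='v2024-04-26')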

@cmungall (Member, Author) commented May 8, 2024

Thanks! No strong opinion on the dot (I am a poetry-head myself 😄 )

And yes, I was wondering about the failing action.

@pnrobinson (Member) commented

Adjust

hp_json = "../hp.json"
created_by = "ORCID:0000-0002-0736-9199"
timporter = TemplateImporter(template=template, hp_json=hp_json, created_by=created_by)

in the example file so that we do not need the local hp.json.
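
For illustration, the adjusted snippet might look roughly like this, assuming TemplateImporter were changed to accept an hpotk.MinimalOntology as suggested above (the hpo= parameter is hypothetical, not the current signature):

import hpotk

# Load HPO via the ontology store instead of a local ../hp.json.
store = hpotk.configure_ontology_store()
hpo = store.load_minimal_hpo()

created_by = "ORCID:0000-0002-0736-9199"
# Hypothetical call: assumes TemplateImporter accepted an hpotk.MinimalOntology
# (e.g. via an hpo= parameter) instead of a path, as proposed above.
timporter = TemplateImporter(template=template, hpo=hpo, created_by=created_by)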

@pnrobinson (Member) commented

I updated the templates in phenopacket-store.
@ielis One issue is that the templates reach out to the VariantValidator API for many variants. Sometimes there is a network issue and one needs to simply rerun the cell. This can cause the CI to fail even though the code is correct. I am not sure whether we want to keep this in the default CI or make this action optional.

@pnrobinson (Member) commented

@ielis
It is probably better to put the CI tests in this repo -- for phenopacket-store, after we have generated the phenopackets, there is no need to rerun the scripts unless we notice a mistake. Also, in this repository we can check in pickled versions of the variant-validator response, which will avoid network issues. Was there any reason to do this via phenopacket-store, or would this also be OK?

@ielis (Member) commented May 21, 2024

> @ielis One issue is that the templates reach out to the VariantValidator API for many variants. Sometimes there is a network issue and one needs to simply rerun the cell.

Yeah, this is a pain, and in general we should avoid depending on the network for the tests to pass. In other projects, we mark the network-dependent tests with a special annotation, @pytest.mark.online.

For instance:

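# Excerpt from a test class in another project; the pytest import, the
# Strand/VVTranscriptCoordinateService types, and the coordinate_service
# fixture are defined elsewhere and omitted here.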
@pytest.mark.online
@pytest.mark.parametrize(
    'tx_id, contig, start, end, strand',
    [
        # ANKRD11
        ('NM_013275.6', '16', 847_784, 1_070_716, Strand.NEGATIVE),
        # MAPK8IP3
        ('NM_001318852.2', '16', 1_706_194, 1_770_351, Strand.POSITIVE),
    ]
)
def test_fetch__end_to_end(
        self,
        coordinate_service: VVTranscriptCoordinateService,
        tx_id: str,
        contig: str, start: int, end: int, strand: Strand,
):
    tx_coordinates = coordinate_service.fetch(tx_id)
    assert tx_coordinates.identifier == tx_id
    # And more assertions follow...

The tested coordinate_service pings Variant Validator's REST API and is therefore not run unless we invoke pytest with the --runonline flag:

pytest --runonline

The --runonline flag includes the tests annotated with @pytest.mark.online in the test suite.

We can do something similar here to remove spurious failures, and we will only run the online tests now and then. Note that a special setup must be done in order to use --runonline. @pnrobinson, let me know if you'd like me to do that here.
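
The setup usually amounts to a few hooks in conftest.py; roughly the following generic pattern (a sketch, not necessarily the exact code from the other projects):

# conftest.py
import pytest


def pytest_addoption(parser):
    # Register the opt-in flag for network-dependent tests.
    parser.addoption(
        '--runonline', action='store_true', default=False,
        help='run tests that require network access (e.g. Variant Validator)',
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line('markers', 'online: the test requires network access')


def pytest_collection_modifyitems(config, items):
    if config.getoption('--runonline'):
        return  # flag given: run everything, including online tests
    skip_online = pytest.mark.skip(reason='need --runonline option to run')
    for item in items:
        if 'online' in item.keywords:
            item.add_marker(skip_online)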


> ... Also, in this repository we can check in pickled versions of the variant-validator response, which will avoid network issues. ...

I would favor JSON instead of pickle, because JSON files are easier to update when the VV format changes. I've done something similar elsewhere, so in principle it can be done. We should, however, decide whether we want to use the @pytest.mark.online annotation before addressing this point.
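
As a rough sketch, the checked-in responses could be handled with small helpers like these (file layout and helper names are hypothetical):

import json
from pathlib import Path

# Hypothetical location for checked-in Variant Validator responses.
FIXTURE_DIR = Path(__file__).parent / 'data' / 'vv'


def load_vv_response(tx_id: str) -> dict:
    # Offline tests read the stored JSON instead of calling the REST API.
    with open(FIXTURE_DIR / f'{tx_id}.json') as fh:
        return json.load(fh)


def save_vv_response(tx_id: str, payload: dict) -> None:
    # Refresh a fixture from a live response, e.g. when the VV format changes.
    FIXTURE_DIR.mkdir(parents=True, exist_ok=True)
    with open(FIXTURE_DIR / f'{tx_id}.json', 'w') as fh:
        json.dump(payload, fh, indent=2)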


> ... It is probably better to put the CI tests in this repo -- for phenopacket-store, after we have generated the phenopackets, there is no need to rerun the scripts unless we notice a mistake.

I am OK with deleting the generate_phenopackets.yml action in order to skip running the phenopacket-store notebooks as part of pyphetools' CI. @pnrobinson, please remove the action definition YAML or let me know; I can do it too.

@pnrobinson (Member) commented

I think it is probably OK to run the tests locally. We do not need to worry about "old" notebooks or the like, because we only re-run the phenopackets if there was a mistake in the data. I think it is probably best to remove the CI for now. We will need to rethink the entire pipeline soon as part of revising the HPO annotation pipeline, and a lot of work needs to be done, but we are already testing the phenopackets with ppt etc. @ielis

@ielis (Member) commented May 21, 2024

Then I will drop the notebook action, and I will deal with the network-dependent tests in #117.

@ielis merged commit 21dc084 into main on May 21, 2024 (3 checks passed).
@ielis deleted the issue-114 branch on May 21, 2024.
Merging this pull request may close issue #114: Add installation instructions.