Contributing to mmlearn

Thanks for your interest in contributing to mmlearn!

To submit a PR, please fill out the PR template. If the PR fixes an issue, please include a link to that issue in the PR. Below are some details on important things to consider before contributing to the library. A table of contents also appears below for navigation.

Development Practices
Development Requirements
Coding Guidelines, Formatters, and Checks
Code Documentation
Tests

Development Practices

We use the standard git development flow of branching and merging to main through PRs on GitHub. At least one member of the core team needs to approve a PR before it can be merged into main. Tests are run automatically on PRs with a merge target of main (see the Tests section). A suite of static code checkers and formatters is also run on these PRs, and these must pass for a PR to be eligible for merging into the main branch of the library. Currently, these checks run on Python 3.9.

Development Requirements

For development and testing, we use Poetry for dependency management. The library dependencies, along with those for development and testing, are listed in the pyproject.toml file. You may use whichever virtual environment management tool you like, such as conda, Poetry itself, or virtualenv.

The easiest way to create and activate a virtual environment is by using the virtualenv package:

virtualenv "ENV_PATH"
source "ENV_PATH/bin/activate"
pip install --upgrade pip poetry
poetry install --with "dev, test" --all-extras

Note that the --with option installs all libraries required for the full development workflow. See the pyproject.toml file for additional details on what is installed with each of these options.

If you need to update the environment libraries, change the requirements in the pyproject.toml file and then update poetry.lock using the command poetry update.

Coding Guidelines, Formatters, and Checks

For code style, we recommend the PEP 8 style guide.

We use ruff for code formatting and static code analysis. Ruff checks a variety of rules, including those from flake8. The pre-commit hooks report errors that you need to fix before submitting a PR.

Last but not least, we use type hints in our code, which are checked using mypy. The mypy checks are strictly enforced: all mypy checks must pass, or the associated PR cannot be merged.
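As a rough illustration of the kind of annotations mypy checks, here is a minimal sketch of a fully type-hinted function. The function name and behavior are hypothetical and not part of mmlearn; the typing imports are chosen to stay compatible with Python 3.9.

```python
from typing import Optional, Sequence


def mean_length(
    texts: Sequence[str], default: Optional[float] = None
) -> Optional[float]:
    """Return the mean length of the given strings, or ``default`` if empty."""
    # Hypothetical helper, shown only to illustrate annotated parameters
    # and return types; it does not exist in the library.
    if not texts:
        return default
    return sum(len(text) for text in texts) / len(texts)
```

With annotations like these, mypy can flag a call such as mean_length(3) before the code is ever run.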

The settings for mypy and ruff can be found in the pyproject.toml file, and some standard checks are defined directly in the .pre-commit-config.yaml settings.

All of these checks and formatters are invoked by pre-commit hooks, which are run remotely on GitHub. To ensure that your code conforms to these standards, and therefore passes the remote checks, you can install the pre-commit hooks to run locally. This is done by running (with your environment active):

pre-commit install

To run the checks, some of which will automatically reformat your code to fit the standards, you can run:

pre-commit run --all-files

The checks can also be run on a subset of files by replacing the --all-files option with --files followed by the specific files to check.

If you're using VS Code for development, pre-commit should also set up git hooks that execute the pre-commit checks each time you commit code to your branch through the integrated source control. This ensures that each of your commits conforms to the desired format before the checks are run remotely, without you needing to remember to run them before pushing to a remote. If this isn't done automatically, you can find instructions for setting up these hooks manually online.

Code Documentation

For code documentation, we try to adhere to the numpy format. Any non-trivial or non-obvious methods added to the library should have a docstring. This applies only to code added to the main library in mmlearn. Code outside the core library folder, such as tests, need not follow the strict documentation rules, though clarifying and helpful comments in that code are also strongly encouraged.

NOTE: As a matter of convention, classes are documented at the "class" level rather than through their __init__ functions.
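To make the convention concrete, the following is a minimal sketch of a numpy-style docstring placed at the class level, with the constructor parameters described there instead of in __init__. The RunningMean class is a hypothetical example and is not part of mmlearn.

```python
class RunningMean:
    """Track the running mean of a stream of values.

    Parameters
    ----------
    initial : float, default=0.0
        Value held before any updates are made.

    Attributes
    ----------
    count : int
        Number of values seen so far.
    """

    # Note: __init__ carries no docstring; its parameters are documented
    # in the class-level docstring above.
    def __init__(self, initial: float = 0.0) -> None:
        self._mean = initial
        self.count = 0

    def update(self, value: float) -> float:
        """Add a value and return the updated mean.

        Parameters
        ----------
        value : float
            The new observation.

        Returns
        -------
        float
            The mean of all values seen so far.
        """
        self.count += 1
        # Incremental mean update; avoids storing the full history.
        self._mean += (value - self._mean) / self.count
        return self._mean
```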

If you are using VS Code, a very helpful extension called autoDocstring is available to facilitate the creation of properly formatted docstrings (see its VS Code page and documentation). This tool automatically generates a docstring template when you start a docstring with triple quotation marks ("""). To get the correct format, the following settings should be specified in your VS Code settings JSON:

{
    "autoDocstring.customTemplatePath": "",
    "autoDocstring.docstringFormat": "numpy",
    "autoDocstring.generateDocstringOnEnter": true,
    "autoDocstring.guessTypes": true,
    "autoDocstring.includeExtendedSummary": false,
    "autoDocstring.includeName": false,
    "autoDocstring.logLevel": "Info",
    "autoDocstring.quoteStyle": "\"\"\"",
    "autoDocstring.startOnNewLine": true
}

Tests

All tests for the library are housed in the tests folder. The unit and integration tests are run using pytest. These tests are automatically run through GitHub integrations on PRs to the main branch of this repository. PRs that fail any of the tests will not be eligible to be merged until they are fixed.

To run all tests in the tests folder, run the following (with the venv active):

pytest .

To run a specific test file with pytest, run:

pytest tests/datasets/test_combined_dataset.py
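For contributors new to pytest, here is a minimal sketch of what a test module might look like. The clamp helper and the test names are hypothetical and exist only for this illustration; a real test would import the code under test from mmlearn instead. By default, pytest collects functions whose names start with test_ from files named test_*.py or *_test.py.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Limit ``value`` to the closed interval [low, high]."""
    # Hypothetical helper defined inline so the example is self-contained.
    return max(low, min(value, high))


def test_clamp_keeps_values_inside_range() -> None:
    # A value already inside the range is returned unchanged.
    assert clamp(0.5, 0.0, 1.0) == 0.5


def test_clamp_limits_values_outside_range() -> None:
    # Values below or above the range are pulled to the nearest bound.
    assert clamp(-1.0, 0.0, 1.0) == 0.0
    assert clamp(2.0, 0.0, 1.0) == 1.0
```

A single test function can be selected by appending ::test_name to the file path passed to pytest.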
