Potential bug with cluster_sel_epsilon option. Rounded to 0 #95

Open
aringeri opened this issue Aug 1, 2024 · 0 comments
aringeri commented Aug 1, 2024

Hi, I've been reading through the implementation of NanoCLUST to get some ideas for clustering my own nanopore data.

I have found a potential bug with the cluster_sel_epsilon option at the hdbscan clustering step.
The line here:

umap_out["bin_id"] = hdbscan.HDBSCAN(min_cluster_size=int($params.min_cluster_size), cluster_selection_epsilon=int($params.cluster_sel_epsilon)).fit_predict(X)

The part in question is the cast: int($params.cluster_sel_epsilon)

My understanding is that Python's int() function truncates a float toward zero rather than rounding it to the nearest integer. So even if params.cluster_sel_epsilon is set to 0.5, int(0.5) evaluates to 0.

This may be misleading for anyone who attempts to configure the params.cluster_sel_epsilon option, as the fractional part is silently dropped and any value between 0 and 1 becomes 0.
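
Here is a minimal sketch of the behaviour, runnable without NanoCLUST or hdbscan. The variable cluster_sel_epsilon is a hypothetical stand-in for the value Nextflow substitutes for $params.cluster_sel_epsilon, and replacing int() with float() is only a suggested fix, not something the maintainers have confirmed.

```python
# Hypothetical stand-in for the value substituted for $params.cluster_sel_epsilon.
cluster_sel_epsilon = 0.5

# What the current line effectively passes to HDBSCAN:
# int() drops the fractional part (truncation toward zero).
truncated = int(cluster_sel_epsilon)

# A possible fix (assumption): keep the value as a float, which is the type
# hdbscan documents for cluster_selection_epsilon.
preserved = float(cluster_sel_epsilon)

print("int():  ", truncated)
print("float():", preserved)
```

With this change the affected argument would read cluster_selection_epsilon=float($params.cluster_sel_epsilon), while min_cluster_size can stay as int() since it is an integer count.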
