Switch to new knncolle libraries with OpenMP/thread-based parallelization. #25

Merged
merged 35 commits into from
Sep 5, 2024
35 commits
9f29237
Stripped out the vendored libraries in favor of assorthead.
LTLA Jul 26, 2024
38c62de
Begin overhauling the C++ code.
LTLA Jul 26, 2024
7970695
Continue the revolution.
LTLA Jul 26, 2024
bae7c4c
Try to add OpenMP support if it's available.
LTLA Jul 26, 2024
f7b2125
Restored Cosine support.
LTLA Jul 27, 2024
0eee7d2
Refactored all of the R code.
LTLA Jul 27, 2024
790e6c7
Cleaned up documentation for the distances.
LTLA Jul 27, 2024
ea8e061
Got all examples working.
LTLA Jul 28, 2024
7dc9108
Hack it out to make it easier to extend at the C/R levels.
LTLA Jul 28, 2024
9ad49b1
Initial suite of tests converted to the new world.
LTLA Jul 28, 2024
c5a0859
Fleshed out the remaining tests.
LTLA Jul 28, 2024
86f7dbc
Made it a bit easier to document.
LTLA Jul 28, 2024
18d7a16
Updated the vignettes.
LTLA Jul 29, 2024
448186c
Pass CHECK.
LTLA Jul 29, 2024
b30f923
Updated NEWS, pre-bumped major version.
LTLA Jul 29, 2024
4dc7fca
Exposed the builder's construction, streamlined cosine support.
LTLA Aug 9, 2024
1c0c800
Clean-ups to the docs around external pointers.
LTLA Aug 9, 2024
1abbd00
Added methods for back-compatibility with BNINDEX.
LTLA Aug 9, 2024
cd3c796
Some more warning text about serialization.
LTLA Aug 9, 2024
ec2a6a2
Ease up on the deprecation warnings.
LTLA Aug 9, 2024
53a0798
Check that we handle NULL pointers gracefully.
LTLA Aug 9, 2024
f6308db
Restore Manhattan tests.
LTLA Aug 9, 2024
b4771aa
Generalized C++ code to allow variable 'k', only pick last distance.
LTLA Aug 9, 2024
a5fea20
Further fixes.
LTLA Aug 9, 2024
5353eb0
Docfix.
LTLA Aug 9, 2024
1790986
Added preliminary support for the find/queryDistance methods.
LTLA Aug 10, 2024
14bd36e
Got rid of some more dead code.
LTLA Aug 10, 2024
0b0fa57
Docfix.
LTLA Aug 10, 2024
f4bb05e
Further editing.
LTLA Aug 10, 2024
9fb555a
Docfixes.
LTLA Aug 10, 2024
72cda36
Further fixes.
LTLA Aug 12, 2024
7b0815c
Bumped the date.
LTLA Sep 5, 2024
b6fc291
Fixed a test.
LTLA Sep 5, 2024
cd30bb2
Don't attempt to add OpenMP support.
LTLA Sep 5, 2024
8016292
Don't attempt OpenMP support as this fails with forking.
LTLA Sep 5, 2024
25 changes: 11 additions & 14 deletions DESCRIPTION
@@ -1,24 +1,18 @@
Package: BiocNeighbors
Version: 1.21.2
Date: 2022-12-17
Version: 1.99.0
Date: 2024-09-05
Title: Nearest Neighbor Detection for Bioconductor Packages
Authors@R: c(person("Aaron", "Lun", role=c("aut", "cre", "cph"),
email="[email protected]"))
Imports:
Rcpp,
S4Vectors,
BiocParallel,
stats,
methods,
Matrix
methods
Suggests:
BiocParallel,
testthat,
BiocStyle,
knitr,
rmarkdown,
FNN,
RcppAnnoy,
RcppHNSW
rmarkdown
biocViews: Clustering, Classification
Description: Implements exact and approximate methods for nearest neighbor
detection, in a framework that allows them to be easily switched within
@@ -29,7 +23,10 @@ Description: Implements exact and approximate methods for nearest neighbor
Parallelization is achieved for all methods by using BiocParallel. Functions
are also provided to search for all neighbors within a given distance.
License: GPL-3
LinkingTo: Rcpp, RcppHNSW
LinkingTo:
Rcpp,
assorthead
VignetteBuilder: knitr
SystemRequirements: C++11
RoxygenNote: 7.1.1
SystemRequirements: C++17
RoxygenNote: 7.3.2
Encoding: UTF-8
69 changes: 6 additions & 63 deletions NAMESPACE
@@ -1,96 +1,39 @@
# Generated by roxygen2: do not edit by hand

export(AnnoyIndex)
export(AnnoyIndex_path)
export(AnnoyIndex_search_mult)
export(AnnoyParam)
export(AnnoyParam_directory)
export(AnnoyParam_ntrees)
export(AnnoyParam_search_mult)
export(ExhaustiveIndex)
export(ExhaustiveParam)
export(HnswIndex)
export(HnswIndex_ef_search)
export(HnswIndex_path)
export(HnswParam)
export(HnswParam_directory)
export(HnswParam_ef_construction)
export(HnswParam_ef_search)
export(HnswParam_nlinks)
export(KmknnIndex)
export(KmknnIndex_cluster_centers)
export(KmknnIndex_cluster_info)
export(KmknnParam)
export(KmknnParam_kmeans_args)
export(VptreeIndex)
export(VptreeIndex_nodes)
export(VptreeParam)
export(bndata)
export(bndistance)
export(bnorder)
export(buildAnnoy)
export(buildExhaustive)
export(buildHnsw)
export(buildIndex)
export(buildKmknn)
export(buildVptree)
export(findAnnoy)
export(findExhaustive)
export(findHnsw)
export(defineBuilder)
export(findDistance)
export(findKNN)
export(findKmknn)
export(findMutualNN)
export(findNeighbors)
export(findVptree)
export(queryAnnoy)
export(queryExhaustive)
export(queryHnsw)
export(queryDistance)
export(queryKNN)
export(queryKmknn)
export(queryNeighbors)
export(queryVptree)
export(rangeFindExhaustive)
export(rangeFindKmknn)
export(rangeFindVptree)
export(rangeQueryExhaustive)
export(rangeQueryKmknn)
export(rangeQueryVptree)
exportClasses(AnnoyIndex)
exportClasses(AnnoyParam)
exportClasses(BiocNeighborIndex)
exportClasses(BiocNeighborParam)
exportClasses(ExhaustiveIndex)
exportClasses(ExhaustiveParam)
exportClasses(HnswIndex)
exportClasses(HnswParam)
exportClasses(KmknnIndex)
exportClasses(KmknnParam)
exportClasses(VptreeIndex)
exportClasses(VptreeParam)
exportMethods("[[")
exportMethods("[[<-")
exportMethods(bndata)
exportMethods(bndistance)
exportMethods(bnorder)
exportMethods(buildIndex)
exportMethods(dim)
exportMethods(dimnames)
exportMethods(defineBuilder)
exportMethods(findDistance)
exportMethods(findKNN)
exportMethods(findNeighbors)
exportMethods(queryDistance)
exportMethods(queryKNN)
exportMethods(queryNeighbors)
exportMethods(show)
import(BiocParallel)
import(methods)
importClassesFrom(S4Vectors,character_OR_NULL)
importFrom(BiocParallel,SerialParam)
importFrom(BiocParallel,bpmapply)
importFrom(BiocParallel,bpnworkers)
importFrom(Matrix,t)
importFrom(Rcpp,sourceCpp)
importFrom(S4Vectors,setValidity2)
importFrom(methods,is)
importFrom(methods,new)
importFrom(methods,show)
importFrom(stats,kmeans)
useDynLib(BiocNeighbors)
27 changes: 3 additions & 24 deletions R/AllClasses.R
@@ -7,34 +7,13 @@ setClass("BiocNeighborParam", contains="VIRTUAL", slots=c(distance="character"))
setClass("ExhaustiveParam", contains="BiocNeighborParam")

#' @export
setClass("KmknnParam", contains="BiocNeighborParam", slots=c(kmeans.args="list"))
setClass("KmknnParam", contains="BiocNeighborParam")

#' @export
setClass("VptreeParam", contains="BiocNeighborParam")

#' @export
setClass("AnnoyParam", contains="BiocNeighborParam", slots=c(ntrees="integer", dir="character", search.mult="numeric"))
setClass("AnnoyParam", contains="BiocNeighborParam", slots=c(ntrees="integer", search.mult="numeric"))

#' @export
setClass("HnswParam", contains="BiocNeighborParam", slots=c(nlinks="integer", ef.construction="integer", dir="character", ef.search="integer"))

# Defines the BiocNeighborIndex class and derivatives.

#' @export
#' @importClassesFrom S4Vectors character_OR_NULL
setClass("BiocNeighborIndex", contains="VIRTUAL", slots=c(data="matrix", NAMES="character_OR_NULL", distance="character"))

#' @export
setClass("ExhaustiveIndex", contains="BiocNeighborIndex")

#' @export
setClass("KmknnIndex", contains="BiocNeighborIndex", slots=c(centers="matrix", info="list", order="integer"))

#' @export
setClass("VptreeIndex", contains="BiocNeighborIndex", slots=c(order="integer", nodes="list"))

#' @export
setClass("AnnoyIndex", contains="BiocNeighborIndex", slots=c(path="character", search.mult="numeric"))

#' @export
setClass("HnswIndex", contains="BiocNeighborIndex", slots=c(path="character", ef.search="integer"))
setClass("HnswParam", contains="BiocNeighborParam", slots=c(nlinks="integer", ef.construction="integer", ef.search="integer"))
65 changes: 35 additions & 30 deletions R/AllGenerics.R
@@ -1,46 +1,51 @@
#' @export
#' @rdname buildIndex
setGeneric("buildIndex", signature=c("BNPARAM"),
function(X, ..., BNPARAM)
standardGeneric("buildIndex")
)
setGeneric("buildIndex", signature=c("X", "BNPARAM"), function(X, transposed=FALSE, ..., BNPARAM=NULL) standardGeneric("buildIndex"))

#' @export
#' @rdname findKNN-methods
setGeneric("findKNN", signature=c("BNINDEX", "BNPARAM"),
function(X, k, ..., BNINDEX, BNPARAM)
standardGeneric("findKNN")
)
#' @rdname defineBuilder
setGeneric("defineBuilder", signature="BNPARAM", function(BNPARAM) standardGeneric("defineBuilder"))

#' @export
#' @rdname queryKNN-methods
setGeneric("queryKNN", signature=c("BNINDEX", "BNPARAM"),
function(X, query, k, ..., BNINDEX, BNPARAM)
standardGeneric("queryKNN")
)
# This is explicitly a S4 generic so that developers can extend it at the R
# level, not at the C++ level. We need to support dispatch on both X and
# BNPARAM as X could be an arbitrary index structure (i.e., not an external
# pointer). If we only dispatched on BNPARAM, a user could call the method with
# a prebuilt X that doesn't match the BNPARAM. This means that the developer of
# the BNPARAM method would be responsible for figuring out what to do with a X
# that they don't know anything about, which is pretty weird.

#' @export
#' @rdname findNeighbors-methods
setGeneric("findNeighbors", signature=c("BNINDEX", "BNPARAM"),
function(X, threshold, ..., BNINDEX, BNPARAM)
standardGeneric("findNeighbors")
)
#' @rdname findKNN
setGeneric("findKNN", signature=c("X", "BNPARAM"), function(X, k, get.index=TRUE, get.distance=TRUE, num.threads=1, subset=NULL, ..., BNPARAM=NULL) {
standardGeneric("findKNN")
})

#' @export
#' @rdname queryNeighbors-methods
setGeneric("queryNeighbors", signature=c("BNINDEX", "BNPARAM"),
function(X, query, threshold, ..., BNINDEX, BNPARAM)
standardGeneric("queryNeighbors")
)
#' @rdname queryKNN
setGeneric("queryKNN", signature=c("X", "BNPARAM"), function(X, query, k, get.index=TRUE, get.distance=TRUE, num.threads=1, subset=NULL, transposed=FALSE, ..., BNPARAM=NULL) {
standardGeneric("queryKNN")
})

#' @export
setGeneric("bnorder", function(x) standardGeneric("bnorder"))
#' @rdname findNeighbors
setGeneric("findNeighbors", signature=c("X", "BNPARAM"), function(X, threshold, get.index=TRUE, get.distance=TRUE, num.threads=1, subset=NULL, ..., BNPARAM=NULL) {
standardGeneric("findNeighbors")
})

#' @export
setGeneric("bndata", function(x) standardGeneric("bndata"))
#' @rdname queryNeighbors
setGeneric("queryNeighbors", signature=c("X", "BNPARAM"), function(X, query, threshold, get.index=TRUE, get.distance=TRUE, num.threads=1, subset=NULL, transposed=FALSE, ..., BNPARAM=NULL) {
standardGeneric("queryNeighbors")
})

#' @export
setGeneric("bndistance", function(x) standardGeneric("bndistance"))
#' @rdname findDistance
setGeneric("findDistance", signature=c("X", "BNPARAM"), function(X, k, num.threads=1, subset=NULL, ..., BNPARAM=NULL) {
standardGeneric("findDistance")
})

# Generic purely for internal use, to help in defining other S4 methods.
setGeneric("spill_args", function(x) standardGeneric("spill_args"))
#' @export
#' @rdname queryDistance
setGeneric("queryDistance", signature=c("X", "BNPARAM"), function(X, query, k, num.threads=1, subset=NULL, transposed=FALSE, ..., BNPARAM=NULL) {
standardGeneric("queryDistance")
})
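
The comment in R/AllGenerics.R explains why the new generics dispatch on both `X` and `BNPARAM`. A minimal sketch of that pattern, using only the `methods` and `stats` packages, might look like the following. Every name here (`ToyParam`, `ToyExactParam`, `ToyExactIndex`, `toyFindKNN`) is hypothetical and chosen for illustration; none of it is BiocNeighbors code, and the brute-force search is a stand-in for a real algorithm.

```r
library(methods)

# Hypothetical parameter classes, loosely mirroring the BiocNeighborParam layout.
setClass("ToyParam", contains = "VIRTUAL", slots = c(distance = "character"))
setClass("ToyExactParam", contains = "ToyParam")

# Hypothetical prebuilt index stored as an ordinary R object,
# i.e., not an external pointer.
setClass("ToyExactIndex", slots = c(data = "matrix"))

# Dispatch on both the data/index argument and the parameter object.
setGeneric("toyFindKNN", signature = c("X", "BNPARAM"),
    function(X, k, ..., BNPARAM = NULL) standardGeneric("toyFindKNN"))

# Raw matrix plus parameters: build the matching index, then delegate.
setMethod("toyFindKNN", c("matrix", "ToyExactParam"),
    function(X, k, ..., BNPARAM = NULL) {
        toyFindKNN(new("ToyExactIndex", data = X), k = k, ...)
    })

# Prebuilt index: the class of X alone selects the search, so BNPARAM is
# ignored and can never be mismatched with the index.
setMethod("toyFindKNN", c("ToyExactIndex", "ANY"),
    function(X, k, ..., BNPARAM = NULL) {
        d <- as.matrix(dist(X@data))   # brute-force Euclidean distances
        diag(d) <- Inf                 # a point is not its own neighbor
        n <- nrow(d)
        out <- matrix(0L, n, k)
        for (i in seq_len(n)) {
            out[i, ] <- order(d[i, ])[seq_len(k)]
        }
        out
    })
```

Under this layout, the developer of a new index type only writes methods for their own `X` class. An `X` that no method recognizes fails at dispatch instead of reaching a `BNPARAM` method that does not know what to do with it.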
97 changes: 0 additions & 97 deletions R/AnnoyIndex-class.R

This file was deleted.
