Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

partial sort functionality? #1

Open
jacob-roth opened this issue Dec 14, 2023 · 7 comments
Open

partial sort functionality? #1

jacob-roth opened this issue Dec 14, 2023 · 7 comments

Comments

@jacob-roth
Copy link

jacob-roth commented Dec 14, 2023

I think this project is really cool! I have an application where I need to perform partial sorting (per here and here) and was trying to adapt your code to handle this case. Would you have any suggestions on where to start? I naively tried to modify ssortperm!! to accept lo and hi based on the desired partial sort indices, e.g.,

function partialssortperm!!(ix::AbstractVector{Int}, v::AbstractVector, idxrange::UnitRange{Int}; lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)
    axes(ix) == axes(v) || throw(ArgumentError("index array must have the same size/axes as the source array, $(axes(ix)) != $(axes(v))"))
    ix .= LinearIndices(v)
    vs = StructArray{Tuple{eltype(v),eltype(ix)}}(val=v, ix=ix)
    o = ord(lt, by, rev ? true : nothing, order)
    lo = idxrange[1] # <- new
    hi = idxrange[end] # <- new
    _sortperm_inplace_small_optimization!(ix, v, vs, lo, hi, o::Ordering)
    ix
end

but I failed to get a correct implementation.

Do you think there's an easy fix? Or do you think it involve digging into the internals of _sortperm_inplace_small_optimization!?

Thanks!
Jake

@adienes
Copy link

adienes commented Jan 27, 2024

I would also be interested in a (very fast) partialsortperm

@LSchwerdt
Copy link
Owner

Sorry for the late answer. Due to a new job and a new baby I have other priorities right now.

The basic inner function _sortperm!!(ix, v, vs, lo, hi, o::Ordering, offsets_l, offsets_r) already accepts lo and hi indices for partial sorting:

function _sortperm!!(ix, v, vs, lo, hi, o::Ordering, offsets_l, offsets_r)

I think the easiest option would be trying to call that from your function at first. This way, you could get the core functionality working and skip dealing with all the code that deals with optimizations for special types.

@adienes
Copy link

adienes commented Feb 3, 2024

fantastic, thanks for the pointer. congratulations on the baby and job!

@jacob-roth
Copy link
Author

jacob-roth commented Feb 23, 2024

Thanks, and congrats!! Is there an easy fix for my below M(not)WE for attempting to sort the first 10 elements of a vector containing 100 elements? Not sure how to use offsets in the following (or if I'm making another mistake as well):

julia> using SimultaneousSortperm
julia> n=100
julia> k=10
julia> x=randn(n);
julia> xsort=similar(x);
julia> ix=collect(1:n);
julia> SimultaneousSortperm._sortperm!!(ix, x, xsort, 1, k, SimultaneousSortperm.Forward, [1], [n])
ERROR: BoundsError: attempt to access Float64 at index [2]

@jacob-roth
Copy link
Author

jacob-roth commented Mar 24, 2024

Hi @adienes, have you managed to use the SimultaneousSortperm._sortperm!! function in a MWE like what I tried above? I would like to obtain a partial sort indices corresponding to the smallest 5 elements of a 10 element array. I've tried the following for hi=5 which doesn't seem to do what I want (but instead sorts the first 5 elements of the original array):

Random.seed!(12345)
n = 10
v0 = randn(n)
ix0 = collect(1:n)
vs0 = StructArray{Tuple{eltype(v),eltype(ix)}}(val=v0, ix=ix0)

lo = 1
hi = 5
o = SimultaneousSortperm.Forward

v = zeros(n)
ix = zeros(Int, n)
v .= v0
ix .= ix0
vs = StructArray{Tuple{eltype(v),eltype(ix)}}(val=v, ix=ix)

offsets_l = zeros(Int, hi - lo + 1) # Adjust the size as needed
offsets_r = zeros(Int, hi - lo + 1) # Adjust the size as needed

SimultaneousSortperm._sortperm!!(ix, v, vs, lo, hi, o, offsets_l, offsets_r)

julia> hcat(vs, vs0)
10×2 StructArray(::Matrix{Float64}, ::Matrix{Int64}) with eltype Tuple{Float64, Int64}:
 (-0.308742, 1)  (-0.308742, 1)
 (-0.106916, 5)  (0.122478, 2)
 (0.122478, 2)   (1.19057, 3)
 (0.133261, 4)   (0.133261, 4)
 (1.19057, 3)    (-0.106916, 5)
 (0.724831, 6)   (0.724831, 6)
 (3.23096, 7)    (3.23096, 7)
 (-0.204223, 8)  (-0.204223, 8)
 (0.244815, 9)   (0.244815, 9)
 (-0.67478, 10)  (-0.67478, 10)

but rerunning with hi=10 provides a check that the calling sequence isn't totally wrong since it performs a full sort here:

lo = 1
hi = 10
o = SimultaneousSortperm.Forward

v = zeros(n)
ix = zeros(Int, n)
v .= v0
ix .= ix0
vs = StructArray{Tuple{eltype(v),eltype(ix)}}(val=v, ix=ix)

offsets_l = zeros(Int, hi - lo + 1) # Adjust the size as needed
offsets_r = zeros(Int, hi - lo + 1) # Adjust the size as needed

SimultaneousSortperm._sortperm!!(ix, v, vs, lo, hi, o, offsets_l, offsets_r)

julia> hcat(vs, vs0)
10×2 StructArray(::Matrix{Float64}, ::Matrix{Int64}) with eltype Tuple{Float64, Int64}:
 (-0.67478, 10)  (-0.308742, 1)
 (-0.308742, 1)  (0.122478, 2)
 (-0.204223, 8)  (1.19057, 3)
 (-0.106916, 5)  (0.133261, 4)
 (0.122478, 2)   (-0.106916, 5)
 (0.133261, 4)   (0.724831, 6)
 (0.244815, 9)   (3.23096, 7)
 (0.724831, 6)   (-0.204223, 8)
 (1.19057, 3)    (0.244815, 9)
 (3.23096, 7)    (-0.67478, 10)

@adienes
Copy link

adienes commented Mar 25, 2024

for hi=5 which doesn't seem to do what I want (but instead sorts the first 5 elements of the original array):

yes, I ran into the same conclusion; that it sorts slices of the input rather than a partialsort. I was not able to put in much time to try to find alternative approaches

@jacob-roth
Copy link
Author

The basic inner function _sortperm!!(ix, v, vs, lo, hi, o::Ordering, offsets_l, offsets_r) already accepts lo and hi indices for partial sorting:

function _sortperm!!(ix, v, vs, lo, hi, o::Ordering, offsets_l, offsets_r)

@LSchwerdt: Would you have a MWE on hand (or easily accessible) for demonstrating how to ensure at least $k$ elements of an $n$-array are sorted ($k &lt; n$)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants