More convenient syntax for getting the size of a block #369

Closed
mtfishman opened this issue Mar 30, 2024 · 5 comments

mtfishman commented Mar 30, 2024

New syntax proposal for obtaining the size of a block

EDIT: TLDR: This proposal is to change the definition of blocksizes so that it acts like this:

julia> using BlockArrays

julia> a = BlockArray(randn(5, 5), [2, 3], [2, 3])
2×2-blocked 5×5 BlockMatrix{Float64}:
  0.916747   1.45141   │   0.740126    0.663479  0.17119 
 -0.563452   0.574973  │   0.618916   -1.43895   1.40107 
 ──────────────────────┼─────────────────────────────────
  0.621363   2.86333   │   0.0611178  -0.928277  1.48074 
  0.106673  -0.673614  │  -1.49219     0.158793  0.564433
  0.017256  -1.10824   │   0.768847    0.923508  0.362625

julia> blocksizes(a)[1, 2]
(2, 3)

julia> blocksizes(a)
2×2 BlockSizes{2, BlockMatrix{Float64, Matrix{Matrix{Float64}}, Tuple{BlockedUnitRange{Vector{Int64}}, BlockedUnitRange{Vector{Int64}}}}}:
 (2, 2)  (2, 3)
 (3, 2)  (3, 3)

i.e. it acts like a collection of size blocksize(a), and indexing into it outputs the size of the corresponding block.

The current definition gives:

julia> blocksizes(a)
([2, 3], [2, 3])

which is equivalent to:

julia> blocklengths.(axes(a))
([2, 3], [2, 3])

so uses of the current blocksizes definition could switch to that, which I think is clearer anyway. See also the discussion in #255.
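
To make the relationship between the two definitions concrete, here is a minimal, Base-only sketch of a lazy array whose entries are block sizes, built directly from the per-dimension block lengths (the tuple that the current blocksizes returns). The name LazyBlockSizes is hypothetical and used only for illustration; it is not part of BlockArrays.jl and is not necessarily how the proposal would be implemented there.

```julia
# Hypothetical, Base-only sketch: not the BlockArrays.jl implementation.
# `lengths` holds the block lengths along each dimension, e.g. ([2, 3], [2, 3]).
struct LazyBlockSizes{N} <: AbstractArray{NTuple{N,Int},N}
    lengths::NTuple{N,Vector{Int}}
end

# One entry per block, so the size is the number of blocks in each dimension.
Base.size(s::LazyBlockSizes) = map(length, s.lengths)

# The size of block (i, j, ...) is the tuple of the matching block lengths.
Base.getindex(s::LazyBlockSizes{N}, I::Vararg{Int,N}) where {N} =
    map(getindex, s.lengths, I)

s = LazyBlockSizes(([2, 3], [2, 3]))
s[1, 2]      # (2, 3)
collect(s)   # [(2, 2) (2, 3); (3, 2) (3, 3)]
```

Nothing is stored beyond the block lengths themselves, so indexing computes each size on demand.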

Summary of the current syntax for obtaining the size of a block

From what I can tell, these are the most compact ways of getting the size of a block right now, using public APIs:

julia> using BlockArrays

julia> a = BlockArray(randn(5, 5), [2, 3], [2, 3])
2×2-blocked 5×5 BlockMatrix{Float64}:
  0.916747   1.45141   │   0.740126    0.663479  0.17119 
 -0.563452   0.574973  │   0.618916   -1.43895   1.40107 
 ──────────────────────┼─────────────────────────────────
  0.621363   2.86333   │   0.0611178  -0.928277  1.48074 
  0.106673  -0.673614  │  -1.49219     0.158793  0.564433
  0.017256  -1.10824   │   0.768847    0.923508  0.362625

julia> size(view(a, Block(1, 2)))
(2, 3)

julia> size(@view(a[Block(1, 2)]))
(2, 3)

julia> getindex.(blocksizes(a), Int.(Tuple(Block(1, 2))))
(2, 3)

julia> getindex.(blocksizes(a), Int.((Block(1), Block(2))))
(2, 3)

julia> getindex.(blocksizes(a), (1, 2))
(2, 3)

julia> length.(getindex.(axes(a), Tuple(Block(1, 2))))
(2, 3)

julia> getindex.(blocklengths.(axes(a)), Int.(Tuple(Block(1, 2))))
(2, 3)

(please correct me if I am wrong).

I think it would be nice to have something more convenient. The best I can come up with is overloading Base.size(a::AbstractArray, b::Block), for example:

julia> Base.size(a::AbstractArray{<:Any,N}, b::Block{N}) where {N} = size(@view(a[b]))

julia> size(a, Block(1, 2))
(2, 3)

It feels like a slight abuse of Base.size, but seems along the same lines as being able to ask for the size in a certain dimension with size(a, 1).

I believe this would make sense for getting the axes of a block as well, i.e. axes(a, Block(1, 2)). However, there is some ambiguity there about whether that is meant to be a slice of the axes of a, i.e.:

julia> getindex.(axes(a), Tuple(Block(1, 2)))
(1:2, 3:5)

or the axes of the @view(a[Block(1, 2)]), i.e.:

julia> only.(axes.(getindex.(axes(a), Tuple(Block(1, 2)))))
(Base.OneTo(2), Base.OneTo(3))
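
The two readings can also be spelled out without BlockArrays, starting only from the block lengths. This is a rough Base-only sketch; blockrange is a hypothetical helper introduced here for illustration, not a BlockArrays.jl function.

```julia
# Block lengths along each dimension of the 5×5 example above.
lengths = ([2, 3], [2, 3])

# Hypothetical helper: the range of parent indices covered by block k,
# given the block lengths l along one dimension.
blockrange(l, k) = (sum(l[1:k-1]) + 1):sum(l[1:k])

# Reading 1: a slice of the parent's axes (indices into `a` itself).
global_axes = map(blockrange, lengths, (1, 2))

# Reading 2: the axes of the block viewed as its own array (restarting at 1).
local_axes = map(r -> Base.OneTo(length(r)), global_axes)
```

Both carry the same size information, (2, 3); they differ only in whether the indices refer to the parent array or to the block on its own.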

It is a bit unfortunate that blocksize is already taken and has a different meaning. I understand why that choice was made, but I found it confusing at first; my initial thought was that it should be a way to get the size of a block.


jishnub commented Mar 30, 2024

There's also blocksizes which returns the sizes of all blocks, so indexing into it would return the size of a specific block.

However, I agree with you in general that something like this would have been convenient.


mtfishman commented Mar 30, 2024

Oops, you're right, I forgot about blocksizes. I'll add that to the first post.

However, from what I can tell the most compact way to get the size of a block using blocksizes is:

julia> blocksizes(a)
([2, 3], [2, 3])

julia> getindex.(blocksizes(a), Int.(Tuple(Block(1, 2))))
(2, 3)

(say if you have a Block{N}) which is still pretty long and requires thought to get it right.


mtfishman commented Mar 30, 2024

An alternative to size(a::AbstractArray, b::Block) could be an iterator object, such as:

julia> struct BlockSizes{N,A<:AbstractArray{<:Any,N}} <: AbstractArray{NTuple{N,Int},N}
         array::A
       end

julia> Base.getindex(a::BlockSizes, index::Integer...) = size(@view(a.array[Block(index)]))

julia> Base.size(a::BlockSizes) = blocksize(a.array)

julia> BlockSizes(a)[1, 2]
(2, 3)

EDIT: In the end, I think this is my favorite one, because of the analogy with the current blocklengths(a::AbstractUnitRange) function. It is too bad that blocksizes is already used and has an alternative definition. If blocksizes(a::AbstractArray) was redefined to output BlockSizes(a), the interface could be:

julia> BlockArrays.blocksizes(a::AbstractArray) = BlockSizes(a)

julia> blocksizes(a)[1, 2]
(2, 3)

julia> blocksizes(a)
2×2 BlockSizes{2, BlockMatrix{Float64, Matrix{Matrix{Float64}}, Tuple{BlockedUnitRange{Vector{Int64}}, BlockedUnitRange{Vector{Int64}}}}}:
 (2, 2)  (2, 3)
 (3, 2)  (3, 3)

which has a nice correspondence with calling blocklengths on the axes and can be thought of as a shorthand for that:

julia> getindex.(blocklengths.(axes(a)), (1, 2))
(2, 3)

I would favor dropping the current blocksizes(a::AbstractArray) definition, which I think is a bit strange, in favor of this new one. An alternative could be to just use BlockSizes(a) as the API, or define a new function such as eachblocksize(a) = BlockSizes(a) as an API. eachblocksize seems reasonable since it can be seen as an extension of eachblock, and an iterator version of size.(eachblock(a)):

julia> size.(eachblock(a))
2×2 Matrix{Tuple{Int64, Int64}}:
 (2, 2)  (2, 3)
 (3, 2)  (3, 3)

A more generic blocklengths(a::AbstractArray) could be designed in a similar way:

julia> struct BlockLengths{N,A<:AbstractArray{<:Any,N}} <: AbstractArray{Int,N}
         array::A
       end

julia> Base.getindex(a::BlockLengths, index::Integer...) = length(@view(a.array[Block(index)]))

julia> Base.size(a::BlockLengths) = blocksize(a.array)

julia> BlockArrays.blocklengths(a::AbstractArray) = BlockLengths(a)

julia> blocklengths(a)[1, 2]
6

julia> blocklengths(a)
2×2 BlockLengths{2, BlockMatrix{Float64, Matrix{Matrix{Float64}}, Tuple{BlockedUnitRange{Vector{Int64}}, BlockedUnitRange{Vector{Int64}}}}}:
 4  6
 6  9

with an alternative syntax eachblocklength, and additionally blockaxeses:

julia> struct BlockAxeses{N,A<:AbstractArray{<:Any,N}} <: AbstractArray{Tuple,N}
         array::A
       end

julia> blockaxeses(a::AbstractArray) = BlockAxeses(a)
blockaxeses (generic function with 1 method)

julia> Base.getindex(a::BlockAxeses, index::Integer...) = axes(@view(a.array[Block(index)]))

julia> Base.size(a::BlockAxeses) = blocksize(a.array)

julia> blockaxeses(a)[1, 2]
(Base.OneTo(2), Base.OneTo(3))

julia> blockaxeses(a)
2×2 BlockAxeses{2, BlockMatrix{Float64, Matrix{Matrix{Float64}}, Tuple{BlockedUnitRange{Vector{Int64}}, BlockedUnitRange{Vector{Int64}}}}}:
 (Base.OneTo(2), Base.OneTo(2))  (Base.OneTo(2), Base.OneTo(3))
 (Base.OneTo(3), Base.OneTo(2))  (Base.OneTo(3), Base.OneTo(3))

blockaxeses is clearly an unfortunate name, but it is needed to disambiguate from blockaxes. Apparently the plural of axes is axes, so there isn't an obvious choice for that (mwaskom/seaborn#599 (comment)); axess could be an alternative. If we went with eachblocksize instead of blocksizes to avoid conflict with the current blocksizes function, it could be eachblockaxes.


mtfishman commented Mar 30, 2024

Another idea would be to define a generic function viewsize(a::AbstractArray, indices...):

julia> viewsize(a::AbstractArray, indices...) = size(view(a, indices...))
viewsize (generic function with 1 method)

julia> viewsize(a, Block(1, 2))
(2, 3)

though maybe that isn't much better than size(view(a, Block(1, 2))). A better name might be subsize since in principle it doesn't have to do with a view, even if it is implemented that way.

EDIT: Some other ideas are blockedsize(a::AbstractArray, index...) or getblocksize(a::AbstractArray, index...), i.e.:

julia> blockedsize(a::AbstractArray, index...) = size(view(a, Block(index)))
blockedsize (generic function with 1 method)

julia> blockedsize(a, 1, 2)
(2, 3)

julia> getblocksize(a::AbstractArray, index...) = size(view(a, Block(index)))
getblocksize (generic function with 1 method)

julia> getblocksize(a, 1, 2)
(2, 3)

But I think I like best the suggestion from the previous post: redefining blocksizes(a::AbstractArray) based on an iterator BlockSizes(a::AbstractArray) that iterates over the size of each block. I think that is a compelling generalization of the existing blocklengths(a::AbstractUnitRange) function.

This was referenced May 7, 2024
mtfishman commented

Closed by #399.
