This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
CUB 0.9.3
Summary
CUB 0.9.3 adds histogram algorithms and work management utility descriptors.
New Features
- cub::DeviceHistogram256.
- cub::BlockHistogram256.
- cub::BlockScan algorithm variant BLOCK_SCAN_RAKING_MEMOIZE, which trades more register consumption for less shared memory I/O.
- cub::GridQueue and cub::GridEvenShare, work management utility descriptors.
Other Enhancements
- Updated cub::BlockRadixRank to use cub::BlockScan, which improves performance on SM3x by using SHFL.
- Allow types other than builtin types to be used in cub::WarpScan::*Sum methods if they only have operator+ overloaded. Previously, they were also required to support assignment from int(0).
- Updated cub::BlockReduce's BLOCK_REDUCE_WARP_REDUCTIONS algorithm to work even when the block size is not an even multiple of the warp size.
- Refactored the cub::DeviceAllocator interface and the cub::CachingDeviceAllocator implementation.
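
The relaxed type requirement for the cub::WarpScan::*Sum methods can be illustrated on the host. The following is a minimal sequential sketch in plain C++, not CUB's warp-level implementation. The Pair type and the InclusiveSum helper are hypothetical names introduced only for this example. It shows how a prefix sum can be seeded from the first element, so that the element type needs operator+ but no assignment from int(0).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user-defined type (not part of CUB). It overloads operator+
// but has no default constructor and no constructor taking an int, so
// `Pair p = 0;` would not compile.
struct Pair {
    int   count;
    float weight;

    Pair(int c, float w) : count(c), weight(w) {}

    Pair operator+(const Pair& other) const {
        return Pair(count + other.count, weight + other.weight);
    }
};

// Hypothetical helper (not a CUB function): a sequential inclusive prefix sum.
// The running total is seeded from the first input element instead of T(0),
// so operator+ (plus copying) is the only requirement placed on T.
template <typename T>
std::vector<T> InclusiveSum(const std::vector<T>& input) {
    std::vector<T> output;
    output.reserve(input.size());
    if (input.empty()) {
        return output;
    }

    T running = input[0];  // seed from the data, not from a zero value
    output.push_back(running);
    for (std::size_t i = 1; i < input.size(); ++i) {
        running = running + input[i];
        output.push_back(running);
    }
    return output;
}
```

For example, calling InclusiveSum on the sequence Pair(1, 0.5f), Pair(2, 1.5f), Pair(3, 2.0f) produces running counts of 1, 3, and 6. A version that initialized its accumulator with T(0) would fail to compile for Pair, which is the restriction this release lifts for the *Sum methods.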