This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
CUB 1.5.0
CUB 1.5.0 introduces segmented sort and reduction primitives.
New Features:
- Segmented variants of the device-wide sort and reduction primitives.
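To make "segmented" concrete, here is a minimal host-side reference sketch of what a segmented reduction and a segmented sort compute. It is not CUB code and does not run on the GPU. The function names are hypothetical, and the segment layout is an assumption for illustration: one offsets array with one more entry than there are segments, where segment `i` covers items `[offsets[i], offsets[i + 1])`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference only (hypothetical helper, not part of CUB).
// Segment s covers items [offsets[s], offsets[s + 1]) of `in`.
// Returns one sum per segment; an empty segment sums to 0.
std::vector<int> segmented_sum_reference(const std::vector<int>& in,
                                         const std::vector<std::size_t>& offsets) {
    assert(!offsets.empty() && offsets.back() <= in.size());
    std::vector<int> out(offsets.size() - 1, 0);
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        assert(offsets[s] <= offsets[s + 1]);
        for (std::size_t i = offsets[s]; i < offsets[s + 1]; ++i) {
            out[s] += in[i];
        }
    }
    return out;
}

// Host-side reference only (hypothetical helper, not part of CUB).
// Sorts each segment independently and in place; keys never move
// across a segment boundary.
void segmented_sort_reference(std::vector<int>& keys,
                              const std::vector<std::size_t>& offsets) {
    assert(!offsets.empty() && offsets.back() <= keys.size());
    for (std::size_t s = 0; s + 1 < offsets.size(); ++s) {
        assert(offsets[s] <= offsets[s + 1]);
        auto first = keys.begin() + static_cast<std::ptrdiff_t>(offsets[s]);
        auto last = keys.begin() + static_cast<std::ptrdiff_t>(offsets[s + 1]);
        std::sort(first, last);
    }
}
```

For example, with input `{8, 6, 7, 5, 3, 0, 9}` and offsets `{0, 3, 3, 7}` there are three segments, the middle one empty. The per-segment sums are `{21, 0, 17}`, and sorting each segment yields `{6, 7, 8, 0, 3, 5, 9}`.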
Bug Fixes:
- #36: `cub::ThreadLoad` generates compiler errors when loading from pointer-to-const.
- #29: `cub::DeviceRadixSort::SortKeys<bool>` yields compiler errors.
- #26: Misaligned address after `cub::DeviceRadixSort::SortKeys`.
- #25: Fix for incorrect results and crashes when radix sorting 0-length problems.
- Fix CUDA 7.5 issues on SM52 GPUs with SHFL-based warp-scan and warp-reduction on non-primitive data types (e.g. user-defined structs).
- Fix small radix sorting problems where 0 temporary bytes were required and user code called `malloc(0)` on systems where that returns `NULL`. CUB interpreted the resulting null pointer as another request for the temporary storage size and did not run the sort.
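The `malloc(0)` failure mode in the last item can be reproduced without a GPU. The sketch below uses a hypothetical stand-in, `two_phase_op`, for a routine that follows the two-phase temporary-storage convention (a null temporary-storage pointer means "report the required size and return"). It is not CUB's implementation and does not show how CUB fixed the issue; it only illustrates why a null pointer from a zero-byte allocation is indistinguishable from a size query, along with one caller-side guard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a two-phase routine (not a CUB function).
// A null `temp` is treated as a size query: the required size is written
// to `temp_bytes` and no work is done. Returns true only if the work ran.
bool two_phase_op(void* temp, std::size_t& temp_bytes, std::size_t required_bytes) {
    if (temp == nullptr) {
        temp_bytes = required_bytes;  // size query only
        return false;
    }
    assert(temp_bytes >= required_bytes);
    return true;  // the real work would happen here
}

// Caller-side guard (an assumption for illustration): never allocate fewer
// than one byte, so the pointer is non-null whatever malloc(0) would return.
bool run_with_guard(std::size_t required_bytes) {
    std::size_t temp_bytes = 0;
    two_phase_op(nullptr, temp_bytes, required_bytes);  // phase 1: query size

    void* temp = std::malloc(temp_bytes > 0 ? temp_bytes : 1);
    if (temp == nullptr) {
        return false;  // genuine allocation failure
    }

    bool ran = two_phase_op(temp, temp_bytes, required_bytes);  // phase 2: run
    std::free(temp);
    return ran;
}
```

Without the guard, a caller that passes the result of `malloc(0)` straight into the second call would, on a system where that result is `NULL`, trigger a second size query and the work would silently not run.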