This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

CUB 1.5.2

@brycelelbach released this 19 May 08:37

Summary

CUB 1.5.2 enhances cub::CachingDeviceAllocator and improves scan performance for SM5x (Maxwell).

Enhancements

  • Improved medium-size scan performance on SM5x (Maxwell).
  • Refactored cub::CachingDeviceAllocator:
    • Now spends less time locked.
    • Uses C++11's std::mutex when available.
    • If allocating a block from the runtime fails, the allocator now frees its cached allocations and retries once.
    • Now respects max-bin, fixing an issue where blocks in excess of max-bin were still being retained in the free cache.
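
The allocator changes listed above can be illustrated with a small host-only sketch. This is not CUB's implementation: FakeRuntime and SketchCachingAllocator are hypothetical names invented for illustration, and the fixed-capacity FakeRuntime stands in for the CUDA runtime so that the example runs without a GPU. It assumes geometric bins of growth^bin bytes. It shows three of the behaviors: the lock is held only around cache bookkeeping, a failed runtime allocation is retried once after the free cache is released, and blocks larger than the max-bin size are returned to the runtime instead of being cached.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>

// Hypothetical stand-in for the CUDA runtime: a "device" with fixed capacity.
class FakeRuntime {
public:
    explicit FakeRuntime(std::size_t capacity) : capacity_(capacity) {}
    // Returns false when the request does not fit (models an out-of-memory error).
    bool allocate(std::size_t bytes, int& handle) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (in_use_ + bytes > capacity_) return false;
        in_use_ += bytes;
        handle = next_handle_++;
        live_[handle] = bytes;
        return true;
    }
    void release(int handle) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = live_.find(handle);
        assert(it != live_.end());
        in_use_ -= it->second;
        live_.erase(it);
    }
    std::size_t in_use() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return in_use_;
    }
private:
    mutable std::mutex mutex_;
    std::size_t capacity_;
    std::size_t in_use_ = 0;
    int next_handle_ = 1;
    std::map<int, std::size_t> live_;  // handle -> bytes
};

// Hypothetical caching allocator sketch; bin b holds blocks of growth^b bytes.
class SketchCachingAllocator {
public:
    SketchCachingAllocator(FakeRuntime& runtime, unsigned growth,
                           unsigned min_bin, unsigned max_bin)
        : runtime_(runtime), growth_(growth), min_bin_(min_bin), max_bin_(max_bin) {}

    bool allocate(std::size_t bytes, int& handle) {
        Block block = classify(bytes);
        if (block.bin <= max_bin_) {
            // Hold the lock only while touching the cache bookkeeping.
            std::lock_guard<std::mutex> guard(mutex_);
            auto it = cache_.find(block.bin);
            if (it != cache_.end()) {
                handle = it->second;
                cache_.erase(it);
                live_[handle] = block;
                return true;
            }
        }
        // Runtime calls happen outside the allocator's lock.
        if (!runtime_.allocate(block.bytes, handle)) {
            free_all_cached();  // release every cached block...
            if (!runtime_.allocate(block.bytes, handle)) return false;  // ...and retry once
        }
        std::lock_guard<std::mutex> guard(mutex_);
        live_[handle] = block;
        return true;
    }

    void free(int handle) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            auto it = live_.find(handle);
            assert(it != live_.end());
            Block block = it->second;
            live_.erase(it);
            if (block.bin <= max_bin_) {  // only binned blocks are retained
                cache_.emplace(block.bin, handle);
                return;
            }
        }
        runtime_.release(handle);  // blocks beyond max-bin go straight back
    }

    void free_all_cached() {
        std::multimap<unsigned, int> drained;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            drained.swap(cache_);
        }
        for (const auto& entry : drained) runtime_.release(entry.second);
    }

    std::size_t cached_blocks() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return cache_.size();
    }

private:
    struct Block { unsigned bin; std::size_t bytes; };

    // Rounds a request up to its bin size; requests beyond max-bin keep their exact size.
    Block classify(std::size_t bytes) const {
        unsigned bin = min_bin_;
        std::size_t bin_bytes = 1;
        for (unsigned i = 0; i < min_bin_; ++i) bin_bytes *= growth_;
        while (bin_bytes < bytes && bin <= max_bin_) { bin_bytes *= growth_; ++bin; }
        if (bin > max_bin_) return {bin, bytes};
        return {bin, bin_bytes};
    }

    FakeRuntime& runtime_;
    unsigned growth_, min_bin_, max_bin_;
    mutable std::mutex mutex_;
    std::multimap<unsigned, int> cache_;  // bin -> cached handle
    std::map<int, Block> live_;           // handle -> block in use
};
```

With a 100-byte FakeRuntime and bins from 2^3 to 2^5 bytes, a freed 20-byte request stays cached as a 32-byte block, a freed 64-byte request is released immediately because it exceeds the largest bin, and a later 80-byte request succeeds only because the cached block is released before the single retry.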

Bug Fixes

  • Fix for generic-type reduce-by-key cub::WarpScan for SM3x and newer GPUs.