Texture channel data is packed contiguously, so use PacketOps for texture gathers (and scatters), which yields substantial speed-ups, particularly on the LLVM backend. The benefit is largest for half-precision textures: because there are no FP16 SIMD gather instructions, the compiled kernels become bloated and run slower than their single-precision counterparts. PacketOps circumvents this issue by avoiding gather intrinsics entirely.
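
The access pattern described above can be illustrated with a minimal pure-Python sketch. It is not the code from this commit and does not use the PacketOps API; the function names `gather_per_channel` and `gather_packet` and the flat texel-major layout are assumptions made for illustration only. It contrasts issuing one strided gather per channel with reading each texel's channels as one contiguous packet and then transposing.

```python
# Hypothetical illustration only: names and layout are assumptions,
# not the actual PacketOps implementation.

def gather_per_channel(data, texel_idx, channels):
    """One strided gather per channel (`channels` separate gathers)."""
    return [[data[i * channels + c] for i in texel_idx]
            for c in range(channels)]


def gather_packet(data, texel_idx, channels):
    """One contiguous load of `channels` values per texel, then a transpose."""
    packets = [data[i * channels:(i + 1) * channels] for i in texel_idx]
    # Transpose from texel-major packets to channel-major output.
    return [list(column) for column in zip(*packets)]


# 3 texels with 4 channels each, stored contiguously per texel.
data = [float(v) for v in range(12)]
idx = [2, 0, 1]

# Both strategies return the same channel-major result.
assert gather_per_channel(data, idx, 4) == gather_packet(data, idx, 4)
```

In a vectorized backend, the contiguous per-texel read can map to ordinary loads plus shuffles rather than gather instructions, which is the property the commit message relies on for FP16 data.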