Add PacketOps support for textures
Texture channel data is packed contiguously, so use PacketOps for gathers (and scatters), which yields substantial speed-ups (particularly on the LLVM backend).

This is especially true for half-precision textures, where the absence of FP16 SIMD gather instructions results in bloated compiled kernels and a degradation in performance relative to the single-precision counterpart. PacketOps circumvents this issue by avoiding the use of gather intrinsics entirely.
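
The access pattern described above can be sketched in plain Python. This is only an illustration of the idea, not the code in this commit and not the Dr.Jit API: the function names `gather_per_channel` and `gather_packet` and the flat list standing in for a texture buffer are all hypothetical. It shows that, because the channels of a texel sit next to each other in memory, one contiguous load per texel followed by a transpose gives the same result as a separate indexed read for every channel.

```python
# Illustrative sketch only: a flat Python list stands in for a texture buffer
# with interleaved channels. None of these names come from Dr.Jit.

def gather_per_channel(data, texel_indices, channels):
    """One indexed read per (texel, channel) pair, grouped by channel."""
    out = []
    for c in range(channels):
        out.append([data[i * channels + c] for i in texel_indices])
    return out  # out[c][k] is channel c of the k-th requested texel


def gather_packet(data, texel_indices, channels):
    """One contiguous load per texel, then a transpose to per-channel layout."""
    packets = [data[i * channels:(i + 1) * channels] for i in texel_indices]
    return [list(column) for column in zip(*packets)]


channels = 4                      # e.g. RGBA
texels = 3
data = [float(v) for v in range(texels * channels)]
idx = [2, 0]                      # texels to fetch

per_channel = gather_per_channel(data, idx, channels)
packet = gather_packet(data, idx, channels)
```

Both functions return the same per-channel lists; the packet variant simply replaces `channels` scattered reads per texel with a single contiguous one, which is the property the commit message relies on.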
rtabbara authored and njroussel committed Feb 6, 2025
1 parent b656011 commit a1e4cf5
Showing 2 changed files with 289 additions and 141 deletions.