24 Jan 18:27

mr-c

12069d7

SIMDe 0.7.2

Summary

Post v0.7.0 fixes; more portable implementations of NEON intrinsics

Details

common: fix SIMDE_FLOAT64_C macro when SIMDE_FLOAT64_TYPE is defined 1d28a5d @rosbif
complex: split complex math out into separate header 0678336 @nemequ
diagnostic: silence a few -Weverything diagnostics on clang < 5 6f8d285 @nemequ

Implementation of NEON intrinsics:

neon/ceq: implement vceq{s_f32,d_f64} f4f42dc @nemequ
neon/abd: trivial formatting fix 0b8c8ca @nemequ
neon/abd: add missing scalar functions 517a613 @nemequ
neon/abs: add vabsd_s64 4091e3e @nemequ
neon/abs: vabsd_s64 wasn't added to GCC until 9.1.0 52051cb @nemequ
neon/add: implement vaddd_s64 and vaddd_u64 03d4d1b @nemequ
neon/cagt: implement vcagt{s_f32,d_f64} 731cf71 @nemequ
neon/c{ge,gt,le,lt}: some improved 64-bit comparisons 97f4dfb @nemequ
neon/ext: work around bug in GCC prior to 9.0 0c29a5f @nemequ
neon/padd: vpadd_f32 was buggy in older clang versions 623cbf7 @nemequ
neon/rnd: add NaN and ties to test suite fa950a2 @nemequ
neon/rndm: initial implementation 5bf93ad @nemequ
neon/rndn: initial implementation 2c624b5 @nemequ
neon/rndp: initial implementation 7f1f499 @nemequ
neon/uqadd: clang prior to 9 used incorrect types for the scalar funcs fa0eca0 @nemequ
neon/uzp1,neon/uzp2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
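
For readers unfamiliar with the three rounding families added above, the scalar semantics can be sketched as follows. This is an illustration only, not SIMDe's implementation: the function names are hypothetical, the code avoids libm on purpose, and it does not preserve the sign of zero. vrndm rounds toward minus infinity, vrndp toward plus infinity, and vrndn to nearest with ties going to the even neighbour.

```c
#include <stdint.h>

/* Hypothetical helper: floats with magnitude >= 2^23, infinities, and NaN
 * have no fractional bits, so they are returned unchanged by the callers. */
static int has_fraction_bits(float x) {
  return x > -8388608.0f && x < 8388608.0f;
}

/* vrndm-style: round toward minus infinity (floor). */
static float rndm_f32(float x) {
  if (!has_fraction_bits(x)) return x;
  float t = (float) (int32_t) x;          /* truncates toward zero */
  return (t > x) ? t - 1.0f : t;
}

/* vrndp-style: round toward plus infinity (ceil). */
static float rndp_f32(float x) {
  if (!has_fraction_bits(x)) return x;
  float t = (float) (int32_t) x;
  return (t < x) ? t + 1.0f : t;
}

/* vrndn-style: round to nearest, ties to even. */
static float rndn_f32(float x) {
  if (!has_fraction_bits(x)) return x;
  float lo = rndm_f32(x);
  float frac = x - lo;                    /* distance above the lower neighbour */
  if (frac > 0.5f) return lo + 1.0f;
  if (frac < 0.5f) return lo;
  return (((int32_t) lo) & 1) ? lo + 1.0f : lo;  /* tie: pick the even one */
}
```

With these definitions 2.5 and -2.5 round to 2 and -2 under the ties-to-even rule, while 3.5 rounds to 4, which is the kind of case the "NaN and ties" tests are meant to cover.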

x86 intrinsics

SSE*

sse: fix overflow handling for simde_mm_cvt_ss2si a4658d8 @mr-c
sse: add SIMDE_MM_{GET,SET}_FLUSH_ZERO_MODE 340bf13 @nemequ
sse, sse2: add range checks to several conversion functions c3d7abf @nemequ
sse2: update test for simde_mm_set1_epi32 8854ede @nemequ
sse2: fix armv7 NEON implementation for simde_mm_shufflehi_epi16 338dac0 @nemequ
sse2: change some dependencies from SSE to SSE2 c00a0e5 @rosbif
sse2: fix potentially unused variable in loadu functions f43bfed @nemequ
sse2: use void* for destinations of loadu functions 98c63ae @nemequ
sse4.1: check for SHUFFLE_VECTOR before using it in _mm_cvtepu32_epi64 cb73aec @nemequ
sse4.2: some improved 64-bit comparisons 97f4dfb @nemequ
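
As a rough illustration of what a range check on a float-to-integer conversion involves, here is a minimal sketch modeled on the truncating x86 conversion, which yields 0x80000000 (INT32_MIN) for NaN and out-of-range inputs. The function name is hypothetical and this is not the code from the commits above.

```c
#include <stdint.h>

/* Hypothetical helper: truncating float -> int32 conversion with an explicit
 * range check. Without the check, converting an out-of-range value is
 * undefined behaviour in C. */
static int32_t cvtt_f32_to_i32_checked(float v) {
  /* 2147483648.0f is exactly 2^31; NaN fails both comparisons. */
  if (!(v >= -2147483648.0f && v < 2147483648.0f)) {
    return INT32_MIN;  /* the x86 "integer indefinite" value */
  }
  return (int32_t) v;  /* in range: truncates toward zero */
}
```

The negated comparison is deliberate: it routes NaN to the fallback value without needing a separate NaN test.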

AVX

avx: use void* for destinations of loadu functions 98c63ae @nemequ

AVX512

permutex2var: fix some signed/unsigned mismatch warnings 951caa1 @nemequ
avx512/s{r,l}li: the imm8 parameters should be unsigned ecc388d @nemequ

XOP

xop: initial implementation 6cc0cef @nemequ
xop: add a bunch of NEON implementations b602fbc @nemequ
xop: fix NEON implementation of simde_mm_maccsd_epi16 8d499b5 @nemequ

Testing with Docker/Podman & CI

docker: add gdb and valgrind to installed packages 4500040 @nemequ
ci: move icc build from Travis to GitHub Actions 712f01a @nemequ
gh-actions: run on pull requests 43e7053 @mr-c
drone: re-organize drone builds 73fe36a @nemequ
drone: adjust branch triggers 9eba966 @nemequ
README: update CI information ca440ae @nemequ
circleci: add Circle CI 5d5350c @nemequ
circleci: actually build in 32-bit mode 4267926 @nemequ
cirrus: add Cirrus CI support 0212a07 @nemequ
cirrus: run asan/ubsan instead of just another GCC build a1c9f1d @nemequ
docker: allow for an optional persistent build directory 610fa3d @nemequ
gh-actions, semaphore: move GCC and clang builds to Semaphore 49d0d82 @nemequ
ci: disable ci/* builds for various providers 28f8775 @nemequ
travis: disable all builds 687851b @nemequ

Misc

cmake: don't explicitly list source files in the x86 directory 88c6f7e @nemequ
meson: link to libm if available 251bc0d @nemequ
simde-align: allow alignment > 8 on MSVC ≥ 19.16 (VS 2017) 0968271 @jsbache
README: fix a couple of outdated links 6001182 @nemequ


27 Dec 12:37

mr-c

v0.7.0

f68981d

SIMDe 0.7.0

Version 0.7.0 Summary

Portable implementation of the NEON intrinsics: 57% finished
Some more WASM implementations of x86 intrinsics
Various SSE*, AVX*, and SVML enhancements
Various new and improved implementations for AltiVec, NEON, and POWER architectures.
The "new" SSE2 _mm_{load,store}u_si{16,32,64} intrinsics are now implemented along with the SSE _MM_HINT_* defines.
All of the CLMUL intrinsics have been implemented (see the "CLMUL instruction set" article on Wikipedia and the CLMUL entries in the Intel Intrinsics Guide).

Please see the 0.7.0-rc-1 and 0.7.0-rc2 release notes for more details.

Changes since 0.7.0-rc2

Implementation of NEON intrinsics:

neon/orn: add AVX-512VL (ternarylogic) implementations d667aa8 @nemequ
neon/ld3, neon/ld4: disable -Wmaybe-uninitialized on GCC eaaa71f @nemequ

x86 intrinsics

SSE*

sse: cast _MM_HINT_* values to enum _mm_hint on GCC 3f7e6f7 @nemequ

AVX512

avx512/permutex2var: add remaining intrinsics and translations 5d8d9d2

Misc

math: add modf 580e401 @nemequ

Cleanups of SIMDE_BUG_* definitions e090746 @mr-c


22 Dec 11:20

mr-c

v0.7.0-rc2

76790a9

SIMDe v0.7.0-rc2 Pre-release

Summary

Two issues found in SIMDe v0.7.0-rc-1 while testing on Debian Experimental across the Debian release architectures (amd64, arm64, armel, armhf, i386, mips64el, mipsel, ppc64el, s390x) have been fixed.

Various new and improved implementations for AltiVec, NEON, and POWER architectures.

The "new" SSE2 _mm_{load,store}u_si{16,32,64} intrinsics are now implemented along with the SSE _MM_HINT_* defines.

All of the x86 CLMUL intrinsics have been implemented (see the "CLMUL instruction set" article on Wikipedia and the CLMUL entries in the Intel Intrinsics Guide).

Details

Implementation of NEON intrinsics:

neon/cnt: _vcntq_s8 & _vcntq_u8, add AltiVec implementations 1d56b8c @nemequ
neon/shr_n: _vshrq_n_s8, avoid shift-negative-value diagnostics 26aeda4 @rosbif
neon/bic: _vbicq_s8 & _vbicq_s64, correct PPC implementations 2779ba0 @rosbif
neon/ld3: disable -Wmaybe-uninitialized on GCC < 10 c97093f @nemequ
neon/ld3: load entire vectors sequentially 4097372 @nemequ
neon/bsl, neon/mvn: use ternary logic on AVX-512VL 1660b73 @nemequ
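
The "ternary logic" entries refer to the AVX-512 ternarylogic operation (for example `_mm_ternarylogic_epi32`), where an 8-bit immediate acts as a truth table over three inputs. A scalar sketch of the idea, with a hypothetical function name and no claim to match SIMDe's code:

```c
#include <stdint.h>

/* Hypothetical scalar model of a 32-bit ternary-logic operation.
 * For every bit position, the bits of a, b, and c form a 3-bit index
 * (a is the most significant), and the result bit is imm8[index]. */
static uint32_t ternarylogic_u32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm8) {
  uint32_t r = 0;
  for (int i = 0; i < 32; i++) {
    unsigned idx = (((a >> i) & 1u) << 2) | (((b >> i) & 1u) << 1) | ((c >> i) & 1u);
    r |= ((uint32_t) ((imm8 >> idx) & 1u)) << i;
  }
  return r;
}
```

Under this encoding, the immediate 0xCA selects `(a & b) | (~a & c)`, which is a bitwise select in the style of vbsl, and 0x55 yields `~c`, which covers a plain NOT in one operation.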

SVML

svml: add fallbacks on shorter functions to div/rem/hypot/erfc (#598) 9199002 @himanshi18037

x86 intrinsics

features: GFNI needs <immintrin.h> 80a2e3d @rosbif

SSE*

sse: correct POWER versions in _mm_cmpunord_ps, add POWER6 version. 2b851a5 @rosbif
sse: correct PPC P5 to P6 in _mm_store_ps f889439 @rosbif
sse: include _MM_HINT_* defines, test for _mm_prefetch 6b2a873 @mr-c @nemequ
sse: added NEON impl for _mm_shuffle_ps 1777224 @masterchef2209
sse: work around missing vrndiq_f32 on GCC on armv8 with NEON b56248b @nemequ

sse, sse2: use ternary logic on AVX-512VL for NOT functions 97ac0a5 @nemequ

sse2: fix rounding of _mm_cvtps_epi32 on POWER on clang 0e60b5f @nemequ
sse2: implement the new instructions _mm_{load,store}u_si{16,32,64} b7f467f @nemequ
sse2: added NEON impl for _mm_shuffle_epi32, _mm_shuffle{lo,hi}_epi16 8525eba _mm_mul_su32 5102af0 _mm_cvtsd_f64 6800867 @masterchef2209

sse4.1: regenerate _mm_dp_ps test to avoid rare rounding difference 8358e3c @nemequ

AVX / AVX2

Normalize SIMDE_NATURAL_VECTOR_SIZE usage 98213b3 @mr-c

AVX512

avx512/test: implement _mm512{,_mask}_test_epi{8,16,32,64}_mask ab6c230 @rosbif
avx512/kshift: implement _kshift[lr]i_mask{8,16,32,64} 6bf0dfd @rosbif
avx512/shuffle: implement _mm512_{,mask_,maskz_}shuffle_[fi]{32x4,64x2} e5352c3 @rosbif
Add defines for AVX512VBMI 11c88e2 @rosbif
avx512/permutexvar: add _mm512_{,mask_,maskz_}permutexvar_epi{8,16} _mm512_{,mask_,mask2_,maskz_}permutex2var_epi{8,16} intrinsics b341db7 35c0e5d @rosbif
avx512/permutexvar: many AVX, SSE, NEON, PPC, and WASM implementations c2aa66b @rosbif
avx512/permutexvar: add 128- and 256-bit intrinsics and translations 7ff4af6 @rosbif

CLMUL

All CLMUL intrinsics implemented including _mm_clmulepi64_si128 7ced766 @nemequ
don't use __builtin_shufflevector on XLC 52848ad @nemequ
remove ' && 0' which I accidentally left in place fedae0b @nemequ
work around MSVC warning-turned-error 91fe7f4 @mr-c
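
For context, a carry-less multiplication is a polynomial multiplication over GF(2): partial products are combined with XOR instead of addition. The sketch below shows a plain 64x64 to 128-bit version in scalar C. The function name is hypothetical and this is only an illustration of the operation, not the fallback used in SIMDe.

```c
#include <stdint.h>

/* Hypothetical scalar carry-less multiply: 64 x 64 -> 128 bits.
 * For every set bit i of b, a shifted left by i is XORed into the result. */
static void clmul_u64(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi) {
  uint64_t l = 0, h = 0;
  for (int i = 0; i < 64; i++) {
    if ((b >> i) & 1u) {
      l ^= a << i;
      if (i > 0) {
        h ^= a >> (64 - i);  /* bits shifted out of the low half */
      }
    }
  }
  *lo = l;
  *hi = h;
}
```

For example, 3 times 3 is 5 in carry-less arithmetic, because (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2.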

Testing with Docker/Podman & CI

docker: use an argument for selecting the release eaee500 @nemequ
docker: add crypto and CRC to GCC 10 cross file ca05a1f @nemequ
docker: replace clang-8 cross file with one for clang-11 c326808 @nemequ

Misc

meson: bump version to 0.7.0-rc.1 ed4d5a0 @mr-c
CONTRIBUTING: switch documentation from CMake to Meson. 15f0e24 @nemequ
drone: use Ubuntu instead of Fedora for AArch64 build c5945ca @nemequ
update icc package name for oneapi gold release 820f684 @rscohn2
Document minimum GCC version for -fopenmp-simd 01c7aeb @mr-c
GitHub Actions CI: adjust macOS versions ad6e881 @mr-c


21 Nov 14:14

mr-c

v0.7.0-rc-1

7a2f504

v0.7.0-rc-1 Pre-release

Summary

Portable implementation of the NEON intrinsics: 57% finished
Some more WASM implementations of x86 intrinsics
Various SSE*, AVX*, and SVML enhancements

Details

Implementation of NEON intrinsics:

neon/min: correctly handle (and test) NaNs 07d3a1f @nemequ
neon/zip1: add MMX/SSE, AltiVec, and shuffle vector implementations 56b9205 @nemequ
neon/zip2: add AltiVec, SSE, shuffle vector, etc. implementations f7f36e0 @nemequ
neon/uzp1, neon/uzp2: add AltiVec, SSE, shuffle, etc. implementations 7bcfd75 @nemequ
neon/shl: use SIMDE_POWER_ALTIVEC_BOOL instead of bool aadf0ff @nemequ
neon/addv: initial implementation 49681b6 @nemequ
neon/aba: initial implementation 22c27ec @nemequ
neon/abdl: initial implementation 84c2167 @nemequ
neon/addlv: initial implementation 6b17af2 @nemequ
neon/bic: initial implementation 76d755c @nemequ
neon/bic: add x86, WASM, and AltiVec implementations 9379e5c @nemequ
neon/cnt: initial implementation b15352c @nemequ
neon/hadd: initial implementation 5da4667 @nemequ
neon/hsub: initial implementation 19454d3 @nemequ
neon/maxv: initial implementation a5522ba @nemequ
neon/minv: initial implementation d241170 @nemequ
neon/mls: initial implementation 08a3957 @nemequ
neon/mlsl: initial implementation fd2d782 @nemequ
neon/mull_high: initial implementation c50c836 @nemequ
neon/mlsl_high: initial implementation 93276e0 @nemequ
neon/rbit: add GFNI implementations of vrbit functions fad5a93 @nemequ
neon/dup_lane: initial implementation 2a063f1 @nemequ
neon/orn: initial implementation d788736 @nemequ
neon/bic: fix search & replace error in license 6a1664c @nemequ
neon/qneg: initial implementation 93d6999 @nemequ
neon/maxnm: initial implementation 928834a @nemequ
neon/max: add NaN tests, fix implementations 0d69e18 @nemequ
neon/minv: fix NaN handling, add relevant tests 73044a5 @nemequ
neon/qadd: add scalar functions and the tests to go with them 25f398c @nemequ
neon/qabs: initial implementation fc38506 @nemequ
neon/qneg: add scalar functions and tests 1bf6283 @nemequ
neon/clz: initial implementation c8d74a5 @nemequ
neon/clz: add GFNI implementation of 8x8 functions 7fd22a9 @nemequ
neon/minnm: initial implementation fbd0fd0 @nemequ
neon/uzp1, neon/uzp2: add vuzp{,q}_* implementations for armv7 1d09549 @nemequ
neon/subw: initial implementation 0008eb3 @nemequ
neon/subw_high: initial implementation 4935cd4 @nemequ
neon/addw_high: initial implementation adf12f2 @nemequ
neon/uqadd: initial implementation 451136b @nemequ
neon/mul_lane: initial implementation 92e9df1 @nemequ
neon/mlsl_n: initial implementation 72497e7 @nemequ
neon/cls: initial implementation e6dde92 @nemequ
neon/qshl: initial implementation b266b2b @nemequ
neon/max: fix unsafe SSE2 implementation of vmaxq_f64 b45b259 @nemequ
neon/minnm, neon/maxnm: correct C&P errors in floating point functions 6958298 @rosbif
neon/shl_n, neon/shr_n: add GFNI-based 8-bit shifts 177e5e1 @nemequ
neon/movn_high: initial implementation 0e3e3fd @nemequ
neon/rnd: initial implementation 1bbc67e @nemequ
neon: fix detection of A32 functionality 8ff3a8f @nemequ
neon/mlal_n: initial implementation 7a2f504 @nemequ
neon/qsub: initial implementation 6db7032 @nemequ
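
Several of the entries above concern NaN handling in min/max, which differs between instruction sets. The following scalar sketch contrasts three behaviours for quiet NaNs; all names are hypothetical, signed zeros are ignored, and none of this is taken from SIMDe's sources.

```c
/* Hypothetical helpers for the demonstration. */
static float quiet_nan(void) { volatile float zero = 0.0f; return zero / zero; }
static int is_nan(float x) { return x != x; }

/* x86 MAXPS-style: computed as a > b ? a : b, so the second operand is
 * returned whenever either input is NaN. */
static float max_x86_style(float a, float b) {
  return (a > b) ? a : b;
}

/* NEON vmax-style: a NaN in either input propagates to the result. */
static float max_neon_style(float a, float b) {
  if (is_nan(a)) return a;
  if (is_nan(b)) return b;
  return (a > b) ? a : b;
}

/* NEON vmaxnm-style: a single quiet NaN is ignored in favour of the number. */
static float maxnm_style(float a, float b) {
  if (is_nan(a)) return b;
  if (is_nan(b)) return a;
  return (a > b) ? a : b;
}
```

A portable implementation that simply maps one of these onto another gives different results as soon as a NaN appears, which is why dedicated NaN tests are useful.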

SVML

svml: add shorter fallbacks for remaining functions 4400413 @nemequ
svml: GCC bug #53784 also occurs on s390x 5c2d66f @nemequ
svml: fix portable fallback for simde_x_mm512_deg2rad_{pd,ps} d33d0c7 @nemequ
svml: more work-arounds for GCC bug #53784 615ba1b @nemequ

x86 intrinsics

Fix compilation failures when targeting 32-bit x86 with >= SSE2 25b5fbc 82d0065 @nemequ
test/x86: add test_simde_mm{,256}_mask{,z}_xxx_epi{8,16} to skel f1c824f @ashnewmanjones
test/x86: add NaN test case generation functions to x86 d3384dd @nemequ
x86: add SIMDE_REQUIRE{,_CONSTANT}_RANGE macros to many functions 396a018 @ashnewmanjones

MMX

mmx: fix NEON implementation of _mm_srai_pi16 7c416cf @nemequ
mmx: work around some clang <= 11 bugs on POWER9 99c0b39 @nemequ

SSE*

sse/sse2/ssse3: more WASM implementations: _mm_srli_epi{16,32,64} _mm_srl_epi{32,64} 63e63ed _mm_cvt{epi32,si32,si64,si128}_* dd21f30 _mm_sra{,i}_epi{16,32} 3bd7ea9 mm_cmp{un}ord_ps ef06821 simde_mm_sign_epi{8,16,32} 55c5619 @masterchef2209
sse2: add WASM implementation of _mm_unpackhi_pd 4cd0b90 @zekehul
sse, neon/abs: _mm512_abs_ps was introduced in GCC 7.1 fb2a06f @milot-mirdita
sse2: simde_x_mm_abs_pd throws cast errors before GCC 7.4 f70e34c @milot-mirdita
sse2: fix NEON simde_mm_cmp_pd implementation 8bc8b12 @nemequ
sse, sse2: add several AltiVec, WASM, and NEON implementations 08db479 @nemequ
sse: add __builtin_nontemporal_store version of simde_mm_stream_ps 9a8001e @nemequ
sse2: rewrite the NEON implementation of simde_mm_sad_epu8 c520b2d @nemequ
sse2: improve simde_mm_madd_epi16 NEON & AltiVec implementations 55f703f @nemequ
sse4.1: add SSE2 and shuffle-based fallbacks for _mm_cvtepi*_epi* 197610c @nemequ
sse4.1: improve AArch64 _mm_dp_{ps,pd} implementations 3ebf82f @nemequ
sse: fix NaN handling for _mm_max_ps, update test case 15aa0c4 @nemequ
sse2: add shuffle-based implementation of _mm_mul_epu32 e2da067 @nemequ
sse2: improve NEON implementations of _mm_mulhi_ep{i,u}16 f7546c7 @nemequ
sse3: improve some NEON implementations 444cae1 @nemequ
ssse3: formatting fixes a560e2e @nemequ
ssse3: improve some NEON implementations 858d169 @nemequ
sse3: armv7 implementations of deinterleave functions fa158d1 @nemequ
sse3: improve NEON implementation of hadd/hsub functions d9e860e @nemequ
ssse3: many new or improved NEON implementations of pairwise functions 94b9c2f @nemequ
sse2: add missing mm_cmpngt_{pd,sd} 8a2d249 @ktgw0316
sse, sse2, sse4.1: fix ties-toward-even rounding 3208aeb @nemequ
sse4.1: better testing of _mm_round_ps b6a7310 @nemequ
sse: add simde_x_mm_round_ps with lax_rounding argument 24e5926 @nemequ

AVX

avx: require x86_64 for _mm256_insert_epi64 82d0065 @nemequ
avx: simplify some broadcast functions bbcba0a @nemequ
avx, avx512: add missing undef directives for native aliases bb944be @nemequ

AVX2

avx2: squash clang -Weverything warning in portable _mm256_movemask_epi8 f3de4d9 @nemequ
avx2: add NEON and 128-bit implementations of several shift functions 31fe86d @nemequ
avx2, avx512/madd: add non-vector fallbacks 90503ed @nemequ
avx2: add some fallbacks on 128-bit functions 080c2e6 @nemequ

AVX512

avx512: refactor AVX-512 implementations to be structured like NEON bc7bfdc @nemequ
avx512/add: implement simde_mm_mask{,z}_add_ss d4bb2ad @himanshi18037
avx512/add: _mm_mask{,z}_add_ss was not available in GCC until 8.1 4af1c3a @nemequ
avx512/broadcast: correct feature checks for several functions 17f11f7 @nemequ
avx512: correct many feature tests 344a666 @nemequ
gh-actions: add avx512 builds face9ad @nemequ
avx512/extract: work around ICE on GCC 6 249d926 @nemequ
avx512/s{l,r}li: use CONSTIFY macros on certain GCC versions 9ecf9f2 @nemequ
avx512/s{l,r}li: add missing native versions of _mm512_s{l,r}li_epi16 239d484 @nemequ
avx512/add: fix simde_mm_mask{,z}_add_ss 12a2b5c @nemequ
avx512/extract: work around GCC 6 ICE fffe70f @nemequ
test/avx512: fix function for writing mmask variables 8c806d3 @nemequ
avx512/srl: fix portable fallbacks ffb8515 @nemequ
avx512/fm*: fix typo in portable _mm512_fm*_{ps,pd} fallbacks 119de0b @nemequ
avx512/loadu: add remaining loadu functions and tests cfe173d @nemequ
avx512/mov_mask: implement simde_mm{,256}_movepi{8,16,32,64}_mask e54dde8 @nemequ
avx512/srlv: add simde_mm512_srlv_epi{32,64} e253dff @anrodrig
avx512/srlv: implement several srlv functions and tests d05d2eb @nemequ
avx512/blend: implement remaining blend functions 16d99c3 @nemequ
avx, avx512: add missing undef directives for native aliases bb944be @nemequ
avx512/fma: use fmaf instead of fma for 32-bit floats f578fd5 @nemequ
avx512/div: add 256-bit fallbacks abfb353 @nemequ
avx512bw: implement mm512_mask{,z}_unpackhi_epi{8,16} 0484698 @ashnewmanjones
avx512/avg: implement simde_mm_mask{,z}avg_epu{8,16} 542c52b @himanshi18037
avx512/setzero: add mm512_setzero_p{s,d} tests a26d3d1 @ashnewmanjones
avx512/set: add mm512_set{epi{8,16,32,64},pd} tests 305e134 @ashnewmanjones
avx512vp2intersect: initial implementation a67e1be @ashnewmanjones
avx512/madd: initial implementation e8882b9 @ashnewmanjones
avx2, avx512/madd: add non-vector fallbacks 90503ed @nemequ
avx512/maddubs: implement maddubs functions 42ca3bd @ashnewmanjones
avx512/sll: add simde_mm512_mask{,z}_sll_epi16 functions 26ac148 @ashnewmanjones
avx512/avg: implement remaining avg functions abf7bd2 @ashnewmanjones
avx512/abs: add fallbacks on shorter vectors c82542d @nemequ
avx512/abs: add NEON and AltiVec implementations b47f166 @nemequ

GFNI

gfni: lower requirements for some functions 5dba288 @nemequ

Testing with Docker/Podman & CI

test: add code to generate special vectors for better coverage d0be929 @nemequ

azure-pipelines: add commented out loongson build b860895 @nemequ

travis: add gcc-6 and clang-3.5 builds 721c925 @nemequ
travis: use GCC 10 for AArch64 build b3a1794 @nemequ
travis: Add MIPS Loongson-MMI (Compile Only) 6537329 @FlyGoat
travis: new package name for intel oneapi beta10 6f6a0b1 @rscohn2

gh-actions: add avx512 builds face9ad @nemequ
gh-actions: disable xcode 10.3 build fe52903 @nemequ
gh-actions: update repo before (trying to) install pcre2grep e92f9ae @nemequ
gh-actions: read /proc/cpuinfo 8b3b405 @nemequ

testing with docker improvements a5c5826 c0c8c01 cf0cf14 @nemequ
docker: assorted clean-ups and documentation improvements a5c5826 @nemequ
docker: add 32-bit x86 builds c0c8c01 0d5a036 @nemequ
docker: add POWER clang builds cf0cf14 @nemequ
docker: add loongson and mips64el+msa builds c30b910 @nemequ
docker: add -futur...


24 Aug 15:50

mr-c

v0.6.0

236e90a

v0.6.0

379 commits from 9 contributors, changing 273 files!

Full changelog


22 Jun 19:12

nemequ

v0.5.0

4e00ba3

0.5.0

I’m pleased to announce the availability of the first release of SIMD
Everywhere (SIMDe), version 0.5.0, representing more than three years
of work by over a dozen developers.

SIMDe is a permissively-licensed (MIT) header-only library which
provides fast, portable implementations of SIMD intrinsics for
platforms which aren’t natively supported by the API in question.

For example, with SIMDe you can use SSE on ARM, POWER, WebAssembly,
or almost any platform with a C compiler. That includes, of course,
x86 CPUs which don't support the ISA extension in question (e.g.,
calling AVX-512F functions on a CPU which doesn't natively support
them).

If the target natively supports the SIMD extension in question there
is no performance penalty for using SIMDe. Otherwise, accelerated
implementations, such as NEON on ARM, AltiVec on POWER, WASM SIMD on
WebAssembly, etc., are used when available to provide good
performance.
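
The layering described above (native instructions when available, another accelerated ISA otherwise, and plain C as a last resort) can be sketched roughly as follows. This is a simplified illustration with hypothetical names (`my_i32x4`, `my_add_i32x4`), not SIMDe's actual types or code.

```c
#include <stdint.h>
#if defined(__SSE2__)
#  include <emmintrin.h>
#elif defined(__ARM_NEON)
#  include <arm_neon.h>
#endif

/* Hypothetical 128-bit vector of four 32-bit integers. */
typedef struct { int32_t lanes[4]; } my_i32x4;

/* Lane-wise wrapping addition, choosing an implementation at compile time. */
static my_i32x4 my_add_i32x4(my_i32x4 a, my_i32x4 b) {
  my_i32x4 r;
#if defined(__SSE2__)
  /* Native x86 path. */
  __m128i va = _mm_loadu_si128((const __m128i *) a.lanes);
  __m128i vb = _mm_loadu_si128((const __m128i *) b.lanes);
  _mm_storeu_si128((__m128i *) r.lanes, _mm_add_epi32(va, vb));
#elif defined(__ARM_NEON)
  /* Accelerated path on ARM. */
  vst1q_s32(r.lanes, vaddq_s32(vld1q_s32(a.lanes), vld1q_s32(b.lanes)));
#else
  /* Portable fallback; unsigned arithmetic avoids signed-overflow UB. */
  for (int i = 0; i < 4; i++) {
    r.lanes[i] = (int32_t) ((uint32_t) a.lanes[i] + (uint32_t) b.lanes[i]);
  }
#endif
  return r;
}
```

The caller sees the same function on every target; only the body that the preprocessor selects changes, so a build that has the native extension pays nothing for the extra paths.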

SIMDe has already been used to port several packages to additional
architectures through either upstream support or distribution
packages, particularly on Debian.

If you'd like to play with SIMDe online, you can do so on Compiler
Explorer.

What is in 0.5.0

The 0.5.0 release is SIMDe’s first release. It includes complete
implementations of:

MMX
SSE
SSE2
SSE3
SSSE3
SSE4.1
AVX
FMA
GFNI

We also have rapidly progressing implementations of many other
extensions including NEON, AVX2, SVML, and several AVX-512 extensions
(AVX-512F, AVX-512BW, AVX-512VL, etc.).

Additionally, we have an extensive test suite to verify our
implementations.

What is coming next

Work on SIMDe is proceeding rapidly, but there are a lot of functions
to implement… x86 alone has about 6,000 SIMD functions, and we’ve
implemented about 2,000 of them. We will keep adding more functions
and improving the implementations we already have.

Our NEON implementation is being worked on very actively right now
by Sean Maher and Christopher Moore, and is expected to continue
progressing rapidly.

We currently have two Google Summer of Code students working on the
project as well; Hidayat Khan is working on finishing up AVX2, and
Himanshi Mathur is focused on SVML.

If you're interested in using SIMDe but need some specific functions
to be implemented first, please file an issue and we may be able to
prioritize those functions.

Getting Involved

If you're interested in helping out please get in touch. We have a
chat room on Gitter which is fairly active if you have questions, or
of course you can just dive right in on the issue tracker.


Releases: simd-everywhere/simde
