Masked compare and floating point classifications #2427
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1004,6 +1004,24 @@ Per-lane variable shifts (slow if SSSE3/SSE4, or 16-bit, or Shr i64 on AVX2): | |
neither NaN nor infinity, i.e. normal, subnormal or zero. Equivalent to | ||
`Not(Or(IsNaN(v), IsInf(v)))`. | ||
|
||
#### Masked floating-point classification | ||
|
||
All ops in this section return `false` for `mask=false` lanes. These are | ||
equivalent to, and potentially more efficient than, `And(m, IsNaN(v));` etc. | ||
|
||
* `V`: `{f}` \ | ||
<code>M **MaskedIsNaN**(V v)</code>: returns mask indicating whether `v[i]` | ||
Let's add the mask argument to the documentation :) |
||
is "not a number" (unordered) or `false` if `m[i]` is false. | ||
|
||
* `V`: `{f}` \ | ||
<code>M **MaskedIsInf**(V v)</code>: returns mask indicating whether `v[i]` | ||
is positive or negative infinity or `false` if `m[i]` is false. | ||
|
||
* `V`: `{f}` \ | ||
<code>M **MaskedIsFinite**(V v)</code>: returns mask indicating whether | ||
`v[i]` is neither NaN nor infinity, i.e. normal, subnormal or zero or | ||
`false` if `m[i]` is false. Equivalent to `Not(Or(IsNaN(v), IsInf(v)))`. | ||
|
||
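The lane-wise semantics described for the masked classification ops can be sketched with a plain scalar model. This is not Highway code: `ModelMask`, `ModelVec` and the `Model*` functions are hypothetical names introduced only for illustration, with one `bool` per lane standing in for a mask and the mask passed as the first argument, as in the implementation further down in this PR.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Scalar stand-ins (assumptions, not Highway types): one entry per lane.
template <std::size_t N>
using ModelMask = std::array<bool, N>;
template <std::size_t N>
using ModelVec = std::array<float, N>;

// Lane i is true only if m[i] is true and v[i] is NaN.
template <std::size_t N>
ModelMask<N> ModelMaskedIsNaN(const ModelMask<N>& m, const ModelVec<N>& v) {
  ModelMask<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && std::isnan(v[i]);
  return out;
}

// Lane i is true only if m[i] is true and v[i] is +inf or -inf.
template <std::size_t N>
ModelMask<N> ModelMaskedIsInf(const ModelMask<N>& m, const ModelVec<N>& v) {
  ModelMask<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && std::isinf(v[i]);
  return out;
}

// Lane i is true only if m[i] is true and v[i] is neither NaN nor infinity.
template <std::size_t N>
ModelMask<N> ModelMaskedIsFinite(const ModelMask<N>& m, const ModelVec<N>& v) {
  ModelMask<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && std::isfinite(v[i]);
  return out;
}
```

In this model a NaN or finite value in a lane whose mask bit is false always yields `false`, which is the behaviour the section above specifies for `mask=false` lanes.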
### Logical | ||
|
||
* `V`: `{u,i}` \ | ||
|
@@ -1477,6 +1495,29 @@ These return a mask (see above) indicating whether the condition is true. | |
for comparing 64-bit keys alongside 64-bit values. Only available if | ||
`HWY_TARGET != HWY_SCALAR`. | ||
|
||
#### Masked comparison | ||
|
||
All ops in this section return `false` for `mask=false` lanes. These are | ||
equivalent to, and potentially more efficient than, `And(m, Eq(a, b));` etc. | ||
|
||
* <code>M **MaskedCompEq**(M m, V a, V b)</code>: returns `a[i] == b[i]` or | ||
Should we just call it MaskedEq for consistency with the usual naming convention, which is only to prepend |
||
`false` if `m[i]` is false. | ||
|
||
* <code>M **MaskedCompNe**(M m, V a, V b)</code>: returns `a[i] != b[i]` or | ||
`false` if `m[i]` is false. | ||
|
||
* <code>M **MaskedCompLt**(M m, V a, V b)</code>: returns `a[i] < b[i]` or | ||
`false` if `m[i]` is false. | ||
|
||
* <code>M **MaskedCompGt**(M m, V a, V b)</code>: returns `a[i] > b[i]` or | ||
`false` if `m[i]` is false. | ||
|
||
* <code>M **MaskedCompLe**(M m, V a, V b)</code>: returns `a[i] <= b[i]` or | ||
`false` if `m[i]` is false. | ||
|
||
* <code>M **MaskedCompGe**(M m, V a, V b)</code>: returns `a[i] >= b[i]` or | ||
`false` if `m[i]` is false. | ||
|
||
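The claim that the masked comparisons are equivalent to `And(m, Eq(a, b))` and its siblings can be checked on a scalar model. The sketch below is an assumption-laden illustration rather than Highway code: `LaneMask`, `Lanes`, `ModelMaskedCompare` and `ModelCompareThenAnd` are hypothetical names, and a standard-library predicate such as `std::less` stands in for the comparison being applied.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Scalar stand-ins (assumptions, not Highway types): one entry per lane.
template <std::size_t N>
using LaneMask = std::array<bool, N>;
template <typename T, std::size_t N>
using Lanes = std::array<T, N>;

// Fused form: lane i is pred(a[i], b[i]) if m[i] is true, otherwise false.
template <typename T, std::size_t N, typename Pred>
LaneMask<N> ModelMaskedCompare(const LaneMask<N>& m, const Lanes<T, N>& a,
                               const Lanes<T, N>& b, Pred pred) {
  LaneMask<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && pred(a[i], b[i]);
  return out;
}

// Two-step form: an unmasked comparison followed by a lane-wise And with m.
template <typename T, std::size_t N, typename Pred>
LaneMask<N> ModelCompareThenAnd(const LaneMask<N>& m, const Lanes<T, N>& a,
                                const Lanes<T, N>& b, Pred pred) {
  LaneMask<N> cmp{};
  for (std::size_t i = 0; i < N; ++i) cmp[i] = pred(a[i], b[i]);
  LaneMask<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && cmp[i];
  return out;
}
```

Both forms produce the same mask for every predicate in this model. The potential efficiency gain mentioned above would come from targets that can evaluate the fused form in a single predicated instruction, which a scalar model cannot demonstrate.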
### Memory | ||
|
||
Memory operands are little-endian, otherwise their order would depend on the | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1783,6 +1783,77 @@ HWY_API svbool_t IsFinite(const V v) { | |
return RebindMask(d, detail::LtN(exp, hwy::MaxExponentField<T>())); | ||
} | ||
|
||
// ------------------------------ MaskedCompEq etc. | ||
#ifdef HWY_NATIVE_MASKED_COMP | ||
#undef HWY_NATIVE_MASKED_COMP | ||
#else | ||
#define HWY_NATIVE_MASKED_COMP | ||
#endif | ||
|
||
// mask = f(mask, vector, vector) | ||
#define HWY_SVE_COMPARE_Z(BASE, CHAR, BITS, HALF, NAME, OP) \ | ||
HWY_API svbool_t NAME(svbool_t m, HWY_SVE_V(BASE, BITS) a, \ | ||
HWY_SVE_V(BASE, BITS) b) { \ | ||
return sv##OP##_##CHAR##BITS(m, a, b); \ | ||
} | ||
|
||
namespace detail { | ||
HWY_SVE_FOREACH(HWY_SVE_COMPARE_Z, MaskedEq, cmpeq) | ||
I think we can expose these directly. Rather than putting them in detail:: and adding a wrapper function, you can just remove the namespace detail, and specify the desired name of the op as the second to last argument (MaskedEq is already good IMO). |
||
HWY_SVE_FOREACH(HWY_SVE_COMPARE_Z, MaskedNe, cmpne) | ||
HWY_SVE_FOREACH(HWY_SVE_COMPARE_Z, MaskedLt, cmplt) | ||
HWY_SVE_FOREACH(HWY_SVE_COMPARE_Z, MaskedLe, cmple) | ||
|
||
} // namespace detail | ||
|
||
#undef HWY_SVE_COMPARE_Z | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompEq(M m, V a, V b) { | ||
return detail::MaskedEq(m, a, b); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompNe(M m, V a, V b) { | ||
return detail::MaskedNe(m, a, b); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompLt(M m, V a, V b) { | ||
return detail::MaskedLt(m, a, b); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompGt(M m, V a, V b) { | ||
// Swap args to reverse comparison | ||
return detail::MaskedLt(m, b, a); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompLe(M m, V a, V b) { | ||
return detail::MaskedLe(m, a, b); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedCompGe(M m, V a, V b) { | ||
// Swap args to reverse comparison | ||
return detail::MaskedLe(m, b, a); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedIsInf(const M m, const V v) { | ||
Do we ever plan to provide a faster implementation of MaskIfInf/IsFinite, or can those be removed? |
||
return And(m, IsInf(v)); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedIsFinite(const M m, const V v) { | ||
return And(m, IsFinite(v)); | ||
} | ||
|
||
template <class V, class M, class D = DFromV<V>> | ||
HWY_API MFromD<D> MaskedIsNaN(const M m, const V v) { | ||
return detail::MaskedNe(m, v, v); | ||
} | ||
|
||
// ================================================== MEMORY | ||
|
||
// ------------------------------ LoadU/MaskedLoad/LoadDup128/StoreU/Stream | ||
|
And(m, IsNaN)?
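
Two techniques in the diff above can be illustrated on a scalar model: deriving "greater than" from a predicated "less than" by swapping the arguments, and detecting NaN by comparing a value with itself. The sketch also includes the form suggested in the last review comment so the two NaN checks can be compared. All names here (`BoolLanes`, `FloatLanes`, `ModelLt`, `ModelNe`, and so on) are hypothetical stand-ins, not Highway or SVE functions.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Scalar stand-ins (assumptions): one entry per lane.
template <std::size_t N>
using BoolLanes = std::array<bool, N>;
template <std::size_t N>
using FloatLanes = std::array<float, N>;

// Model of a predicated "less than": inactive lanes are false.
template <std::size_t N>
BoolLanes<N> ModelLt(const BoolLanes<N>& m, const FloatLanes<N>& a,
                     const FloatLanes<N>& b) {
  BoolLanes<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && (a[i] < b[i]);
  return out;
}

// Model of a predicated "not equal": inactive lanes are false.
template <std::size_t N>
BoolLanes<N> ModelNe(const BoolLanes<N>& m, const FloatLanes<N>& a,
                     const FloatLanes<N>& b) {
  BoolLanes<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && (a[i] != b[i]);
  return out;
}

// a > b is the same condition as b < a, so swapping the arguments suffices.
template <std::size_t N>
BoolLanes<N> ModelGt(const BoolLanes<N>& m, const FloatLanes<N>& a,
                     const FloatLanes<N>& b) {
  return ModelLt(m, b, a);
}

// NaN is the only floating-point value that compares unequal to itself.
template <std::size_t N>
BoolLanes<N> ModelIsNaNViaNe(const BoolLanes<N>& m, const FloatLanes<N>& v) {
  return ModelNe(m, v, v);
}

// Alternative raised in review: an unmasked NaN test combined with the mask.
template <std::size_t N>
BoolLanes<N> ModelIsNaNViaAnd(const BoolLanes<N>& m, const FloatLanes<N>& v) {
  BoolLanes<N> out{};
  for (std::size_t i = 0; i < N; ++i) out[i] = m[i] && std::isnan(v[i]);
  return out;
}
```

In this model both NaN checks return identical masks, so the choice between them would rest on which form maps to fewer instructions on the target, something a scalar sketch cannot settle.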