intrinsic-test: sve support #2160
Conversation
c29cf85 to
6561db8
Compare
9ff4174 to
e817965
Compare
This comment was marked as resolved.
This comment was marked as resolved.
95f286a to
ed132a7
Compare
| } = **self | ||
| { | ||
| let quad = if self.num_lanes() * bl > 64 { "q" } else { "" }; | ||
| fn comparison_function(&self) -> String { |
There was a problem hiding this comment.
Is there some way to print the diff for SVE? I would imagine it is difficult due to SVE being non-const sized, but is there some way? That would massively improve debuggability. One approach I can suggest is have some small functions that convert from SVE vectors to &[T], e.g.
#[target_feature(enable = "sve")]
pub fn svfloat32_to_slice(a: &svfloat32_t) -> &[NanEqF32] {
unsafe {
core::slice::from_raw_parts(core::ptr::from_ref(a).cast(), svcntw() as usize)
}
}this might work, with significantly less complexity
There was a problem hiding this comment.
When I've needed to debug them, I've disabled the invocation of intrinsic-test in intrinsic-test.sh and added these snippets to the generated tests that I wanted to debug.
For non-bool vectors, just replace the x0_val and x1_val with the vectors that you want to inspect, e.g. __rust_return_value or __c_return_value:
{
let num_elems = svcnth() as usize;
let mut x0_buf = Vec::with_capacity(num_elems);
let mut x1_buf = Vec::with_capacity(num_elems);
svst1_u16(__pred, x0_buf.as_mut_ptr(), x0_val);
x0_buf.set_len(num_elems);
svst1_u16(__pred, x1_buf.as_mut_ptr(), x1_val);
x1_buf.set_len(num_elems);
for i in 0..num_elems {
let x0_val = x0_buf[i];
dbg!(i, x0_val);
}
for i in 0..num_elems {
let x1_val = x1_buf[i];
dbg!(i, x1_val);
}
}Similarly for bool vectors:
{
let num_elems = svcntb() as usize;
let mut _op_val = Vec::with_capacity(num_elems);
svst1_u8(op_val, _op_val.as_mut_ptr(), svdup_n_u8(1));
_op_val.set_len(num_elems);
for i in 0..num_elems {
let _op_val_el = if _op_val[i] == 1 { true } else { false };
dbg!(i, _op_val_el);
}
}I'm not sure if what you've suggested will work, happy to try. I'm more than happy to make some improvements here for debuggability of the tests when they do fail, but could that be left to a follow-up?
There was a problem hiding this comment.
Ended up doing this in the current patch because it was useful for working out what was going on with CI failures
There was a problem hiding this comment.
Sorry, didn't understand what you meant by this and the current patch, the GH version shows the sveor version. Also, https://godbolt.org/z/a481YdW1P seems to work.
| " | ||
| {intrinsics} | ||
| "#, | ||
| intrinsics = intrinsics.iter().format_with("", |intrinsic, fmt| { |
There was a problem hiding this comment.
very minor nit, but you can use something like intrinsic.specializations().for_each(|imm_values| fmt(...))
There was a problem hiding this comment.
I wouldn't have a fmt to call if I used for_each
|
also, it would be nice if this can be split into a few focused PRs (I can help if you want) |
|
Submitted rust-lang/rust#158088 to fix the failure in debug mode - should pass CI after that |
I'll try and split some out tomorrow |
…o-no-opt, r=lqd codegen_ssa: no dbginfo for scalable vec local w/ `-O0` LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed. cc llvm/llvm-project#204585 cc rust-lang/stdarch#2160 r? @lqd
Rollup merge of #158088 - davidtwco:scalable-vector-debuginfo-no-opt, r=lqd codegen_ssa: no dbginfo for scalable vec local w/ `-O0` LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed. cc llvm/llvm-project#204585 cc rust-lang/stdarch#2160 r? @lqd
… r=lqd codegen_ssa: no dbginfo for scalable vec local w/ `-O0` LLVM uses GlobalISel with `-O0` that doesn't support scalable vectors. It normally falls back to SDAG which does support scalable vectors, but there's a bug that means that isn't happening for debuginfo - so temporarily don't emit debuginfo for scalable vector locals when there are no optimisations until that bug is fixed. cc llvm/llvm-project#204585 cc #2160 r? @lqd
|
Split out the first batch of changes in #2169 |
ed132a7 to
ea73af3
Compare
|
Split out the second batch of changes in #2170 |
|
Split out the third batch of changes in #2171 |
|
Split out fourth batch of changes in #2172 |
|
Split out final batch of changes in #2173. Shouldn't be much less other than the actual SVE testing implementation here after those are merged, but there will be a few merge conflicts between them as they land. |
This comment has been minimized.
This comment has been minimized.
ea73af3 to
060d803
Compare
|
Rebased after all the prerequisites landed - should pass CI now, works in dev and release locally and the rustc patches are all in nightly - but I'll check back in tomorrow |
This is just more helpful for knowing what all needs to be fixed when CI fails.
Replacing `iter_specializations` (which repeatedly invokes a callback) with an iterator implementation allows `Itertools::format_with` to be used more broadly, which in turn allows disparate string interpolation to be combined and hopefully provide greater context to the reader.
Updates `get_load_function` to return `svld{n}_{ty}` when loading a
scalable vector type. Caller of `get_load_function` will still need
updated to handle passing the predicate arguments to these load
functions.
Updates the headers used by generated C code and the target feature flags passed to the C compiler to enable SVE.
Some SVE intrinsics take booleans as arguments, so there is a need to support generating a test value array for booleans.
Constraints that correspond to enum types - such as `svpattern` and `svprfop` - need to be converted to the enum type in order to be used in a generic instantiation - so introduce a const function for both types that provides this mapping.
Predicate arguments of type `svbool_t` do not need test value arrays to be generated as the same enable-all-lanes predicate will be passed to all invocations of the intrinsic under test. There is no `svld1` equivalent for `svbool_t` that could be used even if there were test values to use.
Support defining a local variable containing the predicate that will be used with all subsequent scalable vector intrinsics.
Implementation of `get_comparison_function` and `get_predicate_function` for SVE which uses the relevant SVE intrinsics.
Instead of assuming that any scalable boolean argument is a predicate, handle predicates specifically and generate test values for `svbool_t` values.
SVE isn't a baseline target feature for `aarch64-unknown-linux-gnu` but should be enabled when running tests.
Like with non-SVE test generation, comparison of float results in scalable vectors need special-handling of comparisons.
Enables generation of tests for SVE intrinsics leveraging the changes from the previous commits.
This makes it far easier to debug what's potentially gone wrong with an intrinsic test.
060d803 to
c04381b
Compare
Same as previous redefinitions in stdarch#2163 - these were missed in that PR because the hardware being tested on was missing the hardware feature required for the instructions these use.
|
Pushed c89cacc to fix the failures with Only remaining failures are |
I think it's just that rust-lang/rust#157915 is wrong - it works by chance for me locally because of the vector length my hardware has (and let me assume that maybe |
|
This will be fixed once rust-lang/rust#158253 lands and is on nightly |
This patch contains all the changes necessary for
intrinsic-testto generate tests for the SVE intrinsics:Some of the commits are just general refactors to the tool, motivated by the commits that follow and that I felt were hard to justify on their ownpopulate_randomand related code #2126, intrinsic-test: removearm::argumentmodule #2127)SupportedArchitecture#2170, intrinsic-test: removeconcatln!and redundant newlines #2171, intrinsic-test: simplify architecture constants #2172, intrinsic-test: simplify architecture constants #2172, intrinsic-test: simplify type printing + comparison abstraction #2173)Some of the commits are small changes I made along the way (e.g. forwarding args fromintrinsic-test.shtocargo test)I could split these out if we wantedintrinsic-test.sh#2164Some of the commits modify the definitions of intrinsics so that they pass tests - this only happened for a handful of intrinsics and the changes required to their definitions were very minorI could split these out if we wantedsvrev,svzipandsvuzp#2163Some of the commits updatearm_intrinsics.jsonto add some additional constraints necessary to generate the right tests (we're making sure that these addl. constraints are reflected in the source of truth we use to generate that file internally)I could split these out if we wantedarm_intrinsics.jsonforsvsetandsvget#2162arm_sve.hheaders and appropriate target feature flags for C and Rustsvld1intrinsic when loading an argument of scalable vector typesvbool_targuments which do not have ansvld1intrinsic, we load asvint8_tand usesvcmpne_n_s8against zero to produce asvbool_tfrom itsvbool_tvalues (though the earlier commits use the samesvptrue_$typredicate for these arguments initially)svpattern), using simple const functions that map integers from intrinsic-test constraints to the appropriate variantsvptrue_$ty()to enable every lane)svcmpeq_$tyandsvptest_anyto compare the Rust and C results instead ofassert_eq!svcmpeq_$tyandsvptest_anyblocks are generated for each vector in the tuple, withsvgetNcalls to extract the appropriate vector from the result tuplessvbool_t, where there is no equivalentsvcmpeq_$tycall, sosveor_b_z(exclusive OR) and!svptest_anyis usedNanEqF*type cannot be used, so a(rust == c) || (isnan(rust) && isnan(c))check is performed instead, via a handful of SVE intrinsics