This repository has been archived by the owner on Apr 28, 2023. It is now read-only.

Support --mcpu flag and autodetect to trigger vectorization on proper vector lengths #588

Open: wants to merge 7 commits into base: master

@nicolasvasilache (Contributor) commented Jul 26, 2018

This PR passes proper llvm::TargetMachine information in
llvm_jit and codegen_llvm by introducing a TargetMachine
at the LLVMJit level, which avoids introducing ad hoc objects.

The TargetMachine is constructed from the --mcpu flag if it is
passed, and otherwise from the cpuid information.
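
The selection rule above can be sketched in a few lines of C++. This is only an illustration of the precedence (explicit flag first, host detection second), not the code from this PR: `resolveCpuName` and its `detectHostCpu` callback are hypothetical names. In an LLVM build, one plausible callback would wrap `llvm::sys::getHostCPUName()`, but the sketch deliberately avoids depending on LLVM headers.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper (not taken from the PR): choose the CPU name that
// would be used to build the TargetMachine.
// - A non-empty --mcpu value always wins.
// - Otherwise the host-detection callback is consulted.
std::string resolveCpuName(
    const std::string& mcpuFlag,
    const std::function<std::string()>& detectHostCpu) {
  if (!mcpuFlag.empty()) {
    return mcpuFlag;
  }
  return detectHostCpu();
}
```

Passing the detection step as a callback keeps the precedence logic testable on any machine, independently of what the host CPU actually is.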

As a consequence of all this, one can now emit AVX2 and AVX512 code.
Before this commit, the TargetMachine was essentially a default one
and only AVX code would be generated.
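
To make the link between the detected CPU and the vector length concrete, here is a minimal sketch, assuming an x86 host and a GCC or Clang compiler. It uses the compiler builtin `__builtin_cpu_supports` rather than whatever cpuid mechanism the PR itself uses; `widestVectorIsa` and `vectorBits` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: name of the widest AVX-class extension the host
// reports, or "baseline" if none is detected (or the host is not x86).
std::string widestVectorIsa() {
#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
  __builtin_cpu_init();  // Harmless if the CPU model was already initialized.
  if (__builtin_cpu_supports("avx512f")) return "avx512f";
  if (__builtin_cpu_supports("avx2")) return "avx2";
  if (__builtin_cpu_supports("avx")) return "avx";
#endif
  return "baseline";
}

// Vector register width in bits for each extension name used above.
// Returns 0 when no AVX-class extension is named.
int vectorBits(const std::string& isa) {
  if (isa == "avx512f") return 512;           // ZMM registers
  if (isa == "avx" || isa == "avx2") return 256;  // YMM registers
  return 0;
}
```

With 32-bit floats, `vectorBits(isa) / 32` gives the number of lanes per register: 8 for AVX and AVX2, 16 for AVX512.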

To test this and inspect the output, run:

```
cd build && \
make -j 16 test_mapper_llvm && \
./test/test_mapper_llvm --logtostderr=1 --llvm_dump_asm=1 --llvm_dump_after_opt=1 --llvm_dump_before_opt=1 --gtest_filter="*Batch*" --mcpu=skylake
```

Of course, if one forces a newer architecture than the host actually
has, the generated code will likely contain instructions that are
illegal on the host, but at least the asm will be printed properly.

It will be reused in another location in the next commit.

We are now on trunk; this was left over from the Tapir days.

This commit introduces a --llvm_dump_asm flag and the corresponding
--llvm_dump_asm_options to emit assembly to LOG(INFO).

This commit adds the --mcpu flag to TC so we can emit LLVM IR and asm
for different architectures.

This commit introduces and uses cpuid information to pass the
proper llc `mcpu` flag.

This commit avoids leaking all the guts of LLVMCodegen to the
emitLLVMKernel function and restructures some code.

@nicolasvasilache changed the title from "Pr/more cpu mapper" to "Support --mcpu flag and autodetect to trigger vectorization on proper vector lengths" on Jul 27, 2018

@facebook-github-bot

Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours has expired.

Before we can review or merge your code, we need you to email [email protected] with your details so we can update your status.
