MLIR-based CI Configuration and GitHub Actions Caches

Michel Weber edited this page Nov 30, 2022 · 3 revisions

This page documents the configuration of our MLIR-based CI and explains how it uses GitHub Actions caches.

CI configuration

  • The MLIR-based CI workflow is triggered on PRs and on pushes to main.
  • Only the commit we need is cloned from the MLIR repo (no history, no other branches). The commit hash is hardcoded as an environment variable.
  • The workflow compiles only what is needed for the mlir-opt executable and the MLIR Python bindings, as both are required by our tests.
  • If we want to speed this CI job up further, we could pre-compile the mlir-opt executable and pull it in on each run. Note, however, that this would require a manual recompilation whenever we update the commit hash for the CI, whereas the current setup handles this fully automatically.
  • Note that this CI job only runs the tests in tests/filecheck/mir-conversion/with-bindings, as all other tests are already run in the Python-based Testing job.
  • Currently, the ninja installation seems to have issues with deprecated Node.js versions. This might cause the workflow to fail in summer 2023. I am piggybacking off some of Google's repos, so I assume they will update this when it becomes really necessary. Still, we should keep this in mind.
  • To avoid compiling all of MLIR on every run, the workflow uses CCache for gh actions to cache compiled files. This cache is stored in the GitHub Actions cache and restored before each rebuild.
  • If we need to, this cache can be cleaned by following this.
  • Note that the cached compiled files still need to be linked again on every run. I moved to clang with LLD to speed this up.
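
The single-commit clone described in the list above can be sketched as follows. This is a minimal illustration, not the actual workflow step: the function names are hypothetical, and it assumes the usual git approach of initialising an empty repository and fetching one commit with depth 1, which only works if the server allows fetching by commit hash.

```python
import subprocess
from typing import List


def single_commit_clone_commands(repo_url: str, commit: str, dest: str) -> List[List[str]]:
    """Build the git invocations that fetch exactly one commit.

    No history and no branches are downloaded: only the requested commit.
    All names here are illustrative, not taken from the real workflow.
    """
    return [
        ["git", "init", dest],
        ["git", "-C", dest, "remote", "add", "origin", repo_url],
        # --depth 1 limits the fetch to the commit itself (no parents).
        ["git", "-C", dest, "fetch", "--depth", "1", "origin", commit],
        ["git", "-C", dest, "checkout", "FETCH_HEAD"],
    ]


def clone_single_commit(repo_url: str, commit: str, dest: str) -> None:
    """Run the commands above, stopping at the first failure."""
    for cmd in single_commit_clone_commands(repo_url, commit, dest):
        subprocess.run(cmd, check=True)
```

In a CI job, the commit argument would come from the hardcoded environment variable, so updating the MLIR version means changing a single value.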

GitHub Actions caches (official documentation)

GitHub Actions caches can be used to store data between CI runs. This saves a lot of execution time: building MLIR takes about 2 hours, which would make it hard to iterate quickly on PRs. These caches can be shared not only among CI runs on one branch, but also across different branches. It works as follows:

Caches are created with a specific "ref". The ref identifies the event that triggered the CI run: either a) a push to a branch or b) a PR opened on a branch. A GitHub Actions job only has access to some caches. In case a), it can access caches created with the same ref and with the ref of the default branch (i.e. main). In case b), it can access the caches of the PR ref, the ref of the branch to be merged, the ref of the base branch, and the ref of main.

So say we have feature-a based on main and feature-b based on feature-a. On a push to the feature-b branch, the resulting CI job has access to all caches created on its own ref and on the main ref. On opening a PR to merge feature-b into feature-a, the CI job has access to all caches with ref feature-b, feature-a, and main. In subsequent CI runs on the same PR, it also gains access to caches created by previous CI runs on this PR.
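
The access rules above can be written down as a small model. This is only a sketch of the rules as described on this page, not GitHub's actual lookup logic; the function names are hypothetical, and the PR ref string uses a made-up PR number.

```python
from typing import Set

DEFAULT_BRANCH = "main"


def refs_for_push(branch: str) -> Set[str]:
    """Refs whose caches a push-triggered job can read (case a)."""
    return {branch, DEFAULT_BRANCH}


def refs_for_pull_request(pr_ref: str, head: str, base: str) -> Set[str]:
    """Refs whose caches a PR-triggered job can read (case b).

    pr_ref is the PR's own ref, head is the branch to be merged,
    and base is the branch it is merged into.
    """
    return {pr_ref, head, base, DEFAULT_BRANCH}


# The feature-a / feature-b example from the text.
push_access = refs_for_push("feature-b")
pr_access = refs_for_pull_request("refs/pull/1/merge", "feature-b", "feature-a")

# A cache created by a push to feature-a is invisible to the push job
# on feature-b, but visible to the PR job that merges into feature-a.
print("feature-a" in push_access)
print("feature-a" in pr_access)
```

The difference between the two sets is why a push and a PR on the same branch can behave differently with respect to cache hits.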

For a specific example, look at https://github.com/xdslproject/xdsl/pull/248. This is a branch on top of mlir_based_ci that is to be merged back into it. Two actions were triggered: one by the push to this branch, the other by the PR. For the push, you can see that the job was terminated (by hand) because the cache was not found. The CI job triggered by the PR creation, on the other hand, succeeded in 3 minutes because it found the proper cache. This cache had previously been created on the ref of the mlir_based_ci branch by a push to that branch.

In our repo, the merge of #243 triggered a push and therefore created a cache on the main ref. Future PRs can then access the caches on this ref. However, looking up caches on other refs has to be done with the restore-keys parameter. The normal key parameter only searches on the ref of the particular job.