Add new generic matrix operators #137

stephenswat · 2024-11-19T11:21:16Z

Our current approach to matrix multiplication reads very naturally: `A = B*C` models $A = BC$. On GPUs, however, this causes a problem. If these matrices have size $N \times N$, then we first materialize an $N \times N$ temporary which we then have to copy element by element into $A$. This means that we need to keep that many registers live, which is a lot for e.g. $7 \times 7$ free matrices (whose temporary occupies 49 registers).

This problem can be alleviated by implementing optimized operators. More precisely, this PR implements the following new methods:

* `set_product(A, B, C)` computes $A = BC$ without intermediate values.
* `set_product_left_transpose(A, B, C)` computes $A = B^TC$ without intermediate values.
* `set_product_right_transpose(A, B, C)` computes $A = BC^T$ without intermediate values.
* `set_inplace_product_right(A, B)` computes $A = AB$ in place.
* `set_inplace_product_left(A, B)` computes $A = BA$ in place.
* `set_inplace_product_right_transpose(A, B)` computes $A = AB^T$ in place.
* `set_inplace_product_left_transpose(A, B)` computes $A = B^TA$ in place.
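
As a rough illustration of the idea (not the code from this PR), the sketch below writes each element of the product straight into the output matrix, so only a scalar accumulator, or one buffered row for the in-place variant, is live at a time. The `matrix` alias, its column-major `std::array` layout, and the function bodies are assumptions made for this example; only the names `set_product` and `set_inplace_product_right` come from the list above.

```cpp
#include <array>
#include <cstddef>

// Hypothetical column-major matrix type for illustration: m[col][row].
template <typename T, std::size_t R, std::size_t C>
using matrix = std::array<std::array<T, R>, C>;

// A = B * C, accumulating each output element in a single scalar.
// Assumption of this sketch: A does not alias B or C.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
void set_product(matrix<T, M, N>& A, const matrix<T, M, K>& B,
                 const matrix<T, K, N>& C) {
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      T acc = T(0);
      for (std::size_t k = 0; k < K; ++k) {
        acc += B[k][i] * C[j][k];  // B(i, k) * C(k, j)
      }
      A[j][i] = acc;  // A(i, j)
    }
  }
}

// A = A * B in place. Row i of the result depends only on row i of A,
// so buffering one row (N values) is enough instead of a full temporary.
template <typename T, std::size_t M, std::size_t N>
void set_inplace_product_right(matrix<T, M, N>& A,
                               const matrix<T, N, N>& B) {
  for (std::size_t i = 0; i < M; ++i) {
    std::array<T, N> row{};
    for (std::size_t k = 0; k < N; ++k) {
      row[k] = A[k][i];  // copy A(i, k) before it is overwritten
    }
    for (std::size_t j = 0; j < N; ++j) {
      T acc = T(0);
      for (std::size_t k = 0; k < N; ++k) {
        acc += row[k] * B[j][k];  // A(i, k) * B(k, j)
      }
      A[j][i] = acc;
    }
  }
}
```

The in-place variant shown here is only one way to avoid a full $N \times N$ temporary; the implementation merged in this PR may differ.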

Includes tests; depends on #134.

stephenswat · 2024-11-19T11:25:09Z

I was considering making the transpose on the left- and right-hand side a boolean parameter or even a boolean template parameter, but I wasn't sure... 🤔
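
For reference, the boolean-template alternative mentioned here could look roughly like the sketch below. This is not what the PR implements: `square_matrix`, `element`, and `set_product_generic` are hypothetical names, the column-major layout is assumed, and the example is restricted to square matrices to keep it short.

```cpp
#include <array>
#include <cstddef>

// Hypothetical column-major square matrix: m[col][row].
template <typename T, std::size_t N>
using square_matrix = std::array<std::array<T, N>, N>;

// Read element (i, j) of m, or of its transpose when Transpose is true.
template <bool Transpose, typename T, std::size_t N>
constexpr T element(const square_matrix<T, N>& m, std::size_t i,
                    std::size_t j) {
  if constexpr (Transpose) {
    return m[i][j];  // element (j, i) of m
  } else {
    return m[j][i];  // element (i, j) of m
  }
}

// A = op(B) * op(C), where op is the transpose if the matching flag is set.
// Assumption of this sketch: A does not alias B or C.
template <bool TransposeLeft, bool TransposeRight, typename T, std::size_t N>
void set_product_generic(square_matrix<T, N>& A, const square_matrix<T, N>& B,
                         const square_matrix<T, N>& C) {
  for (std::size_t i = 0; i < N; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      T acc = T(0);
      for (std::size_t k = 0; k < N; ++k) {
        acc += element<TransposeLeft>(B, i, k) *
               element<TransposeRight>(C, k, j);
      }
      A[j][i] = acc;
    }
  }
}
```

With this shape, `set_product_generic<true, false>(A, B, C)` would play the role of `set_product_left_transpose(A, B, C)`. The trade-off is fewer function names at the cost of less self-describing call sites.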

stephenswat · 2024-11-19T12:56:47Z

@niermann999 I think the generic plugin (#131) would actually be super useful here; do you think we can incorporate this after merging the generic plugin?

niermann999 · 2024-11-19T16:11:02Z

> @niermann999 I think the generic plugin (#131) would actually be super useful here; do you think we can incorporate this after merging the generic plugin?

Yes, sure. It should be possible to use this with any linear algebra backend, also the vectorized ones (I did this in the last PR with the determinant and inverse now, until I had time to look into optimizations). Unfortunately, the full implementation of the generic plugin kind of happened on the way, so it is three PRs down the pipeline...

stephenswat · 2024-11-21T13:43:07Z

@niermann999 do you have some ideas on how I would now move this to your new generic algebra plugin?

niermann999 · 2024-11-21T15:01:29Z

Can you also try to add this operation to the matrix benchmarks?

beomki-yeo · 2024-11-22T16:33:42Z

I am on board with the direction, but could you make these operators more CPU-vectorization friendly?

I am not sure if the same technique used here can be applied.

stephenswat · 2024-12-02T12:13:03Z

> I am onboard with the direction but could you make this operators more CPU-vectorization friendly?

Hi Beomki, thanks for the comment but I am not sure which of the functions you mean; where do you see any particular vectorization unfriendliness?

beomki-yeo · 2024-12-03T12:29:18Z

@stephenswat Sure. The following is part of `set_product`:

        for (size_type k = 0; k < N; ++k) {
          T += element_getter()(A, i, k) * element_getter()(B, k, j);
        }

The element getter reads A(i, k) in the inner loop. As the elements [A(i,0), A(i,1), ..., A(i, N-1)] are not adjacent to each other in memory, loading these values is not vectorization friendly.

But we can make this fix in one of the next PRs.
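
One possible direction for such a follow-up, sketched under the assumption that matrices are stored column-major (so that the elements of a column, not of a row, are contiguous), is to reorder the loops so the innermost loop walks down a column. The `matrix` alias and the name `set_product_column_wise` are hypothetical and not part of this PR.

```cpp
#include <array>
#include <cstddef>

// Hypothetical column-major matrix type for illustration: m[col][row].
template <typename T, std::size_t R, std::size_t C>
using matrix = std::array<std::array<T, R>, C>;

// A = B * C, computed one output column at a time:
//   A(:, j) = sum over k of B(:, k) * C(k, j)
// The innermost loop runs over i, so it reads B[k][0..M-1] and updates
// A[j][0..M-1], both of which are contiguous in this layout.
// Assumption of this sketch: A does not alias B or C.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
void set_product_column_wise(matrix<T, M, N>& A, const matrix<T, M, K>& B,
                             const matrix<T, K, N>& C) {
  for (std::size_t j = 0; j < N; ++j) {
    for (std::size_t i = 0; i < M; ++i) {
      A[j][i] = T(0);
    }
    for (std::size_t k = 0; k < K; ++k) {
      const T c = C[j][k];  // C(k, j), reused for the whole column
      for (std::size_t i = 0; i < M; ++i) {
        A[j][i] += B[k][i] * c;  // unit-stride access in i
      }
    }
  }
}
```

This form accumulates directly in the output column rather than in a scalar. Whether it actually vectorizes better, and how it behaves on GPU, would have to be checked in the matrix benchmarks.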

stephenswat · 2024-12-03T14:39:20Z

@niermann999 this is now ready for review.

common/include/algebra/concepts.hpp

tests/common/test_host_basics.hpp

stephenswat · 2024-12-03T17:00:33Z

Ah yes, more obscure MSVC clownery. 🤡

sonarqubecloud · 2024-12-04T14:18:25Z

Quality Gate failed

Failed conditions
21.1% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

stephenswat · 2024-12-04T15:54:23Z

@niermann999 finally passes the CI; what do you think?

stephenswat added the enhancement New feature or request label Nov 19, 2024

stephenswat requested review from krasznaa, niermann999 and beomki-yeo November 19, 2024 11:21

stephenswat force-pushed the feat/more_mat_ops branch from 376d545 to ef228a1 Compare December 3, 2024 14:36

stephenswat force-pushed the feat/more_mat_ops branch from ef228a1 to e486e2a Compare December 3, 2024 14:49

niermann999 reviewed Dec 3, 2024

stephenswat force-pushed the feat/more_mat_ops branch 5 times, most recently from c37e64c to 800ed7b Compare December 3, 2024 16:47

stephenswat force-pushed the feat/more_mat_ops branch from 800ed7b to 05fd7a7 Compare December 4, 2024 14:17

niermann999 approved these changes Dec 4, 2024

stephenswat merged commit d167b11 into acts-project:main Dec 4, 2024
27 of 28 checks passed
