Reusing a thread variable across iterations with parallelFor #45
andrewmarx asked this question in Q&A
If I understand correctly, you really need access to the state of individual threads. This is not directly possible with parallelFor(), but you can get the same effect by giving every thread its own vector:
// create a dedicated vector of zeros for every thread
int nThreads = 4;
int sz = 100;
std::vector<VectorXd> xs(nThreads);
for (auto &x : xs)
    x = VectorXd::Zero(sz);
auto fun = [&] (unsigned int i) {
    // for convenience: short handle for the zero vector dedicated to the current thread
    // (assumes sz is divisible by nThreads and batches are equal, contiguous index ranges)
    VectorXd& x = xs[(static_cast<size_t>(i) * nThreads) / sz];
    x(i) = 1;
    // do something with x ...
    x(i) = 0;  // reset so x is all zeros again for the next iteration
};
parallelFor(0, sz, fun, nThreads, nThreads);  // number of threads = number of batches
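For comparison, here is a minimal sketch of the same idea that does not depend on how parallelFor() assigns indices to batches. It is not RcppThread code: it uses plain std::thread, and std::vector<double> stands in for VectorXd. The names forEachWithThreadBuffer and bufferSums are hypothetical and only serve the illustration. Each thread allocates its zero vector once and reuses it for every index in its own contiguous range.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (not part of RcppThread): calls body(i, buffer) for every
// i in [0, n). `buffer` is a length-n vector owned by the calling thread; it is
// all zeros except for buffer[i] == 1 while body runs.
template <class F>
void forEachWithThreadBuffer(std::size_t n, std::size_t nThreads, F body) {
    // Use at least one thread and never more threads than iterations.
    nThreads = std::max<std::size_t>(1, std::min(nThreads, n));
    std::vector<std::thread> workers;
    workers.reserve(nThreads);
    for (std::size_t t = 0; t < nThreads; ++t) {
        // Contiguous index range [begin, end) handled by thread t.
        const std::size_t begin = n * t / nThreads;
        const std::size_t end = n * (t + 1) / nThreads;
        workers.emplace_back([=, &body] {
            std::vector<double> buffer(n, 0.0);  // allocated once per thread
            for (std::size_t i = begin; i < end; ++i) {
                buffer[i] = 1.0;   // move the single non-zero entry
                body(i, buffer);
                buffer[i] = 0.0;   // restore the zero vector for the next index
            }
        });
    }
    for (auto& w : workers)
        w.join();
}

// Usage sketch: store the sum of the thread's buffer for every iteration.
// Each iteration writes only its own slot of `sums`, so no locking is needed.
std::vector<double> bufferSums(std::size_t n, std::size_t nThreads) {
    std::vector<double> sums(n, 0.0);
    forEachWithThreadBuffer(n, nThreads,
        [&](std::size_t i, const std::vector<double>& x) {
            sums[i] = std::accumulate(x.begin(), x.end(), 0.0);
        });
    return sums;
}
```

With this layout the number of allocations equals the number of threads rather than the number of iterations, and the mapping from index to buffer is explicit instead of being inferred from the batch layout.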
For various reasons, I'm experimenting with swapping out RcppParallel for RcppThread in my package. Key to the performance of my package is being able to reuse the same copy of a variable across iterations of the for loop, with each thread getting its own copy. This is important for me because the variable is an RcppEigen vector with a length equal to the total number of iterations, and the minimum number of iterations my package needs to support is a million. That's almost 4TB of memory allocations for what is, ironically, essentially a vector of zeros with only a single non-zero value whose position is updated based on the iteration. Ideally, I'd like to make 100 million iterations viable... 😬 So in RcppParallel, I could initialize a copy of this variable in the worker, then in the for loop simply update which element had the non-zero value, avoiding an excessive amount of memory allocations just to change 8 bytes.
My question is, is it possible to do something like this with parallelFor()? Essentially, have each thread get a copy of this vector that they reuse across iterations? If not, it looks like I could conceivably do this by getting creative with the thread pools, but parallelFor() is so much nicer to use.