Represents the various scheduling strategies for parallel for loops. Detailed explanations of each scheduling strategy are provided alongside each getter. If no schedule is specified, the default is Schedule.Static. More...
Placeholder for the runtime scheduler. Is not meant to be called directly. The Parallel.FixArgs method should detect its existence and swap it out for another scheduler with implementations. More...
Contains relevant internal information about parallel regions, including the threads and the function to be executed. Provides a region-wide lock and SpinWait objects for each thread
Encapsulates a Thread object with information about its progress through a parallel for loop. To track that progress, it records the next iteration of the loop to be worked on and the iteration the thread is currently working on
Contains all relevant information about a parallel for loop. Contains a collection of Thr objects, the loop's start and end iterations, the chunk size, the number of threads, and the number of threads that have completed their work
The main class of DotMP. Contains all the main methods for parallelism. For users, this is the main class you want to worry about, along with Lock, Shared, and Atomic
Static class that contains necessary information for sections. Sections allow for the user to submit multiple actions to be executed in parallel. A sections region contains a collection of actions to be executed, specified as Parallel.Section directives. More information can be found in the Parallel.Sections documentation
A shared variable that can be used in a parallel region. This allows for a variable to be declared inside of a parallel region that is shared among all threads, which has some nice use cases
A specialization of Shared for items that can be indexed with square brackets. The DotMP-parallelized Conjugate Gradient example shows this off fairly well inside of the SpMV function
Class encapsulating all of the possible callbacks in a Parallel.For-style loop. This includes Parallel.For, Parallel.ForReduction<T>, Parallel.ForCollapse, and Parallel.ForReductionCollapse<T>
The static scheduling strategy. Iterations are divided amongst threads in round-robin fashion. Each thread gets a 'chunk' of iterations, determined by the chunk size. If no chunk size is specified, it's computed as total iterations divided by number of threads. More...
The dynamic scheduling strategy. Iterations are managed in a central queue. Threads fetch chunks of iterations from this queue when they have no assigned work. If no chunk size is defined, a basic heuristic is used to determine a chunk size. More...
The guided scheduling strategy. Similar to dynamic, but the chunk size starts larger and shrinks as iterations are consumed. The shrinking formula is based on the remaining iterations divided by the number of threads. The chunk size parameter sets a minimum chunk size. More...
Runtime-defined scheduling strategy. Schedule is determined by the 'OMP_SCHEDULE' environment variable. Expected format: "schedule[,chunk_size]", e.g., "static,128", "guided", or "dynamic,3". More...
Class encapsulating all of the possible callbacks in a Parallel.For-style loop. This includes Parallel.For, Parallel.ForReduction<T>, Parallel.ForCollapse, and Parallel.ForReductionCollapse<T>.
Template Parameters: T - The type of the reduction callback.
Calculates the next chunk of iterations for a static scheduling parallel for loop and executes the appropriate function. Each time this function is called, the calling thread receives a chunk of iterations to work on, as specified in the Iter.StaticLoop<T> documentation. More...
Calculates the next chunk of iterations for a dynamic scheduling parallel for loop and executes the appropriate function. Each time this function is called, the calling thread receives a chunk of iterations to work on, as specified in the Iter.DynamicLoop<T> documentation. More...
Calculates the next chunk of iterations for a guided scheduling parallel for loop and executes the appropriate function. Each time this function is called, the calling thread receives a chunk of iterations to work on, as specified in the Iter.GuidedLoop<T> documentation. More...
Performs a parallel for loop according to the scheduling policy provided.
For (int start, int end, Action< int > action, Schedule schedule=Schedule.Static, uint? chunk_size=null)
Creates a for loop inside a parallel region. A for loop created with For inside of a parallel region is executed in parallel, with iterations being distributed among the threads, and potentially out-of-order. A schedule is provided to inform the runtime how to distribute iterations of the loop to threads. Available schedules are specified by the Schedule enum, and have detailed documentation in the Iter class. Acts as an implicit Barrier(). More...
Creates a collapsed for loop inside a parallel region. A collapsed for loop can be used when you want to parallelize two or more nested for loops. Instead of only parallelizing across the outermost loop, the nested loops are flattened before scheduling, which has the effect of parallelizing across both loops. This multiplies the number of iterations the scheduler can work with, which can improve load balancing in irregular nested loops. More...
Creates a for loop inside a parallel region with a reduction. This is similar to For(), but the reduction allows multiple threads to reduce their work down to a single variable. Using ForReduction<T> allows the runtime to perform this operation much more efficiently than a naive approach using the Locking or Atomic classes. Each thread gets a thread-local version of the reduction variable, and the runtime performs a global reduction at the end of the loop. Since the global reduction only involves as many variables as there are threads, it is much more efficient than a naive approach. Acts as an implicit Barrier(). More...
Creates a collapsed reduction for loop inside a parallel region. A collapsed for loop can be used when you want to parallelize two or more nested for loops. Instead of only parallelizing across the outermost loop, the nested loops are flattened before scheduling, which has the effect of parallelizing across both loops. This multiplies the number of iterations the scheduler can work with, which can improve load balancing in irregular nested loops. More...
static void For (int start, int end, Action< int > action, IScheduler schedule=null, uint? chunk_size=null)
Creates a for loop inside a parallel region. A for loop created with For inside of a parallel region is executed in parallel, with iterations being distributed among the threads, and potentially out-of-order. A schedule is provided to inform the runtime how to distribute iterations of the loop to threads. Available schedules are specified by the Schedule enum, and have detailed documentation in the Iter class. Acts as an implicit Barrier(). More...
static void ForReductionCollapse< T > ((int, int) firstRange,(int, int) secondRange,(int, int) thirdRange, Operations op, ref T reduce_to, ActionRef3< T > action, IScheduler schedule=null, uint? chunk_size=null)
Creates a collapsed reduction for loop inside a parallel region. A collapsed for loop can be used when you want to parallelize two or more nested for loops. Instead of only parallelizing across the outermost loop, the nested loops are flattened before scheduling, which has the effect of parallelizing across both loops. This multiplies the number of iterations the scheduler can work with, which can improve load balancing in irregular nested loops. More...
static void ForReductionCollapse< T > ((int, int) firstRange,(int, int) secondRange,(int, int) thirdRange,(int, int) fourthRange, Operations op, ref T reduce_to, ActionRef4< T > action, IScheduler schedule=null, uint? chunk_size=null)
Creates a collapsed reduction for loop inside a parallel region. A collapsed for loop can be used when you want to parallelize two or more nested for loops. Instead of only parallelizing across the outermost loop, the nested loops are flattened before scheduling, which has the effect of parallelizing across both loops. This multiplies the number of iterations the scheduler can work with, which can improve load balancing in irregular nested loops. More...
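A sketch of the three-range overload above; the ActionRef3< T > callback is assumed here to receive the accumulator by ref followed by the three collapsed indices:

    double sum = 0.0;
    double[,,] grid = new double[16, 16, 16];

    DotMP.Parallel.ParallelRegion(() =>
    {
        // The three nested ranges are flattened into a single iteration space
        // before scheduling, and each thread reduces into its own copy of 'sum'.
        DotMP.Parallel.ForReductionCollapse((0, 16), (0, 16), (0, 16),
            DotMP.Operations.Add, ref sum,
            (ref double local_sum, int i, int j, int k) =>
            {
                local_sum += grid[i, j, k];
            });
    });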
Creates a parallel region. The body of a parallel region is executed by as many threads as specified by the num_threads parameter. If the num_threads parameter is absent, then the runtime checks if SetNumThreads has been called. If so, it will use that many threads. If not, the runtime will try to use as many threads as there are logical processors. More...
Creates a parallel for loop. Contains all of the parameters from ParallelRegion() and For(). This is simply a convenience method for creating a parallel region and a for loop inside of it. More...
Creates a parallel for loop with a reduction. Contains all of the parameters from ParallelRegion() and ForReduction<T>(). This is simply a convenience method for creating a parallel region and a for loop with a reduction inside of it. More...
Creates a parallel collapsed for loop. Contains all of the parameters from ParallelRegion() and ForCollapse(). This is simply a convenience method for creating a parallel region and a collapsed for loop. More...
Creates a parallel collapsed reduction for loop. Contains all of the parameters from ParallelRegion() and ForReductionCollapse(). This is simply a convenience method for creating a parallel region and a collapsed for loop with a reduction inside of it. More...
Creates a sections region. Sections allows for the user to submit multiple, individual tasks to be distributed among threads in parallel. In parallel, each thread active will dequeue a callback and execute it. This is useful if you have lots of individual tasks that need to be executed in parallel, and each task requires its own lambda. Acts as an implicit Barrier(). More...
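A minimal sketch, following the description above of Parallel.Section directives submitted inside a sections region; the exact shapes of the Sections and Section calls are assumptions:

    using System;

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.Sections(() =>
        {
            // Each Section submits one independent action; the active threads
            // dequeue and execute the submitted actions in parallel.
            DotMP.Parallel.Section(() => Console.WriteLine("load input"));
            DotMP.Parallel.Section(() => Console.WriteLine("warm cache"));
            DotMP.Parallel.Section(() => Console.WriteLine("initialize output"));
        });
    });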
Creates a single region. A single region is only executed once per Parallel.ParallelRegion. The first thread to encounter the single region marks the region as encountered, then executes it. More...
Creates an ordered region. An ordered region is a region of code that is executed in order inside of a For() or ForReduction<T>() loop. This also acts as an implicit Critical() region. More...
Fixes the arguments for a parallel for loop. If a Schedule is set to Static, Dynamic, or Guided, then the function simply calculates chunk size if none is given. If a Schedule is set to Runtime, then the function checks the OMP_SCHEDULE environment variable and sets the appropriate values. More...
Creates a for loop inside a parallel region. A for loop created with For inside of a parallel region is executed in parallel, with iterations being distributed among the threads, and potentially out-of-order. A schedule is provided to inform the runtime how to distribute iterations of the loop to threads. Available schedules are specified by the Schedule enum, and have detailed documentation in the Iter class. Acts as an implicit Barrier().
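As a sketch of typical usage (the array sizes and the Schedule.Guided choice are purely illustrative; the parameter names follow the For signature listed above):

    double[] a = new double[1000], b = new double[1000], c = new double[1000];

    DotMP.Parallel.ParallelRegion(() =>
    {
        // Iterations 0..999 are distributed across the threads in the region;
        // the schedule and chunk_size arguments are optional.
        DotMP.Parallel.For(0, 1000, i =>
        {
            c[i] = a[i] + b[i];
        }, schedule: DotMP.Schedule.Guided, chunk_size: 64);
    });

Because the loop acts as an implicit Barrier(), every iteration has completed before any thread continues past the For call.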
Creates a for loop inside a parallel region with a reduction. This is similar to For(), but the reduction allows multiple threads to reduce their work down to a single variable. Using ForReduction<T> allows the runtime to perform this operation much more efficiently than a naive approach using the Locking or Atomic classes. Each thread gets a thread-local version of the reduction variable, and the runtime performs a global reduction at the end of the loop. Since the global reduction only involves as many variables as there are threads, it is much more efficient than a naive approach. Acts as an implicit Barrier().
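For illustration, a sum reduction might look like the sketch below; the exact ForReduction shape (an Operations value, the ref accumulator, and a callback taking the accumulator by ref plus the loop index) is assumed by analogy with the ForReductionCollapse signatures shown earlier:

    double total = 0.0;
    double[] data = new double[1000];

    DotMP.Parallel.ParallelRegion(() =>
    {
        // Each thread accumulates into a thread-local copy of 'total'; the runtime
        // combines the per-thread copies with Operations.Add at the end of the loop.
        DotMP.Parallel.ForReduction(0, data.Length, DotMP.Operations.Add, ref total,
            (ref double local_total, int i) =>
            {
                local_total += data[i];
            });
    });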
Creates an ordered region. An ordered region is a region of code that is executed in order inside of a For() or ForReduction<T>() loop. This also acts as an implicit Critical() region.
Parameters: id - The ID of the ordered region. Must be unique per region but consistent across all threads.
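For example (assuming Ordered takes the integer id followed by the action; the literal 0 is just an example id):

    using System;

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.For(0, 10, i =>
        {
            // Loop iterations run in parallel, but the ordered region below
            // executes once per iteration in ascending iteration order.
            DotMP.Parallel.Ordered(0, () =>
            {
                Console.WriteLine(i);
            });
        });
    });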
Creates a parallel for loop. Contains all of the parameters from ParallelRegion() and For(). This is simply a convenience method for creating a parallel region and a for loop inside of it.
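A sketch of the convenience wrapper; the Math.Sqrt body is just filler:

    using System;

    double[] x = new double[100000];

    // One call creates the parallel region and the for loop together, equivalent
    // to ParallelRegion(() => For(0, x.Length, ...)) with the same arguments.
    DotMP.Parallel.ParallelFor(0, x.Length, i =>
    {
        x[i] = Math.Sqrt(i);
    });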
Creates a parallel collapsed for loop. Contains all of the parameters from ParallelRegion() and ForCollapse(). This is simply a convenience method for creating a parallel region and a collapsed for loop.
Parameters: firstRange - A tuple representing the start and end of the first for loop.
Creates a parallel collapsed for loop. Contains all of the parameters from ParallelRegion() and ForCollapse(). This is simply a convenience method for creating a parallel region and a collapsed for loop.
Parameters: ranges - A tuple representing the start and end of each of the for loops.
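A sketch of the two-range overload; the Action< int, int > callback shape is assumed from the tuple parameters described above:

    double[,] m = new double[512, 256];

    // Both loop dimensions are flattened into one iteration space, so the
    // scheduler balances 512 * 256 iterations rather than only the outer 512.
    DotMP.Parallel.ParallelForCollapse((0, 512), (0, 256), (int i, int j) =>
    {
        m[i, j] = 0.5 * i + j;
    });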
Detailed Description
Placeholder for the runtime scheduler. Is not meant to be called directly. The Parallel.FixArgs method should detect its existence and swap it out for another scheduler with implementations.
Represents the various scheduling strategies for parallel for loops. Detailed explanations of each scheduling strategy are provided alongside each getter. If no schedule is specified, the default is Schedule.Static.
The dynamic scheduling strategy. Iterations are managed in a central queue. Threads fetch chunks of iterations from this queue when they have no assigned work. If no chunk size is defined, a basic heuristic is used to determine a chunk size.
The guided scheduling strategy. Similar to dynamic, but the chunk size starts larger and shrinks as iterations are consumed. The shrinking formula is based on the remaining iterations divided by the number of threads. The chunk size parameter sets a minimum chunk size.
Pros:
Adaptable to workloads.
Cons:
Might not handle loops with early heavy load imbalance efficiently.
Runtime-defined scheduling strategy. Schedule is determined by the 'OMP_SCHEDULE' environment variable. Expected format: "schedule[,chunk_size]", e.g., "static,128", "guided", or "dynamic,3".
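For illustration, the environment variable would normally be set outside the process; the sketch below sets it in code only so the example is self-contained:

    using System;

    // "dynamic,8" selects the dynamic schedule with a chunk size of 8,
    // following the "schedule[,chunk_size]" format described above.
    Environment.SetEnvironmentVariable("OMP_SCHEDULE", "dynamic,8");

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.For(0, 1000, i =>
        {
            /* loop body */
        }, schedule: DotMP.Schedule.Runtime);
    });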
The static scheduling strategy. Iterations are divided amongst threads in round-robin fashion. Each thread gets a 'chunk' of iterations, determined by the chunk size. If no chunk size is specified, it's computed as total iterations divided by number of threads.
Pros:
Reduced overhead.
Cons:
Potential for load imbalance.
Note: This is the default strategy if none is chosen.
Booleans per-thread to check if we're currently in a Parallel.For or Parallel.ForReduction<T>. More...
Detailed Description
Contains all relevant information about a parallel for loop. Contains a collection of Thr objects, the loop's start and end iterations, the chunk size, the number of threads, and the number of threads that have completed their work.
Provides atomic operations for integral types as a wrapper around the Interlocked class. Adds support for signed and unsigned 32- and 64-bit integers. Supports addition, subtraction (for signed types), increment, decrement, bitwise And, and bitwise Or
This function supports the num_threads optional parameter, which sets the number of threads to spawn. The default value is the number of logical threads on the system.
This function supports the chunk_size optional parameter, which sets the chunk size for the scheduler to use. The default value is dependent on the scheduler and is not documented, as it may change from version to version.
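A sketch combining both optional parameters on the convenience wrapper; the values 6 and 256 are arbitrary, and named arguments are used because the exact position of num_threads in the parameter list is an assumption:

    // Request six threads for the region and a fixed chunk size for the loop;
    // omitting either argument falls back to the defaults described above.
    DotMP.Parallel.ParallelFor(0, 10000, i =>
    {
        /* loop body */
    }, schedule: DotMP.Schedule.Static, chunk_size: 256, num_threads: 6);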
The behavior of DotMP.Parallel.For is undefined if not used within a ParallelRegion.
Creates a collapsed for loop inside a parallel region. A collapsed for loop can be used when you want to parallelize two or more nested for loops.
Definition: Parallel.cs:183
If four or fewer loops are being collapsed, overloads of ForCollapse exist to easily collapse said loops. If greater than four loops are being collapsed, then the user should pass an array of tuples as the first argument, and accept an array of indices in the lambda.
This function supports all of the optional parameters of For.
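A sketch of the array-of-tuples form for more than four loops; the Action< int[] > callback shape is assumed from the note above:

    // Five nested loops collapsed through one (int, int)[] argument;
    // the callback receives all five indices in a single array.
    var ranges = new (int, int)[] { (0, 4), (0, 4), (0, 4), (0, 4), (0, 4) };

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.ForCollapse(ranges, (int[] indices) =>
        {
            /* work on indices[0] .. indices[4] */
        });
    });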
Creates a parallel collapsed for loop. Contains all of the parameters from ParallelRegion() and ForCollapse().
Definition: Parallel.cs:634
This function supports all of the optional parameters of ParallelRegion and ForCollapse, and is merely a wrapper around those two functions for conciseness.
Creates a critical region. A critical region is a region of code that can only be executed by one thread at a time.
Definition: Parallel.cs:1019
This function requires an id parameter, which is used as a unique identifier for a particular critical region. If multiple critical regions are present in the code, they should each have a unique id. The id should likely be a const int or an integer literal.
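A sketch (assuming Critical takes the integer id followed by the action; the literal 0 is just an example id):

    int shared_counter = 0;

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.For(0, 1000, i =>
        {
            // Only one thread at a time may execute the body of critical region 0,
            // so the unsynchronized increment below is safe.
            DotMP.Parallel.Critical(0, () =>
            {
                shared_counter++;
            });
        });
    });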
Enqueue a task into the task queue. Differing from OpenMP, there is no concept of parent or child tasks.
Definition: Parallel.cs:851
This function supports depends as a params parameter. depends accepts DotMP.TaskUUID objects, and marks the created task as dependent on the tasks passed through depends.
This function adds a task to the task queue and is deferred until a tasking point.
This function returns a DotMP.TaskUUID object, which can be passed to future depends clauses.
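A sketch of a two-task dependency chain; the Master region and the Taskwait tasking point used here are assumptions about the surrounding API:

    using System;

    DotMP.Parallel.ParallelRegion(() =>
    {
        DotMP.Parallel.Master(() =>
        {
            // The second task lists 'produce' in its depends clause, so it
            // cannot start until 'produce' has completed.
            DotMP.TaskUUID produce = DotMP.Parallel.Task(() => Console.WriteLine("produce"));
            DotMP.Parallel.Task(() => Console.WriteLine("consume"), produce);
        });

        // Tasking point: the threads in the region execute the queued tasks here.
        DotMP.Parallel.Taskwait();
    });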
Represents the various scheduling strategies for parallel for loops. Detailed explanations of each scheduling strategy are provided alongside each enumeration value. If no schedule is specified, the default is Schedule.Static.
Enumerator
Static
The static scheduling strategy. Iterations are divided amongst threads in round-robin fashion. Each thread gets a 'chunk' of iterations, determined by the chunk size. If no chunk size is specified, it's computed as total iterations divided by number of threads.
Pros:
Reduced overhead.
Cons:
Potential for load imbalance.
Note: This is the default strategy if none is chosen.
Dynamic
The dynamic scheduling strategy. Iterations are managed in a central queue. Threads fetch chunks of iterations from this queue when they have no assigned work. If no chunk size is defined, a basic heuristic is used to determine a chunk size.
Pros:
Better load balancing.
Cons:
Increased overhead.
Guided
The guided scheduling strategy. Similar to dynamic, but the chunk size starts larger and shrinks as iterations are consumed. The shrinking formula is based on the remaining iterations divided by the number of threads. The chunk size parameter sets a minimum chunk size.
Pros:
Adaptable to workloads.
Cons:
Might not handle loops with early heavy load imbalance efficiently.
Runtime
Runtime-defined scheduling strategy. Schedule is determined by the 'OMP_SCHEDULE' environment variable. Expected format: "schedule[,chunk_size]", e.g., "static,128", "guided", or "dynamic,3".