leo6liu/CUDA-VectorAdd

CUDA VectorAdd

An example program that detects the number of CUDA-capable devices, generates two float vectors of length 2^28 (the default) with values in [0, 1], adds them across all available devices, and checks the result against a CPU solution by computing the total error.
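The CPU side of that check can be sketched in plain C. This is a minimal illustration, not the repository's code: the function names `fill_random`, `add_cpu`, and `total_error` are hypothetical, and "total error" is assumed here to mean the sum of absolute element-wise differences between the GPU result and the CPU result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill v with pseudo-random floats in [0, 1] (hypothetical helper). */
static void fill_random(float *v, size_t n) {
    assert(v != NULL);
    for (size_t i = 0; i < n; i++)
        v[i] = (float)rand() / (float)RAND_MAX;
}

/* CPU reference solution: c[i] = a[i] + b[i]. */
static void add_cpu(const float *a, const float *b, float *c, size_t n) {
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

/* Total error between a result and the reference.
 * Assumption: sum of absolute differences, accumulated in double
 * so that rounding in the accumulator does not dominate for large n. */
static double total_error(const float *got, const float *want, size_t n) {
    assert(got != NULL && want != NULL);
    double err = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)got[i] - (double)want[i];
        err += (d < 0.0) ? -d : d;
    }
    return err;
}
```

In a full program, `got` would be the vector copied back from the GPUs and `want` the output of `add_cpu`. A total error of zero is plausible here because a single float addition should round the same way on both sides.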

 $ ./add --help
 CUDA VectorAdd -- a program which scales across multiple GPUs to perform vector
 addition

   -b, --blocks-per-gpu=LEN   Specify the number of blocks per GPU (defaults to
                              either the the number of elements each GPU needs
                               to process divided by the threads per block or the
                               GPU's maxGridSize[0])
   -N, --vector-length=LEN    Specify the vector lengths (defaults to 2^28)
   -t, --threads-per-block=LEN   Specify the number of threads per GPU block
                                     (defaults to maxThreadsPerBlock of GPU)
   -v, --verbose              Explains what is being done
   -?, --help                 Give this help list
       --usage                Give a short usage message
   -V, --version              Print program version
 Mandatory or optional arguments to long options are also mandatory or optional
 for any corresponding short options.
 
 Report bugs to <[email protected]>.
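The default for `--blocks-per-gpu` described in the help text can be sketched as host-side arithmetic. This is a rough sketch under stated assumptions, not the program's actual implementation: the helper names `elems_for_gpu` and `default_blocks` are hypothetical, the vector is assumed to be split as evenly as possible across devices, and "either ... or" is read as taking the smaller of the two values.

```c
#include <assert.h>
#include <stddef.h>

/* Elements assigned to device `dev` when n elements are split across
 * n_gpus devices. Assumption: an even split, with the first
 * (n % n_gpus) devices taking one extra element. */
static size_t elems_for_gpu(size_t n, int n_gpus, int dev) {
    assert(n_gpus > 0 && dev >= 0 && dev < n_gpus);
    size_t base = n / (size_t)n_gpus;
    size_t extra = n % (size_t)n_gpus;
    return base + ((size_t)dev < extra ? 1 : 0);
}

/* Default block count for one device: enough blocks to give every element
 * its own thread (rounded up), capped at the device's maxGridSize[0].
 * Assumption: the cap is applied as a minimum of the two values. */
static size_t default_blocks(size_t elems, size_t threads_per_block,
                             size_t max_grid_x) {
    assert(threads_per_block > 0 && max_grid_x > 0);
    size_t blocks = (elems + threads_per_block - 1) / threads_per_block;
    return blocks < max_grid_x ? blocks : max_grid_x;
}
```

With illustrative values of two GPUs and 1024 threads per block, a vector of 2^28 elements gives each device 2^27 elements and 2^17 blocks. If the cap applies, there are fewer threads than elements, so a kernel would presumably need to have each thread process more than one element, for instance with a grid-stride loop.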
