Multiplication of 2 matrix using shared memory and CUDA in order to study times and speedup.




CUDASharedMemoryMatrixMul

Matrix structure used:
We use .bin files containing raw numbers; the first two values are the number of rows and the number of columns, respectively, followed by the matrix data.

To make creating these matrices as easy as possible, a .cpp file is included. It generates two matrices: the first filled with random numbers at a given size, and the second an identity matrix, also of a given size.

Kernel:
This kernel was used to study computation times for different matrix sizes.
The multiplication is performed as shown in the following image:
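A shared-memory kernel of this kind is usually written with the classic tiled scheme: each thread block stages square tiles of both inputs in shared memory, so each global-memory element is read once per tile instead of once per output element. A sketch, assuming square row-major matrices and a tile width of 16 (the names and tile size are illustrative, not the repository's actual code):

```cuda
#define TILE 16

// C = A * B, all square n x n, row-major.
__global__ void matMulShared(const float* A, const float* B, float* C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    // Walk across the tiles covering the shared dimension.
    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        // Each thread copies one element of the A tile and one of the
        // B tile from global memory into fast on-chip shared memory,
        // padding with zeros past the matrix edge.
        As[threadIdx.y][threadIdx.x] =
            (row < n && t * TILE + threadIdx.x < n)
                ? A[row * n + t * TILE + threadIdx.x] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] =
            (t * TILE + threadIdx.y < n && col < n)
                ? B[(t * TILE + threadIdx.y) * n + col] : 0.0f;
        __syncthreads();  // wait until the whole tile is loaded

        // Accumulate the partial dot product for this tile.
        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();  // wait before the tiles are overwritten
    }

    if (row < n && col < n)
        C[row * n + col] = acc;
}
```

The two `__syncthreads()` barriers are what make the shared-memory staging safe: the first ensures the tile is fully loaded before any thread reads it, the second ensures all threads are done reading before the next iteration overwrites it.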

Multiplying two matrices of size 10000x10000, we obtained the following results:

Sequential multiplication with a simple FOR loop:
- Unable to compute

8-thread static-division multiplication:
- 1595.099 sec

CUDA with shared memory:
- 18.914 sec

With CUDA we obtained a speedup of 84.334302 (1595.099 / 18.914) compared with the static-division version.

The results were obtained with an Intel Xeon CPU and an NVIDIA GTX 560 GPU.
