Accelerating MATLAB with GPU Computing: A Primer with Examples

Jung W. Suh, Youngmin Kim

Language: English

Pages: 258

ISBN: 0124080804

Format: PDF / Kindle (mobi) / ePub

Beyond simulation and algorithm development, many developers increasingly use MATLAB for product deployment in computationally heavy fields. This often demands that MATLAB codes run faster by leveraging the parallelism of Graphics Processing Units (GPUs). While MATLAB provides high-level functions as a simulation tool for rapid prototyping, the low-level details and knowledge needed to use GPUs make many MATLAB users hesitant to try them. Accelerating MATLAB with GPUs offers a primer on bridging this gap.

Starting with the basics (setting up MATLAB for CUDA on Windows, Linux and Mac OS X, and profiling), the book then guides users through advanced topics such as CUDA libraries. The authors share their experience developing algorithms using MATLAB, C++ and GPUs for huge datasets, modifying MATLAB codes to better utilize the computational power of GPUs, and integrating them into commercial software products. Throughout the book, they present many example codes that can be used as templates of C-MEX and CUDA codes for readers’ projects. Download example codes from the publisher's website:

  • Shows how to accelerate MATLAB codes through the GPU for parallel processing, with minimal hardware knowledge
  • Explains the related background on hardware, architecture and programming for ease of use
  • Provides simple worked examples of MATLAB and CUDA C codes as well as templates that can be reused in real-world projects












locate where possible bottlenecks are and explains how kernels were invoked in great detail.
[Figure 3.16: Select the “Profile CUDA Application” as an activity type.]
In this section, we show how this wonderful tool can be used with MATLAB and CUDA. The NVIDIA Visual Profiler can be found where CUDA is installed (Figure 3.24). For Windows, it can usually be found at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v5.0\libnvvp. For Mac OS X, see

MATLAB.exe and click on Attach. Visual Studio then shows an empty window with Solution1 (Running) at its top, as in Figure 3.37. Open the C-MEX source file conv2d3x3.cpp through File... under Open on the File menu in Visual Studio (Figure 3.38). Next, set a breakpoint on any line you want by clicking the right mouse button (Figure 3.39). You will then see an inactive breakpoint and a warning message, which you can ignore (Figure 3.40). Once you correctly set the

single");
    float* pReal = (float*)mxGetPr(prhs[0]);
    float* pImag = (float*)mxGetPi(prhs[0]);
    mexPrintf("%p: %f\n", pReal, *pReal);
    mexPrintf("%p: %f\n", pImag, *pImag);
}
Compile this test C-MEX code and run it with a sample number.
    >> mex TestComplex1.cpp
    >> TestComplex1(single(4 + 5i))
    000000006F2FBB80: 4.000000
    000000006F2FC940: 5.000000
(You can test this with our example TestComplex_first.m.) The real part of our sample number is stored at 0x6F2FBB80, while the imaginary part of the number is
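The two addresses printed in the excerpt above show that the real and imaginary parts live in separate arrays. As a rough illustration of what that means for code that needs interleaved complex data (the layout used by cuFFT's cufftComplex type), here is a minimal sketch in plain C++ with no MATLAB headers. The SplitComplex struct and the toInterleaved function are hypothetical names introduced only for this example; they are not part of the MEX API or of the book's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for split complex storage: the real parts and the
// imaginary parts are kept in two separate arrays of equal length.
struct SplitComplex {
    std::vector<float> re;
    std::vector<float> im;
};

// Pack split storage into one interleaved array: [re0, im0, re1, im1, ...].
std::vector<float> toInterleaved(const SplitComplex& s) {
    assert(s.re.size() == s.im.size());  // both parts must describe the same elements
    std::vector<float> out(2 * s.re.size());
    for (std::size_t i = 0; i < s.re.size(); ++i) {
        out[2 * i] = s.re[i];      // real part of element i
        out[2 * i + 1] = s.im[i];  // imaginary part of element i
    }
    return out;
}
```

For the sample number 4 + 5i, a SplitComplex holding re = {4} and im = {5} packs to the two floats 4 and 5 stored side by side. In a real MEX file the two input arrays would come from the pointers returned by mxGetPr and mxGetPi rather than from std::vector.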

actual CUDA calls made with a timeline. We see that, for our matrix-to-matrix calculation, the kernel, gen_sgemmNN_val(...), was called (Figure 6.4).
[Figure 6.1: Running cublasDemo in MATLAB.]
[Figure 6.2: Running cublasDemo with the Visual Profiler.]
Let us look at the Details tab. In this tab, you see the actual time taken by each CUDA call. In particular, you can get more information about the kernel calls in terms of grid
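When checking the result of a GPU matrix-to-matrix calculation such as the one profiled above, a slow but simple CPU reference can be handy. The sketch below assumes the usual SGEMM definition, C = alpha * A * B + beta * C with no transposes, and column-major storage, which is the layout shared by MATLAB and cuBLAS. The function name sgemmNNReference is hypothetical, and this is not the book's cublasDemo source, which is not shown in this excerpt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for single-precision GEMM without transposes:
//   C = alpha * A * B + beta * C
// A is m x k, B is k x n, C is m x n, all stored column-major, so element
// (row, col) of an m-row matrix sits at index row + col * m.
void sgemmNNReference(int m, int n, int k, float alpha,
                      const std::vector<float>& A,
                      const std::vector<float>& B,
                      float beta, std::vector<float>& C) {
    assert(A.size() == static_cast<std::size_t>(m) * k);
    assert(B.size() == static_cast<std::size_t>(k) * n);
    assert(C.size() == static_cast<std::size_t>(m) * n);
    for (int col = 0; col < n; ++col) {
        for (int row = 0; row < m; ++row) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                acc += A[row + p * m] * B[p + col * k];  // A(row, p) * B(p, col)
            }
            C[row + col * m] = alpha * acc + beta * C[row + col * m];
        }
    }
}
```

Comparing this reference against the GPU output within a small tolerance is one way to confirm that matrix dimensions and leading dimensions were passed in the intended order.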

code as cublasDemo.mexw64, if compiled on Windows 64 bit, for instance. In the MATLAB command window, we call our function to do the multiplication:
    >> A = single(rand(4, 4));
    >> B = fft2(A)
    B = 0.5174 - 0.9100i 9.7001 -0.6788 + 0.4372i 1.3043 1.6219 0.5174 + 0.9100i 0.7316 + 0.5521i -0.1909 - 1.0807i -1.3437 + 0.2664i -0.9332 + 0.6233i -2.0830 -0.6788 - 0.4372i -1.3437 - 0.2664i -0.1909 + 1.0807i -0.9332 - 0.6233i 0.7316 - 0.5521i
    >> C = cufftDemo(A)
    C = 9.7001 - 0.0000i 0.5174 - 0.9100i
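For real input, a 2D discrete Fourier transform has a real zero-frequency term (the sum of all entries), and its remaining entries occur in complex-conjugate pairs. That is the pattern to look for when comparing fft2 output with a GPU result. As a sanity check on small inputs, a naive reference transform can be written directly from the definition. The sketch below is plain C++ using the same negative-exponent sign convention as fft2 and column-major storage; the name dft2Reference is hypothetical and the code is far too slow for anything but testing.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical reference: naive 2D DFT of a real rows x cols matrix stored
// column-major, so element (r, c) sits at index r + c * rows.
// X(u, v) = sum over r, c of x(r, c) * exp(-2*pi*i * (u*r/rows + v*c/cols)).
std::vector<std::complex<double>> dft2Reference(const std::vector<double>& x,
                                                int rows, int cols) {
    assert(x.size() == static_cast<std::size_t>(rows) * cols);
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> X(x.size());
    for (int v = 0; v < cols; ++v) {
        for (int u = 0; u < rows; ++u) {
            std::complex<double> acc(0.0, 0.0);
            for (int c = 0; c < cols; ++c) {
                for (int r = 0; r < rows; ++r) {
                    const double angle =
                        -2.0 * pi * (static_cast<double>(u) * r / rows +
                                     static_cast<double>(v) * c / cols);
                    acc += x[r + c * rows] *
                           std::complex<double>(std::cos(angle), std::sin(angle));
                }
            }
            X[u + v * rows] = acc;  // output uses the same column-major layout
        }
    }
    return X;
}
```

Because the cost grows with the square of the number of elements, this is only suitable for small matrices such as the 4 x 4 example; the results should agree with a library FFT up to floating-point rounding.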
