user3237120
user3237120

Reputation: 3

Unenhanced performance of matlab GPU computing

With the intention of comparing the speed of GPU vs CPU computing, I ran the example codes available here (a Mandelbrot set on the GPU) from MATLAB central. Below are the results that I obtained:

As can be seen, there is no improvement in case 2 and only slight improvement of around 4 times in case 3. As the example did not state the details of GPU they used, may I know if this is simply due to the "incompetency" of my graphic card or am I missing something important?

The graphic card is also responsible for driving my display (HP Z Display Z23i 23-inch IPS LED Backlit Monitor).

CPU: Intel i7-4790, 3.6 GHz (8 cores)

GPU:

                  Name: 'NVS 510'
                 Index: 1
     ComputeCapability: '3.0'
        SupportsDouble: 1
         DriverVersion: 6
        ToolkitVersion: 5
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 2.1475e+09
            FreeMemory: 1.6934e+09
   MultiprocessorCount: 1
          ClockRateKHz: 797000
           ComputeMode: 'Default'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

Thank you!

Edit

The GPU used in the example here is Tesla C2050. (Credits to @Sam Roberts)

Upvotes: 0

Views: 722

Answers (1)

rayryeng
rayryeng

Reputation: 104464

The times on that link are most likely for a different GPU in comparison to yours. They don't specify what kind of graphics card they're using, but my guess is that they're using a more higher end card.

By Googling NVS 510, the specs are similar to the card that I have for my machine. However, your card is geared towards business while mine is geared towards gaming. I have a GTX 660 which is one of the higher end GPUs that are available on the market.

These are the attributes of my graphics card:

CUDADevice with properties:

                  Name: 'GeForce GTX 660'
                 Index: 1
     ComputeCapability: '3.0'
        SupportsDouble: 1
         DriverVersion: 6.5000
        ToolkitVersion: 5.5000
    MaxThreadsPerBlock: 1024
      MaxShmemPerBlock: 49152
    MaxThreadBlockSize: [1024 1024 64]
           MaxGridSize: [2.1475e+09 65535 65535]
             SIMDWidth: 32
           TotalMemory: 2.1475e+09
            FreeMemory: 1.5357e+09
   MultiprocessorCount: 5
          ClockRateKHz: 1084500
           ComputeMode: 'Default'
  GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
      CanMapHostMemory: 1
       DeviceSupported: 1
        DeviceSelected: 1

The differences between my card and yours are that I have 5 multiprocessors, and my clock rate is about 300 MHz faster than yours. For a side-by-side comparison, check out my card in comparison to yours:

Upon further inspection, I have a much higher memory bandwidth than your card. I also have 960 GPU cores in comparison to your 192.

I decided to run these tests to compare my performance with your timings. My CPU is an i7-4770 3.6 GHz Intel and I have 16 GB of RAM on my machine.

The times that I get by running those examples are the following:

  • Case #1 - Without GPU: 6.46 seconds
  • Case #2 - Naive GPU: 0.82 seconds - 7.9x faster
  • Case #3 - Through CUDA: 0.09 seconds - 71.7x faster

With this, my guess is that your graphics card may be of a lower quality in comparison to those tests that MathWorks performed. Maybe try updating your graphics drivers and see if that helps. However, my guess is that my performance is much better due to the multiprocessor count, faster clock, a higher amount of cores and higher memory bandwidth.

Upvotes: 2

Related Questions