C++ AMP crashing on hardware (GeForce GTX 660)

Question

I’m having a problem writing some C++ AMP code. I have included a sample. It runs fine on emulated accelerators but crashes the display driver on my hardware (Windows 7, NVIDIA GeForce GTX 660, latest drivers), and I can see nothing wrong with my code.

Is there a problem with my code, or is this a hardware/driver/compiler issue?

#include "stdafx.h"

#include <amp.h>
#include <iostream>
#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
    // Prints "NVIDIA GeForce GTX 660"
    concurrency::accelerator_view target_view = concurrency::accelerator().create_view();
    std::wcout << target_view.accelerator.description << std::endl;

    // lower numbers do not cause the issue
    const int x = 2000;
    const int y = 30000;

    // 1d array for storing result
    std::vector<unsigned int> resultVector(y);
    Concurrency::array_view<unsigned int, 1> resultsArrayView(resultVector.size(), resultVector);

    // 2d array for data for processing
    std::vector<unsigned int> dataVector(x * y);
    concurrency::array_view<unsigned int, 2> dataArrayView(y, x, dataVector);
    parallel_for_each(
        // Define the compute domain, which is the set of threads that are created.
        resultsArrayView.extent,
        // Define the code to run on each thread on the accelerator.
        [=](concurrency::index<1> idx) restrict(amp)
    {
        concurrency::array_view<unsigned int, 1> buffer = dataArrayView[idx[0]];
        unsigned int bufferSize = buffer.get_extent().size();

        // needs both loops to cause crash
        for (unsigned int outer = 0; outer < bufferSize; outer++)
        {
            for (unsigned int i = 0; i < bufferSize; i++)
            {
                // works without this line, also if I change to buffer[0] it works?
                dataArrayView[idx[0]][0] = 0;
            }
        }
        // works without this line
        resultsArrayView[0] = 0;
    });

    std::cout << "crash on next line" << std::endl;
    resultsArrayView.synchronize();
    std::cout << "will never reach me" << std::endl; 

    system("PAUSE");
    return 0;
}

Szymon Wybranski · Accepted Answer

It is very likely that your computation exceeds the permitted quantum time (2 seconds by default). After that time the operating system steps in and forcefully restarts the GPU; this is called Timeout Detection and Recovery (TDR). The software adapter (reference device) does not have TDR enabled, which is why the computation can exceed the permitted quantum time there.

Does your computation really require 30000 threads (variable y), each performing 2000 * 2000 (x * x) loop iterations? You can chunk your computation so that each chunk takes less than 2 seconds to compute. You can also consider disabling TDR or extending the permitted quantum time to fit your needs.
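To make the chunking idea concrete, here is a minimal sketch in portable C++ of how the rows could be split into batches that are launched one at a time. It only plans the work and does not call C++ AMP itself. The names `RowChunk`, `planRowChunks`, `iterationsPerLaunch` and the `rowsPerChunk` parameter are illustrative assumptions, not part of the original code or of any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Half-open range of rows [begin, end) handled by a single kernel launch.
// (Hypothetical helper type, introduced only for this sketch.)
struct RowChunk {
    std::size_t begin;
    std::size_t end;
};

// Split `rowCount` rows into consecutive chunks of at most `rowsPerChunk` rows.
// `rowsPerChunk` is an assumed tuning knob: choose it small enough that one
// launch finishes comfortably within the TDR limit on the target GPU.
std::vector<RowChunk> planRowChunks(std::size_t rowCount, std::size_t rowsPerChunk)
{
    std::vector<RowChunk> chunks;
    if (rowsPerChunk == 0) {
        return chunks;  // no sensible plan for a zero-sized chunk
    }
    for (std::size_t begin = 0; begin < rowCount; begin += rowsPerChunk) {
        chunks.push_back({begin, std::min(begin + rowsPerChunk, rowCount)});
    }
    return chunks;
}

// Inner-loop iterations performed by one launch of the kernel from the
// question: each row (thread) runs two nested loops of `rowLength` steps.
unsigned long long iterationsPerLaunch(const RowChunk& chunk, std::size_t rowLength)
{
    return static_cast<unsigned long long>(chunk.end - chunk.begin) * rowLength * rowLength;
}
```

With the sizes from the question (30000 rows of length 2000) and an assumed `rowsPerChunk` of 1000, this yields 30 launches of 4,000,000,000 inner iterations each, instead of a single launch of 120,000,000,000. Whether a given chunk size stays under the limit depends on the hardware, so it would need to be measured. In the C++ AMP code, each chunk would presumably become its own parallel_for_each over an extent of `end - begin` threads, with `begin` added to idx[0] inside the kernel, and the host would likely need to wait for each launch to finish (for example with accelerator_view::wait()) before issuing the next one.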

I highly recommend reading the blog post on how to handle TDRs in C++ AMP, which explains TDR in detail: http://blogs.msdn.com/b/nativeconcurrency/archive/2012/03/07/handling-tdrs-in-c-amp.aspx

Additionally, here is a separate blog post on how to disable TDR on Windows 8: http://blogs.msdn.com/b/nativeconcurrency/archive/2012/03/06/disabling-tdr-on-windows-8-for-your-c-amp-algorithms.aspx
