user5739619

Reputation: 1838

How to use std::vectors within a loop in OpenACC?

I have the following code:

#include <cstdio>
#include <iostream>
#include <vector>

int main(int argc, char** argv )
{
    std::vector< std::vector< std::vector<double> > > vec { 
        {{1,2},{3,4}, {5,6},{7,8}}, 
        {{9,10}, {11,12}}, 
        {{13,14}, {15,16}, {17,18}} };

    #pragma acc parallel loop
    for (int k = 0; k <3; k++) {

        std::vector<std::vector<double>>& vec2d = vec[k];
        int L = vec2d.size();

        //std::vector<int>dVec{67,51,1,0,50};
        std::vector<double>dVec(L, 0.0);

        for (int i = 0; i < L; i++)
        {
            dVec[i] = vec2d[i][1] - vec2d[i][0];
        }

        for (int j=0; j<2; j++) {
            printf("k: %d j: %d vec0: %f, vec1: %f\n", k, j, vec2d[j][0], vec2d[j][1]);
        }
    }
    std::cout<<"finished\n";

    return 0;
}

I compile and run it with pgc++ -fast -ta=tesla:cuda9.2,managed -o runEx runEx.cpp -std=c++17 && ./runEx

If I comment out the #pragma acc parallel loop, it works. But if I leave it in, I get the error:

PGCC-S-0155-Procedures called in a compute region must have acc routine information: operator delete (void *) (runEx.cpp: 425)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (runEx.cpp: 6)
PGCC/x86-64 Linux 19.10-0: compilation completed with severe errors

Also, if I comment out the declaration of dVec and the for loop that fills it, the code works even with the #pragma acc parallel loop.

However, if I change the loop so it becomes just:

#pragma acc parallel loop
for (int k = 0; k <3; k++) {
    std::vector<int>dVec{67,51,1,0,50};
}

then I get the same error.

Why is this?

Upvotes: 0

Views: 319

Answers (2)

jefflarkin

Reputation: 1279

The problem here is that std::vector has some member functions that are not available on the device. The compiler error specifically calls out operator delete, but I suspect that the size function will also be problematic, as well as the constructor. Since you don't control the source of std::vector, you can't add an acc routine directive to its member functions. The workaround I've used in the past is to use vector strictly outside of OpenACC regions and pass a raw pointer to its data into the regions. It's a hassle for sure, especially for large codes, but it works. Otherwise, you might try implementing your own minimal vector class whose member functions are decorated with acc routine. I've seen this done successfully as well.
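As a rough illustration of the raw-pointer workaround, here is a minimal sketch rather than a drop-in fix. The helper name row_differences is hypothetical, and it assumes every innermost row holds at least two values, as in the question. All std::vector work (construction, push_back, size) stays on the host, and only plain pointers are touched inside the compute region. The copyin/copyout clauses are my own choice; with -ta=tesla:managed they may be unnecessary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: for every row {a, b, ...} of a jagged 3D vector,
// compute b - a. std::vector is only used on the host.
std::vector<double> row_differences(
    const std::vector<std::vector<std::vector<double>>>& vec)
{
    // Flatten the first two entries of each row into one contiguous buffer.
    std::vector<double> flat;
    for (const auto& block : vec) {
        for (const auto& row : block) {
            assert(row.size() >= 2);  // assumption carried over from the question
            flat.push_back(row[0]);
            flat.push_back(row[1]);
        }
    }

    const int nrows = static_cast<int>(flat.size() / 2);
    std::vector<double> diff(nrows, 0.0);

    // Raw pointers are what the compute region actually sees.
    const double* in = flat.data();
    double* out = diff.data();

    // No std::vector member function is called inside this loop.
    #pragma acc parallel loop copyin(in[0:2*nrows]) copyout(out[0:nrows])
    for (int r = 0; r < nrows; ++r) {
        out[r] = in[2 * r + 1] - in[2 * r];
    }

    return diff;
}
```

Calling row_differences(vec) with the vec from the question returns one difference per innermost row, in the order the rows appear. A compiler without OpenACC support simply ignores the pragma and runs the loop on the host.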

For a very large L, you might be able to get by putting your acc parallel loop solely on your i loop, but you'll be copying data back and forth a lot unless you hoist your arrays outside of the k loop to enable reuse.
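A sketch of that second suggestion could look like the following. The names block_differences, plo, phi and pdiff are hypothetical, and each row is again assumed to hold at least two values. The scratch buffers are sized for the largest block and allocated once, outside the k loop, so the device allocation is reused. Only the i loop is offloaded. The data, update and present directives are my assumption about how to express the reuse; with managed memory they may be redundant.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Vec3d = std::vector<std::vector<std::vector<double>>>;

// Hypothetical helper: returns, per block k, the vector of row[1] - row[0].
std::vector<std::vector<double>> block_differences(const Vec3d& vec)
{
    // Size the scratch buffers for the largest block so they can be reused.
    std::size_t maxRows = 0;
    for (const auto& block : vec) maxRows = std::max(maxRows, block.size());

    // Hoisted out of the k loop: allocated once on the host.
    std::vector<double> lo(maxRows), hi(maxRows), diff(maxRows);
    double* plo = lo.data();
    double* phi = hi.data();
    double* pdiff = diff.data();
    const int cap = static_cast<int>(maxRows);

    std::vector<std::vector<double>> result;
    result.reserve(vec.size());

    // Device copies of the buffers live for the whole k loop.
    #pragma acc data create(plo[0:cap], phi[0:cap], pdiff[0:cap])
    {
        for (const auto& block : vec) {
            const int L = static_cast<int>(block.size());

            // Pack this block's inputs on the host, where std::vector is fine.
            for (int i = 0; i < L; ++i) {
                plo[i] = block[i][0];
                phi[i] = block[i][1];
            }

            #pragma acc update device(plo[0:L], phi[0:L])

            // Only the i loop is offloaded; it uses raw pointers only.
            #pragma acc parallel loop present(plo[0:cap], phi[0:cap], pdiff[0:cap])
            for (int i = 0; i < L; ++i) {
                pdiff[i] = phi[i] - plo[i];
            }

            #pragma acc update self(pdiff[0:L])
            result.emplace_back(pdiff, pdiff + L);
        }
    }
    return result;
}
```

This still transfers each block's inputs and outputs once per k iteration, so it is only likely to pay off when L is large.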

Upvotes: 2

Yunfei Chen

Reputation: 626

From the site: https://docs.computecanada.ca/wiki/OpenACC_Tutorial_-_Optimizing_loops

I can see two things:

  1. Your statement:
#pragma acc parallel loop

is missing clauses.

  2. The site uses the directive as: #pragma acc parallel loop present(row_offsets,cols,Acoefs,xcoefs,ycoefs), which suggests that the directive with no clauses is probably illegal.

Upvotes: -1
