Reputation: 89
I am trying to use OpenACC on Windows, compiling with GCC 8.1.0.
I found some sample code online that uses OpenACC.
I compiled it from the command prompt as follows:
"C:\Users\chang>g++ -fopenacc -o C:\Users\chang\source\repos\Project18\Project18\testing.exe C:\Users\chang\source\repos\Project18\Project18\Source1.cpp"
When I watch the Performance tab in Task Manager while the program is running, I don't see any change in GPU usage.
I also tried compiling without -fopenacc:
"C:\Users\chang>g++ -o C:\Users\chang\source\repos\Project18\Project18\testing.exe C:\Users\chang\source\repos\Project18\Project18\Source1.cpp"
There is no difference in speed between the builds with and without -fopenacc.
So I was wondering whether there is a prerequisite I need to set up before I can use OpenACC.
Below is the sample code I found.
Thanks in advance.
P.S. As far as I remember, I never downloaded openacc.h; I tried to find it online but couldn't. Could this be a problem? Since the program compiled and the exe runs, I assume not, but I'm asking just in case.
/*
* Copyright 2012 NVIDIA Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include <iostream>
#include <math.h>
#include <string.h>
#include <stdio.h>
#include <openacc.h>
#include <chrono>
#define NN 4096
#define NM 4096
using namespace std;
using namespace chrono;
double A[NN][NM];
double Anew[NN][NM];
int main(int argc, char** argv)
{
    const int n = NN;
    const int m = NM;
    const int iter_max = 1000;
    const double tol = 1.0e-6;
    double error = 1.0;
    memset(A, 0, n * m * sizeof(double));
    memset(Anew, 0, n * m * sizeof(double));
    for (int j = 0; j < n; j++)
    {
        A[j][0] = 1.0;
        Anew[j][0] = 1.0;
    }
    printf("Jacobi relaxation Calculation: %d x %d mesh\n", n, m);
    system_clock::time_point start = system_clock::now();
    int iter = 0;
#pragma acc data copy(A), create(Anew)
    while (error > tol && iter < iter_max)
    {
        error = 0.0;
#pragma acc kernels
        for (int j = 1; j < n - 1; j++)
        {
            for (int i = 1; i < m - 1; i++)
            {
                Anew[j][i] = 0.25 * (A[j][i + 1] + A[j][i - 1]
                    + A[j - 1][i] + A[j + 1][i]);
                error = fmax(error, fabs(Anew[j][i] - A[j][i]));
            }
        }
#pragma acc kernels
        for (int j = 1; j < n - 1; j++)
        {
            for (int i = 1; i < m - 1; i++)
            {
                A[j][i] = Anew[j][i];
            }
        }
        if (iter % 100 == 0) printf("%5d, %0.6f\n", iter, error);
        iter++;
    }
    system_clock::time_point end = system_clock::now();
    std::chrono::duration<float> sec = end - start;
    cout << sec.count() << endl;
}
Upvotes: 0
Views: 426
Reputation: 356
At this time, GCC doesn't support GPU code offloading on Windows. See https://stackoverflow.com/a/59376314/664214, or http://mid.mail-archive.com/[email protected], for example. It's certainly possible to implement, but somebody needs to do it, or pay for the work.
Upvotes: 1