Reputation: 13
I'm trying to allocate and initialize a 2D array using cudaMallocPitch() and cudaMemcpy2D(). I have been able to allocate several arrays with these APIs; however, one particular array keeps causing my program to segfault.
My code is:
int size = totalPat * trainingSize * wordSize; // 65 * 672 * 15
char ** h_pattern = (char**) malloc((size_t) 40 * sizeof(char));
for(int i = 0; i < 40; i++){
h_pattern[i] = (char*) malloc((size_t) size * sizeof(char));
fill_n(h_pattern[i], size, '\0');
}
char * d_pattern;
size_t dpitch;
size_t spitch = size * sizeof(char);
cudaMallocPitch(&d_pattern, &dpitch, spitch, 40);
cudaMemcpy2D(d_pattern, dpitch, h_pattern, spitch, spitch, 40, cudaMemcpyHostToDevice);
I used cuda-gdb to debug my program and locate the problem, and it keeps segfaulting in cudaMemcpy2D(). The backtrace gives the following output:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff501dd00 in cudbgGetAPIVersion () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
(cuda-gdb) backtrace
#0 0x00007ffff501dd00 in cudbgGetAPIVersion () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1 0x00007ffff4efc68e in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007ffff4f0cc7f in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007ffff4efd7f1 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007ffff4e6b322 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007ffff4e74b38 in cuMemGetAttribute_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007ffff4e4d92a in cuMemcpy2DUnaligned_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x000000000045bc5d in cudart::driverHelper::memcpy2DPtr(char*, unsigned long, char const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind, CUstream_st*, bool, bool) ()
#8 0x0000000000435039 in cudart::cudaApiMemcpy2DCommon(void*, unsigned long, void const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind, bool) ()
#9 0x00000000004350f8 in cudart::cudaApiMemcpy2D(void*, unsigned long, void const*, unsigned long, unsigned long, unsigned long, cudaMemcpyKind) ()
#10 0x0000000000462073 in cudaMemcpy2D ()
On the devtalk forum there was a question about pitch limits, where cudaMemcpy2D() failed with a pitch greater than 2^18; however, that question is from 2007, and I would assume this limit no longer exists. The documentation also mentions that cudaMemcpy2D() returns an error if dpitch or spitch exceeds the maximum allowed, but it doesn't say what that maximum is.
Any help is greatly appreciated.
Upvotes: 1
Views: 663
Reputation: 9771
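Your code asks cudaMemcpy2D() to read 40 * size bytes of char data from h_pattern, but h_pattern points to a 40-byte host allocation that holds char* row pointers, not the pattern data itself. The source passed to cudaMemcpy2D() has to be one contiguous block in which row i starts at offset i * spitch.
Instead, you need to malloc a linear memory space for all 40 patterns on the host, like: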
char* h_pattern;
h_pattern = (char*) malloc((size_t) 40* size * sizeof(char));
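To illustrate why the contiguous layout matters, here is a minimal host-only C++ sketch. It does not call the CUDA runtime: `copy2D` is a hypothetical stand-in that mimics the row-by-row addressing of cudaMemcpy2D() (row i is read at src + i * spitch and written at dst + i * dpitch), `paddedPitch` and `demoContiguousCopy` are illustrative helpers, and the 512-byte alignment is an assumed value, since the real pitch is whatever cudaMallocPitch() returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for cudaMemcpy2D(): copies `height` rows of
// `width` bytes, reading row i at src + i * spitch and writing it at dst + i * dpitch.
void copy2D(char* dst, std::size_t dpitch, const char* src, std::size_t spitch,
            std::size_t width, std::size_t height) {
    assert(width <= spitch && width <= dpitch);
    for (std::size_t row = 0; row < height; ++row) {
        std::memcpy(dst + row * dpitch, src + row * spitch, width);
    }
}

// Rounds `width` up to a multiple of `align`. This only imitates the padding
// that cudaMallocPitch() applies; the alignment value is an assumption.
std::size_t paddedPitch(std::size_t width, std::size_t align) {
    return (width + align - 1) / align * align;
}

// Builds one contiguous host buffer for all 40 patterns, copies it into a
// padded buffer, and checks that a marked byte lands in the expected row.
bool demoContiguousCopy() {
    const std::size_t rows = 40;
    const std::size_t size = 65 * 672 * 15;  // bytes per pattern, from the question

    // One linear block: row i starts at offset i * size.
    std::vector<char> h_pattern(rows * size, '\0');
    h_pattern[3 * size + 7] = 'x';  // mark byte 7 of row 3

    const std::size_t spitch = size;
    const std::size_t dpitch = paddedPitch(size, 512);
    std::vector<char> padded(rows * dpitch, '\0');  // plays the role of d_pattern

    copy2D(padded.data(), dpitch, h_pattern.data(), spitch, spitch, rows);

    return padded[3 * dpitch + 7] == 'x';
}
```

With the real API, the same contiguous h_pattern pointer is what you pass as the source argument of cudaMemcpy2D(), with spitch = size as the source pitch and width, and 40 as the height.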
Upvotes: 1