user2467731

Reputation: 191

Error compiling a CUDA project

I'm having trouble compiling a CUDA project that combines C, CUDA, and the lodepng library.

My makefile looks like this:

gpu:    super-resolution.cu
    gcc -g -O -c lodepng.c
    nvcc -c super-resolution.cu
    nvcc -o super-resolution-cuda super-resolution.o 
    rm -rf super-resolution.o
    rm -rf lodepng.o

Could anyone tell me what I am doing wrong? The build fails with:

nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.o: In function `main':
parallel-algorithm/super-resolution.cu:238: undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'
parallel-algorithm/super-resolution.cu:259: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:269: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:282: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:292: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
parallel-algorithm/super-resolution.cu:301: undefined reference to `lodepng_encode32_file(char const*, unsigned char const*, unsigned int, unsigned int)'
...

I just need a way to compile my .cu file with nvcc and link a C object file (.o) into the resulting executable.

EDIT: I tried the suggestion, with no success:

gcc -g -O -c lodepng.c
nvcc -c super-resolution.cu
nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
 #import "cuda.h"
  ^
super-resolution.cu(106): warning: expression has no effect

super-resolution.cu(116): warning: expression has no effect

super-resolution.cu(141): warning: variable "y" was declared but never referenced

super-resolution.cu:1:2: warning: #import is a deprecated GCC extension [-Wdeprecated]
 #import "cuda.h"
  ^
super-resolution.cu(106): warning: expression has no effect

super-resolution.cu(116): warning: expression has no effect

super-resolution.cu(141): warning: variable "y" was declared but never referenced

ptxas /tmp/tmpxft_00000851_00000000-5_super-resolution.ptx, line 197; warning : Double is not supported. Demoting to float
nvcc -o super-resolution-cuda super-resolution.o lodepng.o

nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
super-resolution.o: In function `main':
tmpxft_00000851_00000000-3_super-resolution.cudafe1.cpp:(.text+0x5d): undefined reference to `lodepng_decode32_file(unsigned char**, unsigned int*, unsigned int*, char const*)'

The linker still can't resolve the reference from the object file. Edit: here is the start of our .cu file.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cstdio>

extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

Upvotes: 0

Views: 5571

Answers (2)

user2467731

Reputation: 191

Here is a simple snippet of the code.

The lodepng library is available here (http://lodev.org/lodepng/).

Renaming lodepng.cpp to lodepng.c makes it usable from C.

Even at this level, linking fails with

"undefined reference to `lodepng_decode32_file'"
"undefined reference to `lodepng_encode32_file'"

File: Makefile

all:    gpu
    gcc -g -O -c lodepng.c
    nvcc -c super-resolution.cu
    nvcc -o super-resolution-cuda super-resolution.o lodepng.o
    rm -rf super-resolution.o
    rm -rf lodepng.o

File: super-resolution.cu

#import "cuda.h"
#include "lodepng.h"

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cstdio>

extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );

//GPU 3x3 Blur.
__global__ void gpuBlur(unsigned char* image, unsigned char* buffer, int width, int height)
{
    int i = threadIdx.x%width;
    int j = threadIdx.x/width;
    if (i == 0 || j == 0 || i == width - 1 || j == height - 1)
        return;

    int k;
    for (k = 0; k < 4; k++) // 4 channels per pixel (RGBA)
    {
        buffer[4*width*j + 4*i + k] =           (image[4*width*(j-1) + 4*(i-1) + k] +
                                image[4*width*(j-1) + 4*i + k] +
                                image[4*width*(j-1) + 4*(i+1) + k] +
                                image[4*width*j + 4*(i-1) + k] +
                                image[4*width*j + 4*i + k] +
                                image[4*width*j + 4*(i+1) + k] +
                                image[4*width*(j+1) + 4*(i-1) + k] +
                                image[4*width*(j+1) + 4*i + k] +
                                image[4*width*(j+1) + 4*(i+1) + k])/9;
    }
}

int main(int argc, char *argv[])
{
    //Items for image processing;
    //int threshold = 100;
    unsigned int error;
    unsigned char* image;
    unsigned int width, height;

    //Load the image;
    if (argc > 1)
    {
        error = lodepng_decode32_file(&image, &width, &height, argv[1]);
        printf("Loaded file: %s[%d]\n", argv[1], error);
    }
    else
    {

        return 0;
    }

    unsigned char* buffer =(unsigned char*)malloc(sizeof(char) * 4*width*height);

    //GPU Blur Section.
    unsigned char* image_gpu;
    unsigned char* blur_gpu;
    cudaMalloc( (void**) &image_gpu, sizeof(char) * 4*width*height);
    cudaMalloc( (void**) &blur_gpu, sizeof(char) * 4*width*height);
    cudaMemcpy(image_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
    cudaMemcpy(blur_gpu,image, sizeof(char) * 4*width*height, cudaMemcpyHostToDevice);
    gpuBlur<<< 1, height*width >>> (image_gpu, blur_gpu, width, height);
    cudaMemcpy(buffer, blur_gpu, sizeof(char) * 4*width*height, cudaMemcpyDeviceToHost);
    //Spit out buffer as an image.
    error = lodepng_encode32_file("GPU_OUTPUT1_Blur.png", buffer, width, height);
    cudaFree(image_gpu);
    cudaFree(blur_gpu);

    free(buffer);
    free(image);

}

Upvotes: 1

Robert Crovella

Reputation: 152173

  1. Don't use #import. If you want to include cuda.h (which should be unnecessary), use #include. I would simply delete that line from your super-resolution.cu file.
  2. What you did not show before, but is now evident, is that your super-resolution.cu includes lodepng.h and then also specifies C linkage for two functions: lodepng_decode32_file and lodepng_encode32_file. When I tried compiling your super-resolution.cu, the compiler gave me errors like this (I don't know why you don't see them):

    super-resolution.cu(8): error: linkage specification is incompatible with previous "lodepng_encode32_file"
    lodepng.h(184): here
    
    super-resolution.cu(9): error: linkage specification is incompatible with previous "lodepng_decode32_file"
    lodepng.h(134): here
    

    So basically you are tripping over C versus C++ linkage: the declarations in lodepng.h and your extern "C" declarations disagree.

  3. I believe the simplest solution is to use lodepng.cpp (instead of lodepng.c) and delete the following lines from your super-resolution.cu:

    extern "C" unsigned lodepng_encode32_file(const char* ,const unsigned char* , unsigned , unsigned h);
    extern "C" unsigned lodepng_decode32_file(unsigned char** , unsigned* , unsigned* ,const char* );
    

    Then compile and link everything C++ style:

    $ g++ -c lodepng.cpp
    $ nvcc -c super-resolution.cu
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    $ nvcc -o super-resolution super-resolution.o lodepng.o
    nvcc warning : The 'compute_10' and 'sm_10' architectures are deprecated, and may be removed in a future release.
    $
    
  4. If you really want to link lodepng.o C style instead of C++ style, you will need to modify lodepng.h, adding appropriate extern "C" wrappers around the declarations of the functions you call. In my opinion this gets messy.

  5. If you want to get rid of the warnings about sm_10, add the nvcc switch to compile for a different architecture, e.g.:

    nvcc -arch=sm_20 ...
    

    but make sure whatever you choose is compatible with your GPU.
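
The extern "C" wrapper mentioned in point 4 could look roughly like the sketch below. This is only an illustration, not lodepng's actual header: demo_image.h and demo_image_size are hypothetical names standing in for lodepng.h and a function such as lodepng_decode32_file. The definition sits in the same file purely to keep the example self-contained; in a real project it would live in a .c file compiled with gcc. The #ifdef __cplusplus guard gives the declaration C linkage when the header is included from C++ (nvcc compiles the host code of a .cu file as C++), while a C compiler sees an ordinary prototype.

```cpp
#include <stddef.h>

// ---- contents of a C-compatible header (hypothetical "demo_image.h") ----
#ifdef __cplusplus
extern "C" {            // C++ callers refer to the unmangled C symbol name
#endif

// Hypothetical stand-in for a C library function.
// Returns 0 on success and stores the RGBA buffer size in bytes in *out_bytes.
unsigned demo_image_size(unsigned width, unsigned height, size_t* out_bytes);

#ifdef __cplusplus
}                       // end of extern "C"
#endif

// ---- definition (would normally be in a .c file compiled with gcc) ----
// In a C++ translation unit it keeps the C linkage of the declaration above.
unsigned demo_image_size(unsigned width, unsigned height, size_t* out_bytes)
{
    if (out_bytes == NULL)
        return 1;                               // error: no place to store the result
    *out_bytes = (size_t)4 * width * height;    // 4 bytes per pixel (RGBA)
    return 0;
}
```

With a header guarded this way, the .cu file only needs the #include; separate hand-written extern "C" redeclarations are unnecessary, so they cannot conflict with the header.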

Upvotes: 2
