Reputation: 680
We're trying to use OpenCL for some image processing on the i.MX6.
We used already-tested OpenCL code. In the kernel.cl file, the only OpenCL-specific calls are
int i= get_global_id(0);
int j= get_global_id(1);
Everything else is plain C rather than OpenCL-specific code.
The code runs well on the PC.
However, when we run the code on the i.MX6, every call returns a success status, but the result is not correct.
Writing the buffer and reading it back with clEnqueueReadBuffer work without problems; we verified the uploaded image. But the kernel launched with clEnqueueNDRangeKernel does not produce any result.
Does anyone know why? By the way, this is the 2000th OpenCL question :)
Here is the whole code:
__kernel void IPM(__global const unsigned char* image_ROI_data, __global unsigned char* IPM_data, __global float* parameter_IPM)
{
    float camera_col=parameter_IPM[1];
    float camera_row=parameter_IPM[0];
    float camera_height=parameter_IPM[2];
    float camera_alpha=parameter_IPM[3];
    float camera_theta=parameter_IPM[4];
    float image_vp=parameter_IPM[5];
    float IPM_width=parameter_IPM[6];
    float IPM_height=parameter_IPM[7];
    // Row strides, padded up to a multiple of 4 bytes.
    int IPM_lineByte=(((int)IPM_width+3)/4)*4;
    int image_lineByte=(((int)camera_col+3)/4)*4;
    int i= get_global_id(0);
    int j= get_global_id(1);
    *(IPM_data+((int)IPM_height-j)*IPM_lineByte+i)=0;
    float multiple=(float)(IPM_width/20);
    // Real-world x and y (both in meters).
    float x=(float)(i-IPM_width/2)/multiple;
    float y=(float)(j)/multiple;
    // The coordinates in the captured image.
    float u=(camera_row-1)*(atan(camera_height/sqrt(x*x+y*y))+camera_alpha-camera_theta)/(2*camera_alpha);
    float v=(camera_col-1)*(atan(x/y)+camera_alpha)/(2*camera_alpha);
    // If the point lies inside the captured image, copy its pixel into the output image.
    // The input is only a ROI, so the row is u-image_vp.
    if (((int)u-(int)image_vp)>0 && (int)u<(int)camera_row && v>0 && v<camera_col)
    {
        *(IPM_data+((int)IPM_height-j)*IPM_lineByte+i)=
            *(image_ROI_data+((int)u-(int)image_vp)* image_lineByte+(int)v);
    }
}
Upvotes: 1
Views: 715
Reputation: 11920
int i= get_global_id(0); // starts from zero
int j= get_global_id(1); // this too
float x=(float)(i-IPM_width/2)/multiple; // maybe zero, maybe not
float y=(float)(j)/multiple; // becomes zero when j is 0
float v=(camera_col-1)*(atan(x/y)+camera_alpha)/(2*camera_alpha);
                              ^
                              |
                              |
                       division by zero
This becomes NaN or INF, and everything computed from it follows.
The wrong result then originates from this,
especially when the value is used for pointer arithmetic:
*(IPM_data+((int)IPM_height-j)*IPM_lineByte+i)=
*(image_ROI_data+((int)u-(int)image_vp)* image_lineByte+(int)v);
                                                             ^
                                                             |
                                                             |
                                    bad index if the "if" body is entered
Upvotes: 4
Reputation: 1814
Your device supports only the embedded OpenCL profile, which is a subset of the full profile supported by your PC. Generally, you need to refactor your code to make it compatible with the embedded profile.
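To confirm which profile a device reports, the host program can query CL_DEVICE_PROFILE with clGetDeviceInfo, which returns either "FULL_PROFILE" or "EMBEDDED_PROFILE". The following is a minimal sketch under assumptions: HAVE_OPENCL is a hypothetical build flag used here so the snippet also compiles without the OpenCL headers, and query_profile and is_embedded_profile are illustrative names, not part of any library.

```c
#include <string.h>

#ifdef HAVE_OPENCL /* hypothetical flag: define it when OpenCL headers are available */
#include <CL/cl.h>

/* Copies the device's profile string into `profile`; returns 0 on success. */
static int query_profile(cl_device_id device, char *profile, size_t size)
{
    cl_int err = clGetDeviceInfo(device, CL_DEVICE_PROFILE, size, profile, NULL);
    return err == CL_SUCCESS ? 0 : -1;
}
#endif

/* Returns 1 if the profile string names the embedded profile, 0 otherwise. */
static int is_embedded_profile(const char *profile)
{
    return strcmp(profile, "EMBEDDED_PROFILE") == 0;
}
```

If is_embedded_profile returns 1 for the target device, the kernel should be checked against the embedded profile's restrictions in the OpenCL specification for the version the device implements.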
Upvotes: 3