Reputation: 6753
In the below code, I first bind the texture called ref to an array called gpu in the global memory. Then I call a function called getVal in which i first set the value of gpu[1] to 5 and then read it using the binded texture using tex1Dfetch(ref,1
). However, in this case, tex1Dfetch() does not display the changed value of gpu[5], instead it displays the old value.
Then, I call another function called getagain which just reads tex1Dfetch(ref,1) again. However, this time i get the new value . I really do not understand that why in the first function I do not get the changed value.
#include<cuda_runtime.h>
#include<cuda.h>
#include<stdio.h>
texture<int> ref;
__global__ void getVal(int *c, int *gpu){
gpu[1] = 5;
*c = tex1Dfetch(ref, 1); // returns old value, not 5
}
__global__ void getagain(int *c){
*c = tex1Dfetch(ref, 1); // returns new value !!!????
}
void main(){
int *gpu,*c;
int i,b[10];
for( i =0 ; i < 10; i++){
b[i] = i*3;
}
cudaMalloc((void**)&gpu, sizeof(int) * 10);
cudaBindTexture(NULL, ref, gpu,10*sizeof(int));
cudaMemcpy(gpu, b, 10 * sizeof(int), cudaMemcpyHostToDevice);
cudaMalloc((void**)&c, sizeof(int));
//try changing value and reading using tex1dfetch
getVal<<<1,1>>>(c,gpu);
cudaMemcpy(&i, c,sizeof(int), cudaMemcpyDeviceToHost);
printf("the value returned by tex fetch is %d\n" , i);
cudaMemcpy(b, gpu,10*sizeof(int), cudaMemcpyDeviceToHost);
for( i =0 ; i < 10; i++){
printf("%d\n",b[i]);
}
getagain<<<1,1>>>(c);
cudaMemcpy(&i, c,sizeof(int), cudaMemcpyDeviceToHost);
printf("the value returned by tex fetch is %d\n" , i);
getchar();
}
Upvotes: 0
Views: 1219
Reputation: 26105
Within the same kernel call, the texture cache does not maintain coherency with global memory. See section 3.2.10.4 of the CUDA 4.0 C Programming Guide. Coherency of the texture cache between consecutive kernel calls is achieved by the driver flushing the texture cache prior to launching a kernel.
Upvotes: 3