玫瑰幻想 posted on 2013-11-9 00:32
What you are doing is correct.
However, you need to pay attention to two points:
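Thank you, moderator 玫瑰! I have one more question: I want to multiply two arrays of this kind row by row. I used the method above, and my code is below, but the result is not correct:
- Complex * temporary_signal;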
- cudaMalloc((void **)&temporary_signal, 256*4096*sizeof(Complex));
- memset(temporary_signal,0,256*4096*sizeof(Complex));
- Complex * temporary_filter;
- cudaMalloc((void **)&temporary_filter, 256*4096*sizeof(Complex));
- memset(temporary_filter,0,256*4096*sizeof(Complex));
- dim3 dimBlock(256,1);
- dim3 dimGrid((4096+dimBlock.x-1)/dimBlock.x,(32+dimBlock.y-1)/dimBlock.y);
- ComplexPointwiseMulAndScale<<<dimGrid,dimBlock>>>(d_signal,d_filter,temporary_signal,temporary_filter,4096,1.0f/4096);
-
- // Kernel for pointwise multiplication in the frequency domain
- static __global__ void ComplexPointwiseMulAndScale(Complex* a,Complex* b,Complex* c,Complex* d, int size, float scale)
- {
- const int tid = blockIdx.x * blockDim.x + threadIdx.x;
- const int bid = blockIdx.y;
- const int numThreads=blockDim.x*gridDim.x;
- c[tid]=a[bid*4096+tid];
- d[tid]=b[bid*4096+tid];
-
- for (int g = tid; g < 4096; g +=numThreads)
- {
- c[g] = ComplexScale(ComplexMul(c[g], d[g]), scale);
- }
- }
Moderator 玫瑰, is the problem in the final result assignment, or is it somewhere else?
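For reference, here is a minimal sketch of how the row-by-row product could be written. It is not necessarily the fix the moderator has in mind. Two things in the listing above stand out: `memset` is a host function and cannot clear memory returned by `cudaMalloc` (`cudaMemset` is the device counterpart), and `c[tid]` / `d[tid]` carry no row offset, so every `blockIdx.y` row writes into the same 4096 elements. The sketch assumes `Complex` is `float2` and that `ComplexMul` / `ComplexScale` behave as in the simpleCUFFT sample. The names `RowwiseMulAndScale` and `multiplyRows` are hypothetical, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

typedef float2 Complex;  // assumption: same layout as Complex in the simpleCUFFT sample

// (a.x + i*a.y) * (b.x + i*b.y)
static __host__ __device__ inline Complex ComplexMul(Complex a, Complex b)
{
    Complex c;
    c.x = a.x * b.x - a.y * b.y;
    c.y = a.x * b.y + a.y * b.x;
    return c;
}

// Multiply both components by a real scale factor.
static __host__ __device__ inline Complex ComplexScale(Complex a, float s)
{
    Complex c;
    c.x = s * a.x;
    c.y = s * a.y;
    return c;
}

// blockIdx.y selects the row; threads stride over that row's columns.
// Every access uses the same row offset, so rows never overwrite each other.
static __global__ void RowwiseMulAndScale(const Complex* a, const Complex* b,
                                          Complex* out, int rows, int cols,
                                          float scale)
{
    const int row = blockIdx.y;
    if (row >= rows) return;
    const int stride = blockDim.x * gridDim.x;
    for (int col = blockIdx.x * blockDim.x + threadIdx.x; col < cols; col += stride)
    {
        const int idx = row * cols + col;
        out[idx] = ComplexScale(ComplexMul(a[idx], b[idx]), scale);
    }
}

// Hypothetical host helper: copies inputs to the device, runs the kernel,
// and returns the rows*cols result on the host.
std::vector<Complex> multiplyRows(const std::vector<Complex>& h_a,
                                  const std::vector<Complex>& h_b,
                                  int rows, int cols, float scale)
{
    const size_t bytes = (size_t)rows * cols * sizeof(Complex);
    Complex *d_a = NULL, *d_b = NULL, *d_out = NULL;
    cudaMalloc((void**)&d_a, bytes);
    cudaMalloc((void**)&d_b, bytes);
    cudaMalloc((void**)&d_out, bytes);
    cudaMemcpy(d_a, h_a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, bytes);  // device memory: cudaMemset, not memset

    dim3 dimBlock(256, 1);
    dim3 dimGrid((cols + dimBlock.x - 1) / dimBlock.x, rows);  // one grid row per matrix row
    RowwiseMulAndScale<<<dimGrid, dimBlock>>>(d_a, d_b, d_out, rows, cols, scale);

    std::vector<Complex> h_out((size_t)rows * cols);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);  // also waits for the kernel
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return h_out;
}
```

With the sizes from the post, a call such as `multiplyRows(signal, filter, 256, 4096, 1.0f/4096)` would write each row's product to its own `row * 4096` offset in the output, so no temporary copies of the inputs are needed.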