Mathematica Asked by PalvinWang on January 13, 2021
I'm writing CUDA code in Mathematica (MMA), because I found from this webpage that CUDA in MMA can reach the maximum precision.
When I run the code below with the float type, it yields the correct result. (Note that width = 1024 and k = 23.)
Needs["CUDALink`"];
KernelCode = "
__global__ void kernel(float* kXList, int width, int k) {
int row = blockIdx.y * blockDim.y + threadIdx.y;
int col = blockIdx.x * blockDim.x + threadIdx.x;
float delta = 2.0f / width;
if ((row < k) && (col < width)) {
kXList[row * width + col] = -1.0f + (col * k + row + 1.0f) * delta / k;
}
}
";
kernel = CUDAFunctionLoad[KernelCode,
"kernel", { {"Float"}, _Integer, _Integer}, {32, 32}];
kX = CUDAMemoryAllocate["Float", k*W];
kernel[buffer, kX, kXtemp, kX1, kX2, W, k];
test = CUDAMemoryGet[kX][[;; k*W]];
CUDAMemoryUnload[kX];
MatrixForm[ArrayReshape[test, {k, W}]]
First column of the result:
{
{-0.999915},
{-0.99983},
{-0.999745},
{-0.99966},
{-0.999575},
{-0.99949},
{-0.999406},
{-0.999321},
{-0.999236},
{-0.999151},
{-0.999066},
{-0.998981},
{-0.998896},
{-0.998811},
{-0.998726},
{-0.998641},
{-0.998556},
{-0.998471},
{-0.998387},
{-0.998302},
{-0.998217},
{-0.998132},
{-0.998047}
}
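As a quick cross-check, the same formula evaluated on the CPU side in the Wolfram Language reproduces this first column (col = 0), for example -0.999915 for row 0 and -0.998047 for row 22:
(* CPU-side check of the kernel's formula for the first column, col = 0 *)
With[{width = 1024, k = 23},
 Table[-1. + (0*k + row + 1.)*(2./width)/k, {row, 0, k - 1}]]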
But when I run the code below with the Real_t type, as the tutorial suggests, the result is wrong.
Needs["CUDALink`"];
KernelCode = "
__global__ void kernel(Real_t* kXList, int width,int k) {
int row = blockIdx.y * blockDim.y + threadIdx.y;
int col = blockIdx.x * blockDim.x + threadIdx.x;
Real_t delta = 2.0 / width;
if ((row < k) && (col < width)) {
kXList[row * width + col] = -1.0 + (col * k + row + 1.0) * delta / k;
}
}
";
kernel = CUDAFunctionLoad[KernelCode,
"kernel", { {"Float"}, _Integer, _Integer}, {32, 32}];
kX = CUDAMemoryAllocate["Float", k*W];
kernel[buffer, kX, kXtemp, kX1, kX2, W, k];
test = CUDAMemoryGet[kX][[;; k*W]];
CUDAMemoryUnload[kX];
MatrixForm[ArrayReshape[test, {k, W}]]
First column of the result:
{
{-1.58819*10^-23},
{-1.58715*10^-23},
{4.17233*10^-8},
{-1.58715*10^-23},
{-1.07374*10^8},
{4.17349*10^-8},
{2.72008*10^23},
{-1.58845*10^-23},
{0.},
{0.},
{-1.58819*10^-23},
{4.17203*10^-8},
{4.17233*10^-8},
{2.71979*10^23},
{-1.07374*10^8},
{-1.58845*10^-23},
{2.72008*10^23},
{-1.07381*10^8},
{0.},
{0.},
{-1.58819*10^-23},
{-1.58812*10^-23},
{4.17233*10^-8}
}
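These numbers do not look like a small precision error; they look like raw bytes being misinterpreted. As a rough illustration (not meant to reproduce the exact buffer contents), writing a double-precision value and re-reading the same bytes as 32-bit floats in the Wolfram Language gives similarly meaningless values:
bytes = ExportString[{-0.999915}, "Real64"];  (* the 8 raw bytes of one 64-bit real *)
ImportString[bytes, "Real32"]                 (* the same bytes read back as two 32-bit reals *)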
The Real type on the MMA side should use {_Real, _, "Input"} or {_Real, _, "Output"} in CUDAFunctionLoad, not "Float".
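Presumably Real_t compiles to the Wolfram real type, which is double precision on devices that support it, so the "Float" argument specification and the float buffer no longer match what the kernel writes. Below is a minimal sketch of the corrected load and allocation, reusing the Real_t KernelCode from above; the "Output" qualifier and the "Double" buffer type are assumptions for a buffer the kernel only writes, on a double-precision-capable device:
Needs["CUDALink`"];
W = 1024; k = 23;
(* declare the buffer as _Real so it matches Real_t on the device side *)
kernel = CUDAFunctionLoad[KernelCode,
   "kernel", {{_Real, _, "Output"}, _Integer, _Integer}, {32, 32}];
kX = CUDAMemoryAllocate["Double", k*W];  (* double-precision device buffer *)
kernel[kX, W, k];
test = CUDAMemoryGet[kX][[;; k*W]];
CUDAMemoryUnload[kX];
MatrixForm[ArrayReshape[test, {k, W}]]
If the device supports only single precision, Real_t should fall back to float, and in that case the original float version is the consistent one.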
Correct answer by PalvinWang on January 13, 2021