CUDA and its questions
There were some questions raised during Rupesh’s GPU class today.
- What is the sequence of actions for ‘D=10’ in
CAUTION: Not sure whether device_reference is the focus!? L290 device_reference.inl L2+42 reference.h L82 reference.inl at operator= L65
- Why is the
That is how it is designed to be in CUDA/GPU.
- Why is
gridDim.y or z is not but
- Does the GTX 680 have a limit of 2048 threads per thread block?
- Valid limits of kernel launches
There are multiple limits. All must be satisfied.
- The maximum number of threads in the block is limited to 1024. This limit applies to the product of your threadblock dimensions (x*y*z).
- The maximum x-dimension is 1024. (1024,1,1) is legal. (1025,1,1) is not legal.
- The maximum y-dimension is 1024. (1,1024,1) is legal. (1,1025,1) is not legal.
- The maximum z-dimension is 64. (1,1,64) is legal. (2,2,64) is also legal. (1,1,65) is not legal. Also, threadblock dimensions of 0 in any position are not legal.
Your choice of threadblock dimensions (x,y,z) must satisfy each of the four rules above.
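The rules above can be sketched as a small host-side check. This is only an illustration: the helper name isValidBlockDim and the constant names are made up for this note and are not part of the CUDA API, and the limits are hard-coded from the list above rather than queried from a device. On a real GPU the same limits should be available from cudaGetDeviceProperties (fields maxThreadsPerBlock and maxThreadsDim).

```cpp
// Host-side sketch: checks threadblock dimensions against the limits
// listed above. The constants are copied from the text (assumption:
// they match your device), not queried from the GPU.
constexpr unsigned kMaxThreadsPerBlock = 1024;  // limit on x * y * z
constexpr unsigned kMaxBlockDimX = 1024;
constexpr unsigned kMaxBlockDimY = 1024;
constexpr unsigned kMaxBlockDimZ = 64;

// Hypothetical helper, not a CUDA API function.
// Returns true only if every rule is satisfied at the same time.
constexpr bool isValidBlockDim(unsigned x, unsigned y, unsigned z) {
    return x >= 1 && y >= 1 && z >= 1            // no dimension may be 0
        && x <= kMaxBlockDimX                    // per-dimension limits
        && y <= kMaxBlockDimY
        && z <= kMaxBlockDimZ
        && static_cast<unsigned long long>(x) * y * z
               <= kMaxThreadsPerBlock;           // total threads in the block
}

// The examples from the list above, checked at compile time.
static_assert(isValidBlockDim(1024, 1, 1), "(1024,1,1) is legal");
static_assert(!isValidBlockDim(1025, 1, 1), "(1025,1,1) is not legal");
static_assert(isValidBlockDim(1, 1024, 1), "(1,1024,1) is legal");
static_assert(!isValidBlockDim(1, 1025, 1), "(1,1025,1) is not legal");
static_assert(isValidBlockDim(1, 1, 64), "(1,1,64) is legal");
static_assert(isValidBlockDim(2, 2, 64), "(2,2,64) is legal: 256 threads");
static_assert(!isValidBlockDim(1, 1, 65), "(1,1,65) is not legal");
static_assert(!isValidBlockDim(0, 1, 1), "a dimension of 0 is not legal");
```

Note that a shape such as (32,32,2) passes every per-dimension limit but still fails, because 32*32*2 = 2048 threads exceeds the 1024-thread total.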
dim3 is actually a struct of