Electrical and Computer Engineering Department
# Predoctoral Fellows, NLM Training Grant
IEEE Computational Intelligence Society MU Chapter And National Library of Medicine Medical Informatics Training Grant Special Seminar Series
Organization of Lectures
Part I
Introduction to GPUs and shader languages
Part II
Image processing (Morphology, Sobel, and Gaussian)
Part III
Performance, multi-pass rendering, optimizations, and debugging
Part IV
Using GPUs for non-image based processing (SOFM & CA)
Ping Pong
The name says it all
Have two textures of the same dimensionality
One fragment program
Use texture A as input, B as output
Then swap: B as input, A as output
[Figure: ping-pong textures; surviving label: Texture B]
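The swap can be illustrated on the CPU with two plain arrays standing in for textures A and B. This is only a sketch: `run_pass` is a hypothetical stand-in for the fragment program (here it averages each element's two neighbors), and `ping_pong` is an illustrative name, not part of any GPU API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fragment program: every output element
 * is the average of its two neighbors in the input (wrapping at the ends).
 * It reads only from src and writes only to dst, like a render pass. */
static void run_pass(const float *src, float *dst, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        float left  = src[(i + n - 1) % n];
        float right = src[(i + 1) % n];
        dst[i] = 0.5f * (left + right);
    }
}

/* Runs `passes` passes, swapping the roles of the two buffers after each
 * one. Returns the buffer that holds the most recent output. */
static const float *ping_pong(float *a, float *b, size_t n, int passes)
{
    float *in = a, *out = b;
    for (int p = 0; p < passes; ++p) {
        run_pass(in, out, n);
        float *tmp = in;   /* swap: last output becomes next input */
        in = out;
        out = tmp;
    }
    return in;
}
```

Because each pass reads one buffer and writes the other, no pass ever sees partially updated values, which is the reason for keeping two textures rather than updating one in place.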
Reduction
Use the Ping Pong idea
Reduce the size of the texture with each pass
Distributes operations over more processors
[Figure: a 4x4 texture with texels A through P is reduced to a 2x2 texture holding A+B+C+D, E+F+G+H, I+J+K+L, and M+N+O+P; the partial sums M+N and O+P are also labeled]
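A CPU sketch of a sum reduction in this style follows. It assumes a square, power-of-two texture stored row-major and a 2x2 block per output texel; the slides do not state the block shape, and the names `reduce_pass` and `reduce_sum` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* One reduction pass: each output texel is the sum of a 2x2 block of
 * input texels. `in` is w x w (row-major), `out` is (w/2) x (w/2). */
static void reduce_pass(const float *in, float *out, size_t w)
{
    size_t half = w / 2;
    for (size_t y = 0; y < half; ++y) {
        for (size_t x = 0; x < half; ++x) {
            size_t top = (2 * y) * w + 2 * x;      /* top-left of block */
            size_t bot = (2 * y + 1) * w + 2 * x;  /* bottom-left of block */
            out[y * half + x] = in[top] + in[top + 1] + in[bot] + in[bot + 1];
        }
    }
}

/* Sums all w*w values by repeated passes, ping-ponging between buffers
 * a (the input) and b (scratch, at least (w/2)*(w/2) floats).
 * Both buffers are overwritten. w must be a power of two. */
static float reduce_sum(float *a, float *b, size_t w)
{
    assert(w > 0 && (w & (w - 1)) == 0);
    float *in = a, *out = b;
    while (w > 1) {
        reduce_pass(in, out, w);
        float *tmp = in;  /* swap roles, as in ping pong */
        in = out;
        out = tmp;
        w /= 2;
    }
    return in[0];
}
```

With 2x2 blocks a w x w texture needs log2(w) passes, and within each pass every output texel can be computed independently.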
SOFM
Clustering technique
Used in many settings for generating a codebook
Have a 1D/2D/3D space of nodes (usually 2D)
Each map node has dimensionality d
Input data is of size n x d
[Figure: example values for the SOFM nodes and the input data; labels: SOFM Nodes, Input Data, n, d]
SOFM Cont.
Go through each input vector
  Find the node with the minimum distance to the current input vector
  Move the nearest node closer to the input vector
  Also move the nodes around the winning node closer to the input vector, by a smaller amount
[Figure: SOFM Nodes]
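The update for a single input vector can be sketched on the CPU as below. The slides do not specify the neighborhood function or learning rate, so two things here are assumptions: the neighborhood weight h = 1 / (1 + squared grid distance to the winner), and the learning rate parameter `alpha`. The function name `sofm_step` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One SOFM training step for input vector x (length d).
 * nodes holds rows*cols node vectors, node i at nodes[i*d .. i*d+d-1].
 * Returns the index of the winning (nearest) node. */
static size_t sofm_step(float *nodes, size_t rows, size_t cols, size_t d,
                        const float *x, float alpha)
{
    assert(rows > 0 && cols > 0 && d > 0);

    /* Find the node with minimum squared Euclidean distance to x. */
    size_t win = 0;
    float best = 0.0f;
    for (size_t i = 0; i < rows * cols; ++i) {
        float s = 0.0f;
        for (size_t k = 0; k < d; ++k) {
            float diff = nodes[i * d + k] - x[k];
            s += diff * diff;
        }
        if (i == 0 || s < best) {
            best = s;
            win = i;
        }
    }

    /* Move every node toward x. The winner moves the most (h = 1);
     * nodes farther away on the grid move by a smaller amount.
     * ASSUMPTION: h = 1 / (1 + grid distance squared). */
    float wr = (float)(win / cols), wc = (float)(win % cols);
    for (size_t i = 0; i < rows * cols; ++i) {
        float dr = (float)(i / cols) - wr;
        float dc = (float)(i % cols) - wc;
        float h = 1.0f / (1.0f + dr * dr + dc * dc);
        for (size_t k = 0; k < d; ++k)
            nodes[i * d + k] += alpha * h * (x[k] - nodes[i * d + k]);
    }
    return win;
}
```

In practice the learning rate and the neighborhood width are usually decreased over time; that schedule is left out of this sketch.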
SOFM Results
Have a 2D representation of clusters
Neighboring nodes are similar
GPU SOFM
Input Data
Written numbers 0-9
5 samples of each
Grayscale image shrunk to 16x16 for each number: d = 256
Texture size for data: 50x256
[Figure: input data texture, 50 x 256]
GPU SOFM
SOFM Nodes
8x8 space
Dimensionality of each node: 256
Texture size for data: 8x2048 (8 x 8 nodes, 256 values each)
GPU SOFM
Dist to current input vector
Same size as SOFM node texture: 8x2048
[Figure: 8x2048 input texture and 8x2048 output texture]
GPU SOFM
Min dist to current vector texture
Dimensionality : 8x8
[Figure: input texture and 8x8 output texture]
[Figure: processing flow; labels: Input Vector, Distances, Summation, Min Distance]
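The three stages (per-component distances, summation, minimum) can be mimicked on the CPU to make the texture shapes concrete. The memory layout is an assumption, not something the slides state: component k of node (r, c) is taken to sit at row r*256 + k, column c of the 8-column texture. All function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define MAP ((size_t)8)    /* 8x8 map, from the slides */
#define DIM ((size_t)256)  /* 16x16 image, d = 256, from the slides */

/* Stage 1: per-component squared difference, same size as the node
 * texture (MAP columns by MAP*DIM rows, row-major).
 * ASSUMED layout: row r*DIM + k, column c holds component k of node (r,c). */
static void component_dist(const float *nodes, const float *x, float *dist)
{
    for (size_t row = 0; row < MAP * DIM; ++row) {
        size_t k = row % DIM;  /* which component this row holds */
        for (size_t c = 0; c < MAP; ++c) {
            float diff = nodes[row * MAP + c] - x[k];
            dist[row * MAP + c] = diff * diff;
        }
    }
}

/* Stage 2: sum the DIM component distances of each node into an
 * MAP x MAP texture of per-node distances. */
static void sum_dist(const float *dist, float *total)
{
    for (size_t r = 0; r < MAP; ++r) {
        for (size_t c = 0; c < MAP; ++c) {
            float s = 0.0f;
            for (size_t k = 0; k < DIM; ++k)
                s += dist[(r * DIM + k) * MAP + c];
            total[r * MAP + c] = s;
        }
    }
}

/* Stage 3: index (r*MAP + c) of the node with the smallest distance. */
static size_t min_index(const float *total)
{
    size_t best = 0;
    for (size_t i = 1; i < MAP * MAP; ++i)
        if (total[i] < total[best])
            best = i;
    return best;
}
```

On the GPU, stages 2 and 3 would be done with reduction passes rather than the serial loops shown here.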
And So On
Continue looping through these steps as many times as desired. Be sure to toggle which SOFM Node Data texture is being read from and which is being written to.
[Figure: cellular automaton evolution; axis label: Increasing Time. Source: http://mathworld.wolfram.com/CellularAutomaton.html]
CA on a GPU
Basic idea (for the 1D binary case)
Pack the data into an image (one channel, e.g. Red)
Use an FBO for multi-pass rendering (fast!)
Initialization
Set the values in the first row of pixels
Turn some on (1=black above) and some off (0=green above)
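As a CPU reference for the row-by-row idea, the sketch below fills row y of an image from row y-1 using a standard elementary (1D, binary, three-neighbor) rule number. The rule-number encoding and the choice to treat cells outside the image as 0 are assumptions for illustration; `ca_step` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Computes row y of `image` (width cells per row, values 0 or 1) from
 * row y-1. `rule` is an elementary CA rule number (0..255): bit p of the
 * rule gives the new state for the neighborhood pattern p, where
 * p = left*4 + center*2 + right.
 * ASSUMPTION: cells beyond the left and right edges count as 0. */
static void ca_step(unsigned char *image, size_t width, size_t y, unsigned rule)
{
    assert(y >= 1 && rule < 256u);
    const unsigned char *prev = image + (y - 1) * width;
    unsigned char *cur = image + y * width;
    for (size_t x = 0; x < width; ++x) {
        unsigned l = (x > 0) ? prev[x - 1] : 0u;
        unsigned c = prev[x];
        unsigned r = (x + 1 < width) ? prev[x + 1] : 0u;
        unsigned pattern = (l << 2) | (c << 1) | r;
        cur[x] = (unsigned char)((rule >> pattern) & 1u);
    }
}
```

Calling `ca_step` for y = 1, 2, 3, ... plays the role of the row counter: each call corresponds to one render pass that draws a single new row.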
Row counter
int ca_counter;
Render Code
glPolygonMode(GL_FRONT,GL_FILL);
glBegin(GL_QUADS);
    glTexCoord2i(2,ca_counter-1);
    glVertex2f(2.0,ca_counter-1);
    glTexCoord2i(imageWidth-2,ca_counter-1);
    glVertex2f(imageWidth-2.0,ca_counter-1);
    glTexCoord2i(imageWidth-2,ca_counter);
    glVertex2f(imageWidth-2.0,ca_counter);
    glTexCoord2i(2,ca_counter);
    glVertex2f(2.0,ca_counter);
glEnd();
Fragment Program
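void FragmentProgram(
    out float4 color0 : COLOR0,
    float2 coords : TEXCOORD0,
    uniform samplerRECT tex )
{
    // I use this for the check below
    half2 tul, tuc, tur, tul2, tur2;

    // Note: newindex and offset are not declared on the slide; presumably
    // newindex is derived from coords and offset is one texel.
    // Sample the upper-left, upper-center, and upper-right neighbors.
    tul = texRECT( tex , newindex + float2(-offset,-offset) ).rg;
    tuc = texRECT( tex , newindex + float2(0.0,-offset) ).rg;
    tur = texRECT( tex , newindex + float2(offset,-offset) ).rg;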
Fragment Program
    if(tuc.r == 1.0){
        if( tul.g == 1.0 ){
            if( tuc.g == 1.0 ){
                if( tur.g == 1.0 ){
                    //1 1 1
                    color0 = float4(0.0,0.0,0.0,1.0);
                }else{
                    //1 1 0
                    color0 = float4(0.0,0.0,0.0,1.0);
                }
            }else{
                if( tur.g == 1.0 ){
                    //1 0 1
                    color0 = float4(0.0,1.0,0.0,1.0);
                }else{
                    //1 0 0
                    color0 = float4(0.0,1.0,0.0,1.0);
                }
            }
        }else{
            if( tuc.g == 1.0 ){
                if( tur.g == 1.0 ){
                    //0 1 1
                    color0 = float4(0.0,0.0,0.0,1.0);
                }else{
                    //0 1 0
                    color0 = float4(0.0,0.0,0.0,1.0);
                }
            }else{
                if( tur.g == 1.0 ){
                    //0 0 1
                    color0 = float4(0.0,1.0,0.0,1.0);
                }else{
                    //0 0 0
                    color0 = float4(0.0,1.0,0.0,1.0);
                }
            }
        }
    }
Fragment Program
    else{
        if( tul.r == 1.0 ){
            tul2 = texRECT( tex , newindex + float2(-2.0*offset,-offset) ).rg;
            if( (tul2.g == 0.0 && tul.g == 0.0 && tuc.g == 0.0) ||
                (tul2.g == 0.0 && tul.g == 0.0 && tuc.g == 1.0) ||
                (tul2.g == 0.0 && tul.g == 1.0 && tuc.g == 0.0) ||
                (tul2.g == 1.0 && tul.g == 1.0 && tuc.g == 0.0) ||
                (tul2.g == 1.0 && tul.g == 1.0 && tuc.g == 1.0) ){
                color0 = float4(1.0,tuc.g,0.0,1.0);
            }else{
                color0 = float4( tuc , 0.0 , 1.0 );
            }
        }else if( tur.r == 1.0 ){
            tur2 = texRECT( tex , newindex + float2(2.0*offset,-offset) ).rg;
            if( (tuc.g == 0.0 && tur.g == 1.0 && tur2.g == 1.0) ||
                (tuc.g == 1.0 && tur.g == 0.0 && tur2.g == 0.0) ||
                (tuc.g == 1.0 && tur.g == 0.0 && tur2.g == 1.0) ){
                color0 = float4(1.0,tuc.g,0.0,1.0);
            }else{
                color0 = float4( tuc , 0.0 , 1.0 );
            }
        }else{
            color0 = float4( tuc , 0.0 , 1.0 );
        }
    }
}
Life on a GPU
Simple GPU program!
Use FBOs
Render each pixel
Sample the neighborhood
Like CA, make a decision based on the rules of the game
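A CPU sketch of one generation follows, using the standard Conway rules (a dead cell with exactly 3 live neighbors is born; a live cell with 2 or 3 live neighbors survives). Treating cells outside the grid as dead is an assumption, and `life_step` is an illustrative name. The separate input and output buffers correspond to the two ping-pong textures.

```c
#include <assert.h>

/* Computes one Game of Life generation. `in` and `out` are w x h grids
 * (row-major, values 0 or 1) and must not overlap.
 * ASSUMPTION: cells outside the grid are treated as dead. */
static void life_step(const unsigned char *in, unsigned char *out, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            /* Sample the 8-cell neighborhood, as the fragment program would. */
            int n = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if ((dx || dy) && nx >= 0 && nx < w && ny >= 0 && ny < h)
                        n += in[ny * w + nx];
                }
            }
            int alive = in[y * w + x];
            /* Birth on exactly 3 neighbors; survival on 2 or 3. */
            out[y * w + x] = (unsigned char)(n == 3 || (alive && n == 2));
        }
    }
}
```

Each output cell depends only on the input grid, so every cell can be computed independently, which is what makes the per-pixel GPU version straightforward.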