Mar 5, 2007

Cocoa Application with custom Core Image Filter 2: Implementing the convolution kernel

In this post I'm going to implement our convolution filter as a Core Image kernel. Writing a Core Image kernel is relatively straightforward, as I think you'll see. Core Image kernels are written in a subset of the OpenGL Shading Language, which is basically just C with some added data types, keywords, and function calls. Apple's Core Image Kernel Language reference describes the subset that you can use and also lists the parts of the OpenGL Shading Language that are not implemented. Of note are the lack of arrays and structures and the severe restrictions on looping and conditionals. The most useful additions are the vector types like vec3 and vec4 (very convenient for holding the R, G, B, and alpha components of a pixel) and the sampler type, which allows you to sample an image.
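
If you haven't seen the shading language before, here's a tiny illustrative snippet of those types (not part of the filter, just a sketch of the syntax):

vec4 pixel = vec4(1.0, 0.5, 0.0, 1.0); // R, G, B, alpha packed into one value
vec3 color = pixel.rgb;                // "swizzle" out just the color components
float alpha = pixel.a;                 // or just the alpha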


A Core Image kernel has but a single pixel as its output, and is therefore applied once for each pixel in the output image. You can sample as many input pixels from as many input images as you want in order to generate that output pixel, but the filter has to be expressed as a mapping from a set of input pixels to a single output pixel. In our case, this is not a problem: a 3x3 convolution is a pretty natural fit for Core Image, since we only have to sample the 9 pixels immediately surrounding each output pixel. So the first step in our code is to declare the header of the kernel:


kernel vec4 Convolution3by3(
sampler src,
float r00, float r01, float r02,
float r10, float r11, float r12,
float r20, float r21, float r22)
{
vec2 loc;
vec4 result = vec4(0,0,0,1);
//0,0 in my mind is left and up.

I declare a kernel called Convolution3by3. The kernel takes a src argument that represents the source image and 9 floats, which represent the 9 coefficients of the convolution. You can see that the lack of support for arrays would make a 5x5 or 7x7 convolution quite cumbersome with this system. We also declare a loc variable to hold the offset of the pixel we are currently sampling, and a vec4 for the result.
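
For reference, here is how the nine coefficients line up with the 3x3 neighborhood around the output pixel, based on the offsets used in the code below (remember that in Core Image the y axis points up, so +1 in y is the row above):

r00 r01 r02    // offsets (-1,+1) (0,+1) (+1,+1) -- the row above
r10 r11 r12    // offsets (-1, 0) (0, 0) (+1, 0) -- the center row
r20 r21 r22    // offsets (-1,-1) (0,-1) (+1,-1) -- the row below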

To perform the convolution, we need to sample each pixel, multiply it by the correct coefficient, and add that value to the result. We've made the conscious decision to use the alpha (transparency) value of the center input pixel as the alpha value of the result pixel. Here's the code for the first operation:


loc = vec2(-1.0,1.0);
vec4 p00 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb = p00.rgb * r00;


What's going on here? First, we call samplerCoord() to get the location that the current output pixel represents. Adding loc to it lets us grab the correct pixel in the matrix for this coefficient. Next, we call sample() to get the actual value of the pixel at that location. Core Image gives us pixel information with premultiplied alpha, which means that any transparency has already been multiplied through the RGB values of the pixel. This is a useful optimization, since it makes compositing simpler. But since we are going to be using the alpha of the center pixel as the alpha of the result, we need to reverse this to correctly calculate the convolution, thus the call to unpremultiply(). Finally, we multiply the RGB values of the pixel by the coefficient for that pixel and store that in the result (the later samples accumulate with +=). This process is repeated for each sampled location.
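
If premultiplied alpha is new to you, here's a quick numeric example (mine, not from Apple's documentation): a pure red pixel at 50% opacity arrives from Core Image as (0.5, 0.0, 0.0, 0.5), because the red value has already been multiplied by the alpha. unpremultiply() divides the RGB components back out by the alpha, giving (1.0, 0.0, 0.0, 0.5), which is what we want to feed into the convolution.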



loc = vec2(0.0,1.0);
vec4 p01 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p01.rgb * r01;

loc = vec2(1.0,1.0);
vec4 p02 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p02.rgb * r02;

loc = vec2(-1.0,0.0);
vec4 p10 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p10.rgb * r10;

vec4 p11 = unpremultiply(
sample(src, (samplerCoord(src)) ));
result.rgb += p11.rgb * r11;
result.a = p11.a;


Notice here that I copy the alpha from the input pixel to the result.



loc = vec2(1.0,0.0);
vec4 p12 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p12.rgb * r12;

loc = vec2(-1.0,-1.0);
vec4 p20 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p20.rgb * r20;

loc = vec2(0.0,-1.0);
vec4 p21 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p21.rgb * r21;

loc = vec2(1.0,-1.0);
vec4 p22 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p22.rgb * r22;

result = premultiply( result );
return result;
}

Finally, I premultiply() the result with its alpha value and return it. As you can see, this is a pretty straightforward procedure: grab the value of each input pixel in the matrix, multiply it by its respective coefficient, accumulate the results, and return.
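
A quick way to sanity-check the kernel once it's running is to feed it some standard coefficient sets (these are common examples, not part of the filter itself):

identity (returns the source image unchanged):
0 0 0
0 1 0
0 0 0

box blur (each output pixel is the average of its 3x3 neighborhood):
1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

sharpen:
 0 -1  0
-1  5 -1
 0 -1  0

In general, coefficient sets that sum to 1.0 preserve the overall brightness of the image.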

If you want to download a copy of this kernel, it's available on my website: Convolution3by3.cikernel. In my next post I'll describe how to test the kernel using the Quartz Composer application, and also show how to apply this filter to live video as well as static images.

4 comments:

Anonymous said...

Hello Paul,

I am enjoying your post on the image convolution filter. Thanks a lot for writing it.

I tried to download the code and found out that the link is dead. Can you please update that link?

link name:"Convolution3by3.cikernel"

thanks


fawzi_masri@yahoo.com

Paul Franceus said...

Hi-

Sorry but my server is down for now. If you cut and paste the code from the post I guarantee that it's exactly the same thing.

Paul

peddamat said...
This comment has been removed by a blog administrator.
Mobin said...

Hi...
Can you please tell me how audio convolution is achieved in iPhone programming? I searched a lot but didn't find anything for my use...