
Mar 10, 2007

Cocoa Application with custom Core Image filter 3: Testing the kernel with QuartzComposer

In the last post I wrote a convolution kernel for Core Image. But this kind of thing needs to be tested. Now, we could package the filter inside an Image Unit (which I'll do in a future post), but that would mean recompiling and reinstalling the filter each time we wanted to make a change. It would be much better if there were an interactive environment we could use to test changes and to debug the filter code.

Luckily, Apple has provided just such an environment in Quartz Composer. If you are not familiar with it, Quartz Composer is an amazing application that provides a visual programming environment for creating all kinds of visualizations. If you've seen the RSS screensaver, then you've seen a Quartz Composer composition. If you've seen a preview of Time Machine, the swirling galaxy in the background is a Quartz Composer composition. I'm not going to explore this amazing piece of software in depth here; I suggest you look at some of these websites for more information: Quartz Composer Journal or www.quartzcompositions.com or boinx or Quartonian.

As a basic introduction, a composition is defined by dragging patches onto the workspace and connecting them together graphically. Patches expose ports, which represent the parameters passed between patches. The results are displayed in real time in the viewer window, so you get immediate feedback as you change things. As we'll see later, Quartz Composer compositions can be embedded in a QCView in your Cocoa apps and controlled via Cocoa Bindings.
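To make that concrete, here is a minimal sketch of what embedding a composition in a Cocoa app can look like. It assumes a controller class with an outlet named qcView wired to a QCView in the nib, the test.qtz composition copied into the application bundle, and a published input named r00; the class name and input key are my own placeholders, not something established elsewhere in this series.

#import <Cocoa/Cocoa.h>
#import <Quartz/Quartz.h>

@interface TestController : NSObject
{
    IBOutlet QCView *qcView;    // a QCView dragged into the nib
}
@end

@implementation TestController

- (void)awakeFromNib
{
    NSString *path = [[NSBundle mainBundle] pathForResource:@"test" ofType:@"qtz"];
    if ([qcView loadCompositionFromFile:path]) {
        // Published inputs can be set by key here, or bound with Cocoa Bindings.
        [qcView setValue:[NSNumber numberWithFloat:-1.0] forInputKey:@"r00"];
        [qcView startRendering];
    }
}

@end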

I've provided a link to my test.qtz composition, which consists of four patches. The convolution patch is built by dragging a Core Image Kernel patch onto the editor and then pasting the text of the kernel into the patch. You can use the inspector to change the input parameters, adjusting the coefficients to test out the composition.


This is the editor window, showing the composition.


Here's the inspector for the Convolution patch showing where you put the kernel code.


This is the viewer, showing the input image filtered by the edge detection filter defined in the Input Parameters to the Convolution filter.


If you have an iSight camera or other video source, replace the Image Importer with a video input, and you will see the convolution filter applied to your video stream in real time. That's just extremely cool.

Here's the editor window with a Video Input instead of the Image Importer.
And here's a view of my living room with edge detection. Notice the frame rate of nearly 60 frames per second on my MacBook Pro.

Using this basic composition, with some modifications, you should be able to test out any Core Image kernel you can come up with. Next time, I'll build this filter into an Image Unit that can be used from any application that uses Core Image filters.

Mar 5, 2007

Cocoa Application with custom Core Image Filter 2: Implementing the convolution kernel

In this post I'm going to implement our convolution filter as a Core Image kernel. Writing a Core Image kernel is relatively straightforward, as I think you'll see. Core Image kernels are written in a subset of the OpenGL Shading Language, which is basically just C with some added data types, keywords, and function calls. Apple's Core Image Kernel Language reference describes the subset you can use and also the parts of the OpenGL Shading Language that are not implemented. Of note are the lack of arrays and structures and the severe restrictions on looping and conditionals. The most useful additions are the vector types like vec3 and vec4 (very convenient for holding the R, G, B, and alpha values of a pixel) and the sampler type, which allows you to sample the image.
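One thing worth knowing before we start: the kernel source isn't compiled into your binary. Core Image compiles it at runtime from a string, typically loaded from a .cikernel file in your bundle. Here's a rough sketch of that step as it might look inside a CIFilter subclass; the ConvolutionFilter class and its ivars are my own placeholders for illustration, and the resource name simply matches the kernel file linked at the end of this post.

#import <QuartzCore/QuartzCore.h>

@interface ConvolutionFilter : CIFilter
{
    CIImage  *inputImage;
    NSNumber *inputR00, *inputR01, *inputR02;
    NSNumber *inputR10, *inputR11, *inputR12;
    NSNumber *inputR20, *inputR21, *inputR22;
}
@end

@implementation ConvolutionFilter

static CIKernel *convolutionKernel = nil;

// Compile the kernel source once, the first time the class is used.
+ (void)initialize
{
    if (convolutionKernel == nil) {
        NSString *path = [[NSBundle bundleForClass:self]
                             pathForResource:@"Convolution3by3" ofType:@"cikernel"];
        NSString *code = [NSString stringWithContentsOfFile:path
                                                   encoding:NSUTF8StringEncoding
                                                      error:NULL];
        // kernelsWithString: returns one CIKernel per kernel routine in the source.
        convolutionKernel = [[[CIKernel kernelsWithString:code] objectAtIndex:0] retain];
    }
}

@end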


A Core Image kernel has but a single pixel as its output, and is therefore applied once for each pixel of the output image. You can sample any input pixels, from as many input images as you want, to generate your output pixel; the filter just has to be expressed as a mapping from some set of input pixels to each single output pixel. In our case this is not a problem: a 3x3 convolution is a natural fit for Core Image, since we only have to sample the nine pixels immediately surrounding each output pixel. So the first step in our code is to declare the header of the kernel:


kernel vec4 Convolution3by3(
sampler src,
float r00, float r01, float r02,
float r10, float r11, float r12,
float r20, float r21, float r22)
{
vec2 loc;
vec4 result = vec4(0.0, 0.0, 0.0, 1.0);
// r00 is the coefficient for the pixel up and to the left of the current one.

I declare a kernel called Convolution3by3. The kernel takes a src argument that represents the source image and nine floats, which are the nine coefficients of the convolution. You can see that the lack of support for arrays would make a 5x5 or 7x7 convolution quite cumbersome with this system. We also declare a loc variable to hold the offset we are currently sampling and a vec4 to accumulate the result.
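Written out, with row 0 of the coefficient matrix at the top (matching the comment in the code above), what the kernel computes for each output location (x, y) is:

\[
\mathrm{out}(x,y)_{rgb} \;=\; \sum_{m=0}^{2} \sum_{n=0}^{2} r_{mn}\,\mathrm{src}\bigl(x + (n-1),\; y + (1-m)\bigr)_{rgb},
\qquad
\mathrm{out}(x,y)_{a} \;=\; \mathrm{src}(x,y)_{a}
\]

where src is sampled with its alpha unpremultiplied, as explained below.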

To perform the convolution, we need to sample each pixel, multiply it by the correct coefficient, and add that value to the result. We've made the conscious decision to give the result pixel the same alpha (transparency) value as the center input pixel. Here's the code for the first operation:


loc = vec2(-1.0,1.0);
vec4 p00 = unpremultiply(
sample(src,(samplerCoord(src) + loc) ));
result.rgb = p00.rgb * r00;


What's going on here? First, we call samplerCoord() to get the location that the current output pixel represents. Adding loc to it allows us to grab the correct pixel in the matrix for this coefficient. Next, we call sample() to get the actual value of the pixel at that location. Core Image gives us pixel information with premultiplied alpha, which means that any transparency has already been multiplied through the RGB values of the pixel. This is a useful optimization, since it makes compositing simpler. Since we are going to be using the alpha of the center pixel as the alpha of the result, we need to reverse this to correctly calculate the convolution, thus the call to unpremultiply(). Finally, we multiply the RGB values of the pixel by the coefficient for that pixel and add the product to the result. This process is repeated for each sampled location.
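To spell out what that means: a premultiplied pixel stores its color channels already scaled by alpha, so conceptually

\[
\mathrm{premultiply}(r,g,b,a) = (ar,\; ag,\; ab,\; a),
\qquad
\mathrm{unpremultiply}(R,G,B,a) = \Bigl(\tfrac{R}{a},\; \tfrac{G}{a},\; \tfrac{B}{a},\; a\Bigr) \quad (a > 0).
\]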



loc = vec2(0.0,1.0);
vec4 p01 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p01.rgb * r01;

loc = vec2(1.0,1.0);
vec4 p02 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p02.rgb * r02;

loc = vec2(-1.0,0.0);
vec4 p10 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p10.rgb * r10;

vec4 p11 = unpremultiply(
sample(src, (samplerCoord(src)) ));
result.rgb += p11.rgb * r11;
result.a = p11.a;


Notice here that I copy the alpha from the center input pixel to the result.



loc = vec2(1.0,0.0);
vec4 p12 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p12.rgb * r12;

loc = vec2(-1.0,-1.0);
vec4 p20 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p20.rgb * r20;

loc = vec2(0.0,-1.0);
vec4 p21 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p21.rgb * r21;

loc = vec2(1.0,-1.0);
vec4 p22 = unpremultiply(
sample(src, (samplerCoord(src) + loc) ));
result.rgb += p22.rgb * r22;

result = premultiply( result );
return result;
}

Finally, I premultiply() the result with the alpha value and return it. As you can see, this is a pretty straightforward procedure: grab the values for each input pixel in the matrix, multiply them by their respective coefficients, accumulate the results, and return.
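To connect this back to the Cocoa side: continuing the hypothetical ConvolutionFilter sketch from earlier in the post, a custom filter's outputImage method can hand the compiled kernel its sampler and coefficients using CIFilter's apply: method. This is just a sketch under those assumptions; packaging it properly as an Image Unit is a topic for another post.

// outputImage for the hypothetical ConvolutionFilter class sketched above.
// inputImage and inputR00 ... inputR22 are the filter's parameter ivars.
- (CIImage *)outputImage
{
    CISampler *src = [CISampler samplerWithImage:inputImage];

    // Pass the sampler and the nine coefficients in the order the kernel
    // declares them, plus an option describing the output's domain of definition.
    return [self apply:convolutionKernel, src,
            inputR00, inputR01, inputR02,
            inputR10, inputR11, inputR12,
            inputR20, inputR21, inputR22,
            kCIApplyOptionDefinition, [src definition],
            nil];
}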

If you want to download a copy of this kernel, it's available on my website: Convolution3by3.cikernel. In my next post I'll describe how to test the kernel using Quartz Composer, and also show how to apply this filter to live video as well as static images.