No. 3350: Convolution Field

First | Previous | 2018-07-21 | Next | Latest

Permanent URL: https://mezzacotta.net/garfield/?comic=3350

Strip by: David Morgan-Mar

{a colour image with dark horizontal stripes in the middle, and lighter horizontal and vertical stripes, running through patches of red, blue, green, and a hint of Garfield orange, like some sort of strange mathematically generated tartan}

The author writes:

Alien@System has kindly shown us the mathematical delights of cross-correlation on several occasions (SRoMG #2731, #2853, #2901). However, he has only hinted, in the notes for #2853, at the possibilities of using the related operation of convolution.

Convolution is, for many people, an easier operation to grasp in an intuitive sense than correlation. I use both operations frequently in my day job, as they arise in many applications in physics, particularly in optics and image processing.

A simple way to think about convolution is to consider the formation of an image by a camera lens. Normally, the goal of taking a photo is to focus the lens so that the object you're interested in is in focus, that is, it appears nice and sharp and clear. However, for various reasons, some parts of the image (or even all of the image) may not appear in focus, and instead appear blurry and fuzzy.

To get a feeling for how this comes about, imagine taking a photo of a tiny point of light - say a bright light coming through a tiny pinhole in a black material. If you focus the light accurately using a perfect lens, all of the light from the pinhole will fall onto a single pixel of a typical electronic imaging detector.

Now if you take that lens and camera and point them at a scene the same distance away as the pinhole was, everything should be in ultra-sharp focus. Every point in the image is reflecting (or emitting) light towards the camera. If you consider any given specific point in the scene, all the light from that one point converges through the lens onto a single pixel of the sensor. In other words, there is no blurring out of light from one point of the image across multiple sensor pixels.

Now let's defocus the lens a bit, perhaps by turning the focus dial. Going back to our single pinhole of light, when we defocus the lens, the light no longer converges into a single pixel. It spreads out into a tiny circle that may cover several pixels on our sensor. This circle contains the same amount of light, but spread over a larger area, so it looks dimmer at any given pixel. This circle is known as the point spread function of the lens (and the linked Wikipedia page has some good diagrams illustrating the concept). The point spread function doesn't actually have to be a circle - if the lens has imperfections or is dusty or whatever, the point spread function may be elliptical, or irregular in shape.

Now think about what happens if we move the camera to point at a scene. Every given point in the scene was previously focused onto a single pixel of our sensor, but now any given point is spread out in the same way as the pinhole, by the point spread function. Everywhere in the image is now fuzzy and blurry - and furthermore, all of these blur circles, or point spread functions, overlap one another. The light from any given point in the scene is spread out faintly into a bunch of surrounding pixels, but so is the light from other nearby points. So each pixel in the camera sensor ends up seeing the combination of light contributed by a group of pixels surrounding it, of the same area as the point spread function. In this way, the scene is now rendered onto the sensor as a blurry, out-of-focus image.

The process of adding up all of the contributions of light from nearby points in the scene, according to the point spread function - as if the point spread function were being applied repeatedly at every possible point in the scene - is mathematically what we call a convolution. Specifically, it is the convolution of the scene and the point spread function.
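This "stamp out a copy of the point spread function at every point" picture translates almost directly into code. Here's a minimal sketch in Python with NumPy (my choice of tooling for illustration, not anything used for the strip itself): for each scene point, a copy of the PSF scaled by that point's brightness is added into the output.

```python
import numpy as np

def convolve2d_direct(scene, psf):
    """Direct 2D convolution: stamp a copy of the PSF, scaled by the
    scene's brightness, at every scene position, and sum the overlaps."""
    sh, sw = scene.shape
    ph, pw = psf.shape
    out = np.zeros((sh + ph - 1, sw + pw - 1))
    for y in range(sh):
        for x in range(sw):
            out[y:y + ph, x:x + pw] += scene[y, x] * psf
    return out

# A single bright point convolved with a uniform 3x3 blur simply
# reproduces the blur, centred on the point - the defocused pinhole.
point = np.zeros((5, 5))
point[2, 2] = 1.0
blur = np.ones((3, 3)) / 9.0
result = convolve2d_direct(point, blur)
```

The explicit loop over scene pixels is for clarity only; for full-size images you would use an FFT-based routine instead, since direct convolution scales poorly.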

Much like multiplication, convolution is an operation that takes two inputs and produces a single output. In this case, the inputs are the scene and the point spread function, and the output is the out-of-focus image. Also like multiplication, convolution is commutative, meaning that it doesn't matter which input is the "scene" and which is the "point spread function" - it's actually completely symmetrical. You could imagine a "scene" consisting of a single blurry circle, and a "point spread function" that looks like a real world scene, and the resulting convolution would be exactly the same.
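That symmetry is easy to verify numerically. A quick check using SciPy (again, my tooling choice, purely for illustration):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
a = rng.random((4, 6))   # stand-in for the "scene"
b = rng.random((3, 3))   # stand-in for the "point spread function"

# Swapping the two inputs leaves the result unchanged.
same = np.allclose(convolve2d(a, b), convolve2d(b, a))
print(same)  # True
```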

So, what have I done here? Well, I simply took one Garfield strip and convolved it with another Garfield strip! You can think of the first strip as being the "scene" and the second the "point spread function". Or vice versa - it doesn't matter which way around you take them.

Actually, there's a bit of trickery involved to get this colour image. When it comes to visible light images, convolution applies as a function of wavelength. In other words, there is a "blue" point spread function, a "red" point spread function, a "green" one, and in fact potentially a different point spread function for every wavelength of light. For a well-designed lens, all of the colours will have very similar point spread functions, but in some lenses you can actually see the difference in blurriness of different colours - this is a form of chromatic aberration, which photographers will be familiar with as those undesirable coloured fringes around the edges of objects, often seen in the corners of photos.

Anyway, since Garfield strips are saved as RGB images, what I did was separate out the red, green, and blue channels of the two input strips, convolve the corresponding channels separately to produce red, green, and blue convolutions (shown below), and then recombine them as RGB channels to make the resulting colour image.
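A sketch of that channel-by-channel procedure, assuming the two strips have been loaded as H×W×3 floating-point arrays (the function name and the rescaling step are my own additions; FFT-based convolution is used because the inputs are full-size images):

```python
import numpy as np
from scipy.signal import fftconvolve

def convolve_strips(strip_a, strip_b):
    """Convolve two RGB images channel by channel."""
    # Convolve red with red, green with green, blue with blue.
    channels = [fftconvolve(strip_a[..., c], strip_b[..., c])
                for c in range(3)]
    out = np.stack(channels, axis=-1)
    # The raw sums are far outside display range, so rescale to 0-255.
    out = out - out.min()
    out = out * (255.0 / out.max())
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```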

Red convolution:

Green convolution:

Blue convolution:

I wish there was something particularly interesting and enlightening to say about the result of all this, and the insights it provides into the construction of Garfield comics and the mind of Jim Davis, but the best I can do is point out the structure caused by the three panel format and the borders of the panels, which produce the tartan-like stripes. Any other pertinent information to be extracted is left as an exercise for the reader.

[[Original strips: 2007-08-16, 2008-08-25.]]