28. Stereo imaging

We see the world in 3D. What this means is that our visual system—comprising the eyes, optic nerves, and brain—takes the sensory input from our eyes and interprets it in a way that gives us the sensation that we exist in a three-dimensional world. This sensation is often called depth perception.

In practical terms, depth perception means that we are good at estimating the relative distances from ourselves to objects in our field of vision. You can tell if an object is nearer or further away just by looking at it (weird cases like optical illusions aside). A corollary of this is that you can tell the three-dimensional shape of an object just by looking at it (again, optical illusions aside). A basketball looks like a sphere, not like a flat circle. You can tell if a surface that you see is curved or flat.

To do this, our brain relies on various elements of visual data known as depth cues. The best-known depth cue is stereopsis: your brain interprets the different views seen by your left and right eyes, which are caused by the parallax effect, and innately uses this information to triangulate distances. You can easily observe parallax by looking at something in the distance, holding up a finger at arm’s length, and alternately closing your left and right eyes. In the view from each eye, your finger appears to move left/right relative to the background. And with both eyes open, if you focus on the background, you see two images of your finger. This tells your brain that your finger is much closer than the background.

Parallax effect

Illustration of parallax. The dog is closer than the background scene. Sightlines from your left and right eyes passing through the dog project to different areas of the background. So the views seen by your left and right eyes show the dog in different positions relative to the background. (The effect is exaggerated here for clarity.)
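To put rough numbers on the finger experiment, here is a minimal Python sketch of the triangulation geometry. The 65 mm eye separation, the 0.6 m finger distance, and the 50 m background distance are illustrative assumptions, as are the function names. This is plain geometry for a point straight ahead of you, not a model of what the brain actually computes.

```python
import math

EYE_SEPARATION_M = 0.065  # assumed typical spacing between human eyes


def parallax_angle_deg(baseline_m, distance_m):
    """Angle between the two sightlines to a point straight ahead."""
    return math.degrees(2.0 * math.atan((baseline_m / 2.0) / distance_m))


def distance_from_parallax(baseline_m, parallax_deg):
    """Invert the relation above: recover distance from the parallax angle."""
    half_angle = math.radians(parallax_deg) / 2.0
    return (baseline_m / 2.0) / math.tan(half_angle)


# Assumed distances: a finger at arm's length and a distant background.
finger_angle = parallax_angle_deg(EYE_SEPARATION_M, 0.6)
background_angle = parallax_angle_deg(EYE_SEPARATION_M, 50.0)

print(f"finger parallax:     {finger_angle:.2f} degrees")
print(f"background parallax: {background_angle:.3f} degrees")
```

The finger's parallax angle comes out many times larger than the background's, which is why the finger appears to jump sideways against the background when you swap eyes.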

We’ll discuss stereopsis in more detail below, but first it’s interesting to know that stereopsis is not the only depth cue our brains use. There are many physically different depth cues, and most of them work even with a single eye.

Cover one eye and look at the objects nearby, such as on your desk. Reach out and try to touch them gently with a fingertip, as a test for how well you can judge their depth. For objects within an easy hand’s reach you can probably do pretty well; for objects you need to stretch to touch you might do a little worse, but possibly not as bad as you thought you might. The one eye that you have looking at nearby things needs to adjust the focus of its lens in order to keep the image focused on the retina. Muscles in your eye squeeze the lens to change its shape, thus adjusting the focus. Nerves send these muscle signals to your brain, which subconsciously uses them to help gauge distance to the object. This depth cue is known as accommodation, and is most accurate within a metre or two, because it is within this range that the greatest lens adjustments need to be made.

With one eye covered, look at objects further away, such as across the room. You can tell that some objects are closer and other objects further away (although you may have trouble judging the distances as accurately as if you used both eyes). Various cues are used to do this, including:

Perspective: Many objects in our lives have straight edges, and we use the convergence of straight lines in visual perspective to help judge distances.

Relative sizes: Objects that look smaller are (usually) further away. This is more reliable if we know from experience that certain objects are the same size in reality.

Occultation: Objects partially hidden behind other objects are further away. It seems obvious, but it’s certainly a cue that our brain uses to decide which object is nearer and which further away.

Texture: The texture on an object is more easily discernible when it is nearer.

Light and shadow: The interplay of light direction and the shading of surfaces provides cues. A featureless sphere such as a cue ball still looks like a sphere rather than a flat disc because of the gradual change in shading across the surface.

Shaded circle

A circle shaded to present the illusion that it is a sphere, using light and shadow as depth cues. If you squint your eyes so your screen becomes a bit fuzzy, the illusion of three dimensionality can become even stronger.

Motion parallax: With one eye covered, look at an object 2 or 3 metres away. You have some perception of its distance and shape from the above-mentioned cues, but not as much as if both your eyes were open. Now move your head from side to side. The addition of motion produces parallax effects as your eye moves and your brain integrates that information into its mental model of what you are seeing, which improves the depth perception. Pigeons, chickens, and some other birds have limited binocular vision due to their eyes being on the sides of their heads, and they use motion parallax to judge distances, which is why they bob their heads around so much.

Motion parallax animation

Demonstration of motion parallax. You get a strong sense of depth in this animation, even though it is presented on your flat computer screen. (Creative Commons Attribution 3.0 Unported image by Nathaniel Domek, from Wikimedia Commons.)

There are some other depth cues that work with a single eye as well – I don’t want to try to be exhaustive here.

If you uncover both eyes and look at the world around you, your sense of three dimensionality becomes stronger. Now instead of needing motion parallax, you get parallax effects simply by looking with two eyes in different positions. Stereopsis is one of the most powerful depth cues we have, and it can often be used to override or trick the other cues, giving us a sense of three-dimensionality where none exists. This is the principle behind 3D movies, as well as 3D images printed on flat paper or displayed on a flat screen. The trick is to have one eye see one image, and the other eye see a slightly different image of the same scene, from an appropriate parallax viewpoint.

In modern 3D movies this is accomplished by projecting two images onto the screen simultaneously through two different polarising filters, with the planes of polarisation oriented at 90° to one another. The glasses we wear contain matched polarising filters: the left eye filter blocks the right eye projection while letting the left eye projection through, and vice versa for the right eye. The result is that we see two different images, one with each eye, and our brains combine them to produce the sensation of depth.

Another important binocular depth cue is convergence. To look at an object nearby, your eyes have to point inwards so they are both focused on the same point. For an object further away, your eyes look more parallel. Like your lenses, the muscles that control this send signals to your brain, which it interprets as a distance measure. Convergence can be a problem with 3D movies and images if the image creator is not careful. Although stereopsis can provide the illusion of depth, if it’s not also matched with convergence then there can be conflicting depth cues to your brain. Another factor is that accommodation tells you that all objects are at the distance of the display screen. The resulting disconnects between depth cues are what make some people feel nauseated or headachy when viewing 3D images.
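The conflict can be made concrete with some similar-triangles geometry. The sketch below is a hedged illustration, not anything taken from a film production pipeline. It assumes eyes 65 mm apart looking straight at a flat screen, and the function name and example distances are chosen purely for illustration. If the left-eye and right-eye images of a point are separated on the screen by a disparity d, the sightlines cross at a distance Z = e·D / (e − d), where e is the eye separation and D is the distance to the screen.

```python
EYE_SEPARATION_M = 0.065  # assumed typical spacing between human eyes


def perceived_distance(eye_sep_m, screen_distance_m, disparity_m):
    """Distance at which the two sightlines cross.

    disparity_m > 0: right-eye image is to the right of the left-eye image
                     (the point appears behind the screen).
    disparity_m < 0: the images are swapped over (the point appears in
                     front of the screen).
    """
    if disparity_m >= eye_sep_m:
        # Sightlines are parallel or diverging: the point is at infinity.
        return float("inf")
    return eye_sep_m * screen_distance_m / (eye_sep_m - disparity_m)


# Same physical disparity of 1 cm on two assumed screens.
tv_depth = perceived_distance(EYE_SEPARATION_M, 2.0, 0.01)
cinema_depth = perceived_distance(EYE_SEPARATION_M, 15.0, 0.01)

print(f"TV at 2 m:      eyes converge at {tv_depth:.2f} m, but focus at 2 m")
print(f"Cinema at 15 m: eyes converge at {cinema_depth:.2f} m, but focus at 15 m")
```

In both cases the eyes converge on one distance while accommodation stays locked to the screen distance, which is the mismatch described above.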

To create 3D images using stereopsis, you need to have two images of the same scene, as seen from different positions. One method is to have two cameras side by side. This can be used for video too, and is the method used for live 3D broadcasts, such as sports. Interestingly, however, this is not the most common method of making 3D movies.

Coronet 3D camera

A 3D camera produced by the Coronet Camera Company. Note the two lenses at the front, separated by roughly the same spacing as human eyes. (Creative Commons Attribution 3.0 Unported image by Wikimedia Commons user Bilby, from Wikimedia Commons.)

3D movies are generally shot with a single camera, and then an artificial second image is made for each frame during the post-production phase. This is done by a skilled 3D artist, who uses software to model the depths to various objects in each shot, manipulates the pixels of the image by shifting them left or right by different amounts, and paints in any areas where pixel shifts leave blank pixels behind. The reason it’s done this way is that this gives the artist control over how extreme the stereo depth effect is, and this can be manipulated to make objects appear closer or further away than they were during shooting. It’s also necessary to match depth disparities of salient objects between scenes on either side of a scene cut, to avoid the jarring effect of the main character or other objects suddenly popping backwards and forwards across scene cuts. Finally, the depth disparity pixel shifts required for cinema projection are different to the ones required for home video on a TV screen, because of the different viewing geometries. So a high quality 3D Blu-ray of a movie will have different depth disparities to the cinematic release. Essentially, construction of the “second eye” image is a complex artistic and technical consideration of modern film making, which cannot simply be left to chance by shooting with two cameras at once. See “Nonlinear disparity mapping for stereoscopic 3D” by Lang et al.[1], for example, which discusses these issues in detail.
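The pixel-shifting idea can be shown with a toy example on a single row of pixels. This is only a sketch under simplifying assumptions: it is not the software that 3D artists use, nor the method of Lang et al. The function name, the integer disparities, and the crude hole filling (copying the nearest background pixel from the right) are all hypothetical choices made for illustration.

```python
def synthesize_second_view(row, disparity):
    """Build a second-eye row by shifting each pixel left by its disparity.

    row:       list of pixel values for one image row
    disparity: list of non-negative integer shifts, larger = nearer object
    """
    width = len(row)
    out = [None] * width          # None marks a blank pixel (a hole)
    out_disp = [-1] * width       # disparity of the pixel currently stored

    for x in range(width):
        target = x - disparity[x]
        # Nearer pixels (larger disparity) win when two pixels collide.
        if 0 <= target < width and disparity[x] > out_disp[target]:
            out[target] = row[x]
            out_disp[target] = disparity[x]

    # "Paint in" holes from the right-hand neighbour, which is background
    # when foreground objects have been shifted to the left.
    fill = None
    for x in range(width - 1, -1, -1):
        if out[x] is None:
            out[x] = fill
        else:
            fill = out[x]

    # Any holes left at the right edge are filled from the left instead.
    fill = None
    for x in range(width):
        if out[x] is None:
            out[x] = fill
        else:
            fill = out[x]
    return out


# A foreground object (F3, F4) in front of a background (b0..b7).
row = ["b0", "b1", "b2", "F3", "F4", "b5", "b6", "b7"]
disparity = [0, 0, 0, 2, 2, 0, 0, 0]
second_view = synthesize_second_view(row, disparity)
print(second_view)
```

The foreground pixels move two places left, covering some background, and the gap they leave behind has to be filled with something plausible. In real footage that fill is the part that needs an artist's judgement.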

For a still photo, however, shooting with two cameras at the same time is the best method. And for scientific shape measurement using stereographic imaging, two cameras taking real images are necessary. One application of this is satellite terrain mapping.

The French space agency CNES launched the SPOT 1 satellite in 1986 into a sun-synchronous polar orbit, meaning it orbits around the poles and maintains a constant angle to the sun, as the Earth rotates beneath it. This brought any point on the surface into the imaging field below the satellite every 26 days. SPOT 1 took multiple photos of areas of Earth in different orbital passes, from different locations in space. These images could then be analysed to match features and triangulate the distances to points on the terrain, essentially forming a stereoscopic image of the Earth’s surface. This reveals the height of topographic features: hills, mountains, and so on. SPOT 1 was the first satellite to produce directly imaged stereo altitude data for the Earth. It was later joined and replaced by SPOT 2 through 7, as well as similar imaging satellites launched by other countries.

Diagram of satellite stereo imaging

Diagram illustrating the principle of satellite stereo terrain mapping. As the satellite orbits Earth, it takes photos of the same region of Earth from different positions. These are then triangulated to give altitude data for the terrain. (Background image is a public domain photo from the International Space Station by NASA. Satellite diagram is a public domain image of GOES-8 satellite by U.S. National Oceanic and Atmospheric Administration.)
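As a rough illustration of the triangulation step, here is a minimal Python sketch of the textbook relation between terrain height and stereo parallax. It assumes a flat reference plane and two views taken from opposite sides of the target, each tilted away from vertical by a known look angle. A point at height h then appears displaced along the ground by h·tan(angle) in each image, in opposite directions. The function name and the example angles are hypothetical, and real processing of SPOT imagery is considerably more involved.

```python
import math


def height_from_parallax_m(parallax_m, look_angle_1_deg, look_angle_2_deg):
    """Terrain height from the ground-plane parallax between two views.

    parallax_m: how far apart the same feature appears in the two images,
                measured on the flat reference plane.
    look angles: tilt of each view from vertical, on opposite sides.
    """
    spread = (math.tan(math.radians(look_angle_1_deg))
              + math.tan(math.radians(look_angle_2_deg)))
    if spread <= 0:
        raise ValueError("the two views must look from different directions")
    return parallax_m / spread


# Assumed example: views 20 degrees either side of vertical, and a hilltop
# that appears shifted by 100 m between the two images.
hill_height = height_from_parallax_m(100.0, 20.0, 20.0)
print(f"estimated height: {hill_height:.0f} m")
```

The wider the angle between the two views, the larger the parallax for a given height, and so the more precisely the height can be measured.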

Now, if we’re taking photos of the Earth and using them to calculate altitude data, how important is the fact that the Earth is spherical? If you look at a small area, say a few city blocks, the curvature of the Earth is not readily apparent and you can treat the underlying terrain as flat, with modifications by strictly local topography, without significant error. But as you image larger areas, getting up to hundreds of kilometres, the 3D shape revealed by the stereo imaging consists of the local topography superimposed on a spherical surface, not on a flat plane. If you don’t account for the spherical baseline, you end up with progressively larger altitude errors as your imaged area increases.

A research paper on the mathematics of registering stereo satellite images to obtain altitude data includes the following passage[2]:

Correction of Earth Curvature

If the 3D-GK coordinate system X, Y, Z and the local Cartesian coordinate system Xg, Yg, Zg are both set with their origins at the scene centre, the difference in Xg and X or Yg and Y will be negligible, but for Z and Zg [i.e. the height coordinates] the difference will be appreciable as a result of Earth curvature. The height error at a ground point S km away from the origin is given by the well-known expression:

ΔZ = Y²/2R km

Where R = 6367 km. This effect amounts to 67 m in the margin of the SPOT scene used for the reported experiments.

The size of the test scene was 50×60 km. The height error grows as the square of the distance from the scene centre, so at this scale you get altitude errors of up to 67 metres if you assume the Earth is flat, which is a large error!
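The arithmetic is easy to check. The following Python sketch evaluates the quoted approximation (distance squared divided by twice the Earth's radius) alongside the exact drop of a sphere below its tangent plane, using the paper's R = 6367 km. The function names are my own labels for illustration, and the 30 km test distance is an assumption based on half the longer side of the 50×60 km scene.

```python
import math

EARTH_RADIUS_KM = 6367.0  # value quoted in the paper


def height_error_approx_km(distance_km, radius_km=EARTH_RADIUS_KM):
    """The quoted approximation: distance squared over twice the radius."""
    return distance_km ** 2 / (2.0 * radius_km)


def height_error_exact_km(distance_km, radius_km=EARTH_RADIUS_KM):
    """Exact drop of a sphere below its tangent plane at this distance."""
    return radius_km - math.sqrt(radius_km ** 2 - distance_km ** 2)


def distance_for_error_km(error_km, radius_km=EARTH_RADIUS_KM):
    """Distance from the scene centre at which the error reaches error_km."""
    return math.sqrt(2.0 * radius_km * error_km)


edge_error_m = 1000.0 * height_error_approx_km(30.0)
distance_for_67m = distance_for_error_km(0.067)

print(f"error 30 km from centre: {edge_error_m:.1f} m")
print(f"distance giving 67 m:    {distance_for_67m:.1f} km")
```

The quoted 67 m corresponds to a point a little under 30 km from the scene centre, which is consistent with the margin of a scene roughly 60 km across. At these distances the approximation and the exact expression agree to well under a millimetre.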

Another paper compares the mathematical solution of stereo satellite altitude data to that of aerial photography (from a plane)[3]:

Some of the approximations used for handling usual aerial photos are not acceptable for space images. The mathematical model is based on an orthogonal coordinate system and perspective image geometry. […] In the case of direct use of the national net coordinates, the effect of the earth curvature is respected by a correction of the image coordinates and the effect of the map projection is neglected. This will lead to [unacceptable] remaining errors for space images. […] The influence of the earth curvature correction is negligible for aerial photos because of the smaller flying height Zf. For a [satellite] flying height of 300 km we do have a scale error of the ground height of 1:20 or 5%.

So the terrain mappers using stereo satellite data need to be aware of and correct for the curvature of the Earth to get their data to come out accurately.

Terrain mapping is done on relatively small patches of Earth. But we’ve already seen, in our first proof, photos of Earth taken from far enough away that you can see (one side of) the whole planet, such as the Blue Marble photo. Can we do one better, and look at two photos of the Earth taken from different positions at the same time? Yes, we can!

The U.S. National Oceanic and Atmospheric Administration operates the Geostationary Operational Environmental Satellite (GOES) system, manufactured and launched by NASA. Since 1975, NASA has launched 17 GOES satellites, the last four of which are currently operational as Earth observation platforms. The GOES satellites are in geostationary orbit 35790 km above the equator, positioned over the Americas. GOES-16 is also known as GOES-East, providing coverage of the eastern USA, while GOES-17 is known as GOES-West, providing coverage of the western USA. This means that these two satellites can take images of Earth at the same time from two slightly different positions (“slightly” here means a few thousand kilometres).
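For the curious, the geostationary altitude and the size of the stereo baseline can be estimated with a few lines of Python. This is a sketch under stated assumptions: the constants are standard published values for Earth's gravitational parameter, sidereal day, and equatorial radius (not figures from the text above), the function names are hypothetical, and the 15° longitude separation is an assumed example rather than the actual spacing of any particular pair of satellites.

```python
import math

GM_EARTH_KM3_S2 = 398600.4418   # Earth's gravitational parameter
SIDEREAL_DAY_S = 86164.0905     # one rotation of Earth relative to the stars
EQUATORIAL_RADIUS_KM = 6378.137


def geostationary_radius_km():
    """Orbital radius whose period equals one sidereal day (Kepler's third law)."""
    return (GM_EARTH_KM3_S2 * SIDEREAL_DAY_S ** 2 / (4.0 * math.pi ** 2)) ** (1.0 / 3.0)


def baseline_km(longitude_separation_deg):
    """Straight-line distance between two geostationary satellites."""
    half_angle = math.radians(longitude_separation_deg) / 2.0
    return 2.0 * geostationary_radius_km() * math.sin(half_angle)


altitude_km = geostationary_radius_km() - EQUATORIAL_RADIUS_KM
example_baseline_km = baseline_km(15.0)  # assumed separation, for illustration

print(f"geostationary altitude: {altitude_km:.0f} km")
print(f"baseline for 15 degrees: {example_baseline_km:.0f} km")
```

The computed altitude agrees with the 35790 km figure to within rounding, and satellites parked even a modest number of degrees apart along the orbit are separated by thousands of kilometres.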

This means we can get stereo views of the whole Earth. We could in principle use this to calculate the shape of the Earth by triangulation using some mathematics, but there’s an even cooler thing we can do. If we view a GOES-16 image with our right eye, while viewing a GOES-17 image taken at the same time with our left eye, we can get a 3D view of the Earth from space. Let’s try it!

The following images show cross-eyed and parallel viewing pairs for GOES-16/GOES-17 images. Depending on your ability to deal with these images, you should be able to view a stereo 3D image of Earth. (Cross-eyed stereo viewing seems to be the most popular method on the Internet, but personally I’ve never been able to get it to work for me, whereas I find the parallel method fairly easy. I find it works best if I put my face very close to the screen to lock onto the initial image fusion, and then slowly pull my head backwards. Another option if you have a VR viewer for your phone, like Google Cardboard, is to load the parallel image onto your phone and view it with your VR viewer.)

GOES stereo image, cross-eyed

Stereo pair images of Earth from NASA’s GOES-16 (left) and GOES-17 (right) satellites taken at the same time on the same date, 1400 UTC, 12 July 2018. This is a cross-eyed viewing pair: to see the 3D image, cross your eyes until three images appear, and focus on the middle image. It will probably be easier if you reduce the size of the image on your screen using your browser’s zoom function. (Public domain image by NASA, from [4].)

GOES stereo image, parallel

The same stereo pair presented with GOES-16 view on the right and GOES-17 on the left. This is a parallel viewing pair: to see the 3D image relax your eyes so the left eye views the left image and the right eye views the right image, until three images appear, and focus on the middle image. It will probably be easier if you reduce the size of the image on your screen using your browser’s zoom function. (Public domain image by NASA, from [4].)

Unfortunately these images are cropped, but if you managed to get the 3D viewing to work, you will have seen that your brain automatically does the distance calculation thing as it would with a real object, and you can see for yourself with your own eyes that the Earth is rounded, not flat.

I’ve saved the best for last. The Japan Meteorological Agency operates the Himawari-8 weather satellite, and the Korea Meteorological Administration operates the GEO-KOMPSAT-2A satellite. Again these are both in geostationary orbits above the equator, this time placed so that Himawari-8 has the best view of Japan, while GEO-KOMPSAT-2A has the best view of Korea, situated slightly to the west. And here I found uncropped whole Earth images from these two satellites taken at the same time, presented again as cross-eyed and then parallel viewing pairs:

Himawari-KOMPSAT stereo image, cross-eyed

Stereo pair images of Earth from Japan Meteorological Agency’s Himawari-8 (left) and Korea Meteorological Administration’s GEO-KOMPSAT-2A (right) satellites taken at the same time on the same date, 0310 UTC, 26 January 2019. This is a cross-eyed viewing pair. (Image reproduced and modified from [5].)

Himawari-KOMPSAT stereo image, parallel

The same stereo pair presented with Himawari-8 view on the right and GEO-KOMPSAT-2A on the left. This is a parallel viewing pair. (Image reproduced and modified from [5].)

For those who have trouble with free stereo viewing, I’ve also turned these photos into a red-cyan anaglyphic 3D image, which is viewable with red-cyan 3D glasses (the most common sort of coloured 3D glasses).

Himawari-KOMPSAT stereo image, anaglyph

The same stereo pair rendered as a red-cyan anaglyph. The stereo separation of the viewpoints is rather large, so it may be difficult to see the 3D effect at full size – it should help to reduce the image size using your browser’s zoom function, place your head close to your screen, and gently move side to side until the image fuses, then pull back slowly.
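For anyone who wants to make their own, the usual recipe for a red-cyan anaglyph is to take the red channel from the left-eye image and the green and blue channels from the right-eye image. The Python sketch below shows that recipe on images stored as nested lists of (red, green, blue) tuples. It is an illustration of the common convention only, not necessarily how the image above was produced, and the function name and the tiny example images are hypothetical.

```python
def make_anaglyph(left_image, right_image):
    """Combine two same-sized RGB images into a red-cyan anaglyph.

    Each image is a list of rows, and each row is a list of (r, g, b) tuples.
    The red filter (left eye) sees the left image's red channel; the cyan
    filter (right eye) sees the right image's green and blue channels.
    """
    if len(left_image) != len(right_image):
        raise ValueError("images must have the same number of rows")
    anaglyph = []
    for left_row, right_row in zip(left_image, right_image):
        if len(left_row) != len(right_row):
            raise ValueError("rows must have the same number of pixels")
        anaglyph.append([
            (left_px[0], right_px[1], right_px[2])
            for left_px, right_px in zip(left_row, right_row)
        ])
    return anaglyph


# Tiny made-up 1x2 images, just to show the channel mixing.
left = [[(10, 20, 30), (200, 210, 220)]]
right = [[(40, 50, 60), (100, 110, 120)]]
combined = make_anaglyph(left, right)
print(combined)
```

Because each eye only receives part of the colour information, anaglyphs work best on images without strongly saturated reds or cyans.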

Hopefully you managed to get at least one of these 3D images to work for you (unfortunately some people find viewing stereo 3D images difficult). If you did, well, I don’t need to point out what you saw. The Earth is clearly, as seen with your own eyes, shaped like a sphere, not a flat disc.

References:

[1] Lang, M., Hornung, A., Wang, O., Poulakos, S., Smolic, A., Gross, M. “Nonlinear disparity mapping for stereoscopic 3D”. ACM Transactions on Graphics, 29 (4), p. 75-84. ACM, 2010. http://dx.doi.org/10.1145/1833349.1778812

[2] Hattori, S., Ono, T., Fraser, C., Hasegawa, H. “Orientation of high-resolution satellite images based on affine projection”. International Archives of Photogrammetry and Remote Sensing, 33(B3/1; PART 3) p. 359-366, 2000. https://www.isprs.org/proceedings/Xxxiii/congress/part3/359_XXXIII-part3.pdf

[3] Jacobsen, K. “Geometric aspects of high resolution satellite sensors for mapping”. ASPRS The Imaging & Geospatial Information Society Annual Convention 1997 Seattle. 1100(305), p. 230, 1997. https://www.ipi.uni-hannover.de/uploads/tx_tkpublikationen/jac_97_geom_hrss.pdf

[4] CIMSS Satellite blog, Space Science and Engineering Center, University of Wisconsin-Madison, “Stereoscopic views of Convection using GOES-16 and GOES-17”. 2018-07-12. https://cimss.ssec.wisc.edu/goes/blog/archives/28920 (accessed 2019-09-26).

[5] CIMSS Satellite blog, Space Science and Engineering Center, University of Wisconsin-Madison, “First GEOKOMPSAT-2A imagery (in stereo view with Himawari-8)”. 2019-02-04. https://cimss.ssec.wisc.edu/goes/blog/archives/31559 (accessed 2019-09-26).

Colour naming experiment, part 2

A couple of months ago I wrote about a colour naming experiment that I was planning to perform with the students in the Science Club that I volunteer to teach at a local primary school. You may want to go back and review that post, as today I’m going to talk about the results of the experiment.

I go back to teach the Science Club again next Monday, so it was time to sit down and analyse the results. I went through the answer sheets that the children filled in (there were 12 of them; one of the students was sick that day) and typed the names of each colour from each child into a spreadsheet. I thought it could accumulate the totals and make pie charts for me, but I discovered that I needed to manipulate the data first using a COUNT() function or something. While I was pondering whether to do this or to export all the data to CSV and write a Python program to do the gruntwork, one of my friends pointed me at this pertinent xkcd comic.

That inspired me to do all the processing in Python, and I discovered to my pleasant surprise that my machine already had the matplotlib library installed, so I could produce pie charts directly from Python. (Without sucking the munged data back into a spreadsheet again to do the graphs as I feared I might have to do.) Anyway, long story short, here are the results (click the image for a huge readable version):

lots of pie charts

[I should point out that of course the colours in this image as displayed on your computer screen are not exactly the same as the colours printed on the paint sample charts that I assembled and gave to the children, because of the vagaries of colour calibration of monitors and the limited colour gamut of the graphic file format. Consider them only an approximation of what the children actually saw.]
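The kind of processing described above can be sketched in a few lines of Python. To be clear, this is not the actual script used for the charts: it is a minimal illustration that assumes a CSV layout with one row per student and one column per colour number, and the function names and the sample data are made up. Blank cells are counted as “NONE”.

```python
import csv
import io
from collections import Counter


def tally_names(csv_text):
    """Count how many students gave each name to each colour."""
    tallies = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for colour_id, name in row.items():
            if colour_id == "student":
                continue
            label = (name or "").strip() or "NONE"  # blank answer
            tallies.setdefault(colour_id, Counter())[label] += 1
    return tallies


def plot_pies(tallies, filename="pies.png"):
    """Draw one pie chart per colour (needs matplotlib installed)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(tallies),
                             figsize=(4 * len(tallies), 4), squeeze=False)
    for ax, (colour_id, counts) in zip(axes[0], tallies.items()):
        ax.pie(list(counts.values()), labels=list(counts.keys()))
        ax.set_title(f"Colour {colour_id}")
    fig.savefig(filename)
    plt.close(fig)


# Made-up sample data for three students and two colours.
SAMPLE_CSV = (
    "student,1,2\n"
    "A,Lavender,Olive\n"
    "B,Lavender,TREE GREEN\n"
    "C,Purple,\n"
)

tallies = tally_names(SAMPLE_CSV)
print(tallies)
```

Calling `plot_pies(tallies)` would then write the charts to an image file. The counting is kept separate from the plotting so that the totals can be inspected without matplotlib.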

That’s a lot to digest. Here are some highlights:

Firstly, here are the colours for which the largest number of people agreed on the name:

most agreed colours

Out of 12 people, three colours had 7 of them agree on what the colour should be called, and one colour had 6 people agree. There was no colour in the entire sample for which a 2/3 majority agreed on the name, let alone anything approaching unanimity. 31 of the 35 colours sampled had fewer than half of the people agree on the name of the colour.

At the other end of the spectrum (ha ha!), here are the colours that had the most different names assigned:

most disagreed colours

Four colours had, in a sample of just 12 people, nine different colour names assigned to them. Three of these colours also had one or two students unable to decide on a name in the time allowed, and they left it blank on the answer sheet.
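These summary figures are simple to compute once the names are tallied. Here is a small Python sketch of the measures used above: the most popular name, whether it reaches half or a 2/3 majority, and how many different names were given. The function name is hypothetical and the example counts are invented for illustration. They are not the real data.

```python
from collections import Counter
from fractions import Fraction


def agreement_summary(counts):
    """Summarise how well a group agreed on the name of one colour."""
    total = sum(counts.values())
    top_name, top_votes = counts.most_common(1)[0]
    share = Fraction(top_votes, total)
    return {
        "top_name": top_name,
        "top_votes": top_votes,
        # Blank answers ("NONE") are not counted as a name.
        "distinct_names": len([n for n in counts if n != "NONE"]),
        "at_least_half": share >= Fraction(1, 2),
        "two_thirds_majority": share >= Fraction(2, 3),
    }


# Invented example: 12 students, 7 of whom agree on one name.
example = Counter({"Sky blue": 7, "Turquoise": 2, "Aqua": 1,
                   "OCEAN": 1, "NONE": 1})
summary = agreement_summary(example)
print(summary)
```

With 12 students, a 2/3 majority needs 8 matching answers, so even the best case of 7 in agreement falls just short.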

I should point out that names that were on the answer sheet are written in lower case with an initial capital, while names that the students chose to write in are written in all-capitals, and “NONE” indicates a student who didn’t give that colour any name. I gave them what I thought was a generous amount of time, but some of the students complained that it was too difficult and obviously struggled to complete the task. I did ask them beforehand if any of them knew they were colourblind, and none of them did. While there are two or three somewhat bizarre names assigned (“brown” for the colour that most kids identified as “lavender” for example), I don’t see any real evidence that any of them are indeed colourblind (confusing reds and greens, for example).

Another thing you’ll notice if you examine the large image of all the pie charts is that the same colour word is used for several different colours, many times over. For example, “olive” is used to describe three different shades of green, as is “tree green”, while “carrot” is used to describe three different shades of orange, “turquoise” is used for three different shades of blue, and so on.

The conclusion from all of this? This basically confirms the research findings that I quoted in the first post on this experiment – that people are incredibly inconsistent when it comes to naming colours. If you say “olive”, or “carrot”, or “turquoise”, people have a reasonable general idea what sort of colour you mean, but many will not be thinking of the same shade of colour that you are, and will fail to pick it out of a line-up.

The second part of the experiment, showing that people are inconsistent with themselves, would require me to ask the children to do this entire task a second time. I was planning on doing this, but given how much some of them complained about it the first time, I think I’ll spare them doing it again, and do something a bit more fun with them instead. Hopefully however, when I show them the results on Monday they’ll think it’s pretty amazing and cool, like I do.

Colour naming experiment

Firstly, sorry for the delay in getting a new proof written. I’ve been travelling, and then got sick on the flight home and was mostly incapacitated for two weeks. And I have deadlines for other stuff that then got in the way.

But one of those deadlines also involves science, and it’s pretty cool so I thought I’d share it with you. I do volunteer work with CSIRO’s STEM Professional in Schools program. As a professional scientist, I am partnered with a primary school and visit the school several times a year to talk to and engage the students with science topics. In past years I’ve mostly done presentations and Q&A sessions, but this year the school science coordinator suggested running a science club with some of the keenest science students from each year.

My Science Club is made up of 13 students from years 2 to 6 (so ages 7 to 11). I’m running several experiments with them throughout the year. One of them is actually Eratosthenes’ method of measuring the size of the Earth, modified slightly. I’m getting the kids to measure the length of a vertical stick’s shadow every day at noon. At the end of the year I’ll help them plot the length versus day of the year, and we’ll fit a sine curve and extract the parameters to let us calculate the size of the Earth.

This Monday, I have another Science Club meeting, and I’ve been preparing a different experiment, on colour perception and naming. This is a cool topic that I’ve been interested in ever since I attended an imaging conference and saw some talks about the psychophysics and cultural psychology of colour perception. What I’ve done is to visit a local hardware store and raid their set of house paint sample brochures. Then I cut them up:

Cutting up paint brochures

I had way too many colours, many of which were very similar to others, so I selected a representative subset to try and span as much of the colour space as I could. Then I arranged them and used double sided tape to stick them into manila folders:

Sticking samples into folders

A couple of hours later, and I had 13 folders with identically laid out colour swatches inside:

Colour swatch folders

I used a marker to label all the swatches in each folder with a number. There are 35 swatches:

The 35 colour swatches

Now, here’s the experiment: On Monday in Science Club I’ll give each of the students one of the folders. I’ll also give them a list of potential colour names, with over 100 possible colour names on it:

List of colour names

Their task is to look at each colour, decide which name is the best name for it, and write the colour number on the sheet next to that name. Repeat for all 35 colours. So a lot of the names are going to be left unused. And I’ve included a few write-in slots for any cases where a student is positive that a certain colour really must be called “nasty bruise” or whatever. I’ve been careful to pick names that young children can relate to, and avoid weird things like “heliotrope” and “malachite” that they’ve probably never heard.

The science behind this experiment is that we’re all pretty good and consistent at naming very basic colours like red, and yellow, and blue, but when it comes to naming more subtle shades we are actually highly inconsistent. Is that particular shade of red: rose red, or raspberry, or cherry, or something else? Ask a lot of people and you’ll get a lot of different answers. There are classic studies showing this. (And yes, Randall Munroe of xkcd did a similar thing online a while back and published the results.)

There’s also a study showing that people are inconsistent with themselves, if given exactly the same task a few weeks later. Nearly everyone changes their mind on what certain shades should be called. So this is my experiment with my Science Club! I’m not going to tell the kids that we’ll be repeating this task later in the year. It’ll be interesting to see how closely they can reproduce their own results then, and also how closely their answers align with one another.

Basically, I’m doing something that happens with all good science. I’m replicating an experiment to see if I can reproduce the results. And now that I have this experiment ready to go, I’ll get on to writing up another proof that the Earth is a globe… hopefully within the next few days.