An Introduction to High Dynamic Range Imaging
Adapted from The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG Artists (Rocky Nook)
By Christian Bloch
Computers are very different from our good buddy film. Computers are more like the IRS. All that matters are numbers, they operate based on an outdated book of rules, and if you make a mistake, they respond with cryptic messages. Everything you tell them has to be broken down into bite-sized pieces. That's what happens to images: pixel by pixel they get digitized into bits.
Let's talk about bits: A bit is the basic unit of information that all digital data is made of. It resembles a very simple switch that can be either on or off, black or white. There is no grey area in a bit's world. If we want some finer differentiation, we need to use more bits.
And this is how it works: With a single bit, we can count from 0 to 1. To count further, we have to add another bit, which again can be either on or off. That makes four different combinations: 00, 01, 10, 11. In human numbers that would be 0, 1, 2, 3. We have a slight advantage here because we don't run out of symbols as fast. Computers have to start a new bit all the time. For example, it takes a third bit to write eight different numbers: 0, 1, 10, 11, 100, 101, 110, 111. We would just count 0, 1, 2, 3, 4, 5, 6, 7. What looks like a 3-bit number to a computer would still fit within a single digit in our decimal system. However, the basic way of constructing higher counts is not that different at all. When we've used up all the symbols (0 through 9), we add a leading digit for the tens and start over cycling through the sequence of number symbols. It's just more tedious in binary, because there are only two symbols: 0 and 1. But with every new bit we add, we can count twice as far. Thus, the formula 2^(number of bits) is a shortcut to find out how many distinct values that amount of bits can represent.
To sum it up: Bits represent the digits of computer numbers. The more we have, the higher the number we can represent and thus the more unique values we have available.
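This counting rule is easy to verify in a few lines of Python (a hypothetical illustration, not part of the original text):

```python
# With n bits we can store 2**n distinct values,
# the highest number being 2**n - 1.
for bits in (1, 2, 3, 8):
    values = 2 ** bits
    print(f"{bits} bit(s): {values} values, counting from 0 to {values - 1}")
```

The last line of output confirms the figure used throughout this chapter: 8 bits yield 256 values, counting from 0 to 255.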
Great, that’s the math. But how about images? Traditionally, 24 bits are used to describe the color of a pixel. This is the general practice, laid out in the sRGB standard. All digital devices, from cameras to monitors and printers, are aware of it and support it. You might be tempted to call this a 24-bit image format—but wait! In reality, it’s just an 8-bit format. Those 24 bits are broken down into three channels, where 8 bits each are used to describe the intensity of the individual color channels red, green, and blue. Eight bits allow 2⁸ = 256 different values. So the darkest color possible is black with the RGB values (0,0,0) and the brightest is (255,255,255). To change the brightness of a pixel, you have to change all three channels simultaneously—otherwise, you also change the color. Effectively, each pixel can have only 256 different brightness levels. This is not even close to what our eye can see.
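The per-channel arithmetic can be sketched as follows; `brighten` is a hypothetical helper for illustration, not part of any sRGB API:

```python
# An 8-bit RGB pixel offers only 256 levels per channel. To brighten a
# pixel without shifting its hue, all three channels must move together,
# and every channel is clamped to the hard 0..255 range.
def brighten(pixel, amount):
    """Add `amount` to every channel, clamped to the 8-bit range 0..255."""
    return tuple(min(255, max(0, c + amount)) for c in pixel)

grey = (100, 100, 100)
print(brighten(grey, 50))   # (150, 150, 150): same hue, brighter
print(brighten(grey, 200))  # (255, 255, 255): clipped to pure white
```

The second call already hints at the problem discussed below: once a channel hits 255, any further brightness information is simply gone.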
Ironically, this format is labeled “truecolor”, a name that stems from an age when computers had less processing power than a modern cell phone and 256 values per channel were considered high end when compared to a fixed palette of 32 colors total.
Don't get confused by the different counting methods for bits. sRGB is 8 bit when you look at the individual channels. The other counting method adds all the channels up, giving 3 × 8 = 24 bit. Some call that cheating, but it makes sense when you want to indicate that an image has more than the three color channels. A common bonus is an alpha channel that describes a transparency mask. The short name for that kind of image would be RGBA, and if you sum up all four channels, you get 32 bits. But it's still 8 bits per channel. Still the same old not-so-true color format. To avoid all this confusing bit counting, I will stick to the per-channel notation and call this format 8-bit for the rest of the book, or low dynamic range (LDR).
Also notice that there is a limitation built right in: hard upper and lower limits. Nothing can be brighter than 255 white, and nothing can be darker than 0 black. There is just no way; the scale ends here.
Born to be seen through gamma goggles: Two hundred fifty-six levels are not much to begin with. But it gets even weirder: These levels are not evenly distributed. How come? Let’s see what happens to an image from digital camera capture to display on screen. First of all, digital sensors do not share the logarithmic response of film and our eyes. They just count photons, in a straight linear fashion.
This is what a scene looks like in linear space, just as an image sensor would see it: The image looks very dark and high contrast. In this case, it might have a certain appeal, but we can all agree that it looks far from natural. There are large black areas where we expect to see shadow details and several sudden jumps in brightness. It appears like an evening shot seen through sunglasses. But in my memory, this was a broad daylight scene. What happened?
Here are two diagrams to illustrate what our sensor has just captured. They show the identical curve, but with the scene luminance mapped in two different ways: Linear scale is what the physical luminance really is, with a steady increase of light. However, we like to think in terms of exposure, where each increasing EV is doubling the amount of light. This corresponds better to how we perceive light, and so the second diagram is in logarithmic scale. For comparison, the film curve is overlaid in grey here.
What this tells us is that half of the available values are taken up by the brightest EV. The remaining half is divided between the second-brightest EV and all the lower ones. Halve the remainder like that for every step down, and for a camera that captures 6 EVs in total, represented with just 256 values, you end up with only a handful of levels left for the EV containing the darkest shadow details. Most of the image ends up darker than we would see it, crunched together in fewer and fewer values. It's like cutting half off a birthday cake again and again, until only a tiny crumb is left for the last party guest.
That wouldn’t work well, would it?
To the rescue comes a little trick called gamma encoding. And this is where the trouble starts. Mathematically, the gamma is applied as a power law function, but that shouldn’t bother us right now. What’s important to remember is that gamma encoding has the effect of applying a very steep tone curve.
It looks approximately like this: The gamma curve pulls the tonal values up, strongest in the middle tones and weakest in the highlights. The result is a much more natural-looking distribution of tonal values. Most noticeably, the darker EVs are back—hurray. And the middle EVs stretch almost across the entire tonal range available.
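A minimal sketch of such a gamma curve, assuming the common power-law form with an exponent of roughly 1/2.2 (the real sRGB transfer function adds a small linear segment near black, omitted here):

```python
# Gamma encoding as a power law: it lifts the mid-tones strongly
# while barely touching the highlights.
def gamma_encode(linear, gamma=2.2):
    """Map a linear intensity in 0..1 to its gamma-encoded counterpart."""
    return linear ** (1.0 / gamma)

for linear in (0.01, 0.1, 0.5, 1.0):
    print(f"linear {linear:.2f} -> encoded {gamma_encode(linear):.2f}")
```

Note how a linear value of 0.5 lands at roughly 0.73 after encoding, while 1.0 stays at 1.0: exactly the "pull up the middle, leave the top alone" behavior described above.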
Note that this is not a creative process; it is done in the background without you noticing it. Gamma encoding is a built-in property of all LDR imagery, and it is necessary to make better use of our 256 digital values. It is a hack that distorts their allocation to light intensities, born out of the limitations of the digital Bronze Age.
With gamma encoding, our 8-bit image has been cheated to work nicely on a cathode ray tube (CRT) monitor. I don't want to bore you with more technical graphs. Let's just say that they are made for each other. CRT monitors can show all the nuances that 8-bit can deliver, and it's getting as close to human vision as this technology can get. Traditionally, other output devices are calibrated to this gamma value too. Printers, LCD monitors, television sets—they all expect their input feed to come with a gamma-distorted level distribution, and it is their own responsibility to turn that into an output that is pleasing to the eye. That is why we call 8-bit imagery an output-referred standard.
Here comes the catch: As soon as you want to edit a digital image, the pitfalls become quite apparent.
Especially critical are the shadows and middle tones. Brightening an image up will reveal that the gamma encoding has just pretended to preserve detail in those areas, just enough so we don’t see the difference on our monitor. The truth is, the lower EVs did not have many different digital values to begin with. Spreading them out across a larger set of values introduces nasty posterizing effects, most noticeable in the large blue-brown color fills that are supposed to be the shaded side of the mountain. Also, the histogram now looks like it just went through a bar rumble, exposing large gaps all over.
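The posterizing effect can be reproduced numerically; the factor of 4 used here is an arbitrary illustrative boost, not the actual curve behind the figures:

```python
# Brightening 8-bit shadow values spreads a few distinct levels across
# a wider range, leaving gaps in the histogram (posterization).
shadows = list(range(32))                    # 32 tightly packed shadow levels
brightened = [min(255, v * 4) for v in shadows]
print(sorted(set(brightened))[:6])           # only every 4th value is populated
print(f"{len(set(brightened))} levels now spread across 0..{max(brightened)}")
```

No new levels appear: the same 32 values now span 0 to 124, with three empty histogram bins between each pair of occupied ones. Those gaps are the comb-like spikes you see after such an adjustment.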
Now this adjustment was done with a very primitive tone curve. Yes, there are more advanced tools, using more sophisticated algorithms to reconstruct the intermediate tonal values. Such specialty tools might be more or less successful in filling the gaps, but in the end all they can do is guesswork. It’s very unlikely that they will discover new details that have not been there before. No new pebble or bush will emerge from the shadows.
The next thing to consider is that what were the upper stops before are now stacked up on the right side of the histogram. These tonal values have snapped together. Once again, there is no way to separate them now. We even pushed a lot of them to the maximum value of 255. Remember, this is a hard limit. Almost the entire sky is just white now, and the clouds are gone. Now and forever. We dumped the data.
Any subsequent change has to deal with an even smaller set of distinguishable values. For demonstration purposes, let's try to craft an inverse tone curve to bring the image back to what it was before.
OK, looks like our image can only be declared dead. Bye-bye fluffy clouds. What’s left in the sky is just some cyan fade, where the blue channel got clipped off before red and green. The tree looks like an impressionist painting, entirely made of color splotches. And our histogram just screams in pain. Apparently, we could not get the image back to the original that easily.
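The failed round trip can be sketched with a toy tone curve, a hypothetical 2× gain standing in for the much steeper curve used in the figures:

```python
# Push 8-bit values up a curve, clip at 255, then apply the inverse.
# The clipped highlights do not come back.
def curve(v, gain=2.0):
    """Brighten an 8-bit value, clipping at the hard ceiling of 255."""
    return min(255, int(v * gain))

def inverse(v, gain=2.0):
    """The exact mathematical inverse of the brightening step."""
    return int(v / gain)

original = [40, 100, 160, 200, 240]
round_trip = [inverse(curve(v)) for v in original]
print(original)     # [40, 100, 160, 200, 240]
print(round_trip)   # [40, 100, 127, 127, 127]
```

The lower values survive, but everything that hit the 255 ceiling collapses onto a single grey value on the way back. That is the dumped data: three formerly distinct tones, merged now and forever.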
Excerpted from The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG Artists by Christian Bloch. Copyright © 2007. Used with permission of Rocky Nook.