The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG Artists

An Introduction to High Dynamic Range Imaging

Adapted from The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG Artists (Rocky Nook)

By Christian Bloch

Computers are very different from our good buddy film. Computers are more like the IRS. All that matters are numbers, they operate based on an outdated book of rules, and if you make a mistake, they respond with cryptic messages. Everything you tell them has to be broken down into bite-sized pieces. That's what happens to images: pixel by pixel they get digitized into bits.

Let's talk about bits: A bit is the basic information unit that all digital data is made of. It resembles a very simple switch that can be either on or off, black or white. There is no grey area in a bit's world. If we want some finer differentiation, we need to use more bits.

A bit resembles a simple on/off switch.

And this is how it works: With a single bit, we can count from 0 to 1. To count further, we have to add another bit, which again can be either on or off. That makes four different combinations: 00, 01, 10, 11. In human numbers, that would be 0, 1, 2, 3. We have a slight advantage here because we don't run out of symbols that fast. Computers have to start a new bit all the time. For example, it takes a third bit to write eight different numbers: 0, 1, 10, 11, 100, 101, 110, 111. Whereas we would just count 0, 1, 2, 3, 4, 5, 6, 7. What looks like a 3-bit number to a computer would still fit within a single digit in our decimal system. However, the basic way of constructing higher counts is not that different at all. When we've used up all the symbols (0 through 9), we add a leading digit for the tens and start over with cycling through the sequence of number symbols. It's just more tedious in binary numbers, because there are only 0 and 1 as symbols. But with every new bit we add, we can count twice as far. Thus, the formula 2^(number of bits) − 1 is a shortcut to find the highest number that amount of bits can represent.
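The counting rule above can be verified with a short Python sketch (a toy illustration, not from the book): n bits give 2^n combinations, the highest of which is 2^n − 1.

```python
# With n bits we can represent 2**n distinct values;
# the highest number among them is 2**n - 1.
def bit_range(n_bits):
    count = 2 ** n_bits
    return count, count - 1

for n in (1, 2, 3, 8):
    count, highest = bit_range(n)
    print(f"{n} bits -> {count} values, highest number {highest}")
```

Running this confirms the examples in the text: 2 bits give four combinations (highest number 3), 3 bits give eight (highest number 7).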

To sum it up: Bits represent the digits of computer numbers. The more we have, the higher the number we can represent and thus the more unique values we have available.

Great, that’s the math. But how about images? Traditionally, 24 bits are used to describe the color of a pixel. This is the general practice, laid out in the sRGB standard. All digital devices, from cameras to monitors and printers, are aware of it and support it. You might be tempted to call this a 24-bit image format—but wait! In reality, it’s just an 8-bit format. Those 24 bits are broken down into three channels, where 8 bits each are used to describe the intensity of the individual color channels red, green, and blue. Eight bits allow 2^8 = 256 different values. So the darkest color possible is black with the RGB values (0,0,0) and the brightest is (255,255,255). To change the brightness of a pixel, you have to change all three channels simultaneously—otherwise, you also change the color. Effectively, each pixel can have only 256 different brightness levels. This is not even close to what our eye can see.
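The three-channels-in-24-bits arrangement can be made concrete with a little Python sketch (my own illustration, not from the book), packing the red, green, and blue bytes into one number the way many graphics APIs store pixels:

```python
# A "24-bit" sRGB pixel is really three separate 8-bit channels.
def pack_rgb(r, g, b):
    # Each channel must fit in 8 bits: 0..255.
    assert all(0 <= c <= 255 for c in (r, g, b))
    return (r << 16) | (g << 8) | b

def unpack_rgb(pixel):
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF

white = pack_rgb(255, 255, 255)
print(hex(white), unpack_rgb(white))   # brightest possible pixel
```

Note how each channel tops out at 255 no matter what—the hard ceiling the text keeps coming back to.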

Traditionally, 24 bits are used to describe the color of a pixel.

Ironically, this format is labeled “truecolor”, a name that stems from an age when computers had less processing power than a modern cell phone and 256 values per channel were considered high end when compared to a fixed palette of 32 colors total.

Don’t get confused with the different counting methods for bits. sRGB is 8 bit, when you look at the individual channels. The other counting method would be adding all channels up, and then you would get 3 × 8 = 24 bit. Some call that cheating, but it makes sense when you want to indicate that an image has more than the three color channels. A common bonus is an alpha channel that describes a transparency mask. The short name for that kind of image would be RGBA, and if you sum up all four channels, you get 32 bits. But it’s still 8 bits per channel. Still the same old not-so-true color format. To avoid all this confusing bit counting, I will stick to the per-channel notation and call this format 8-bit for the rest of the book, or low dynamic range (LDR).

Also notice that there is a limitation built right in: upper and lower limits. Nothing can be brighter than 255 white, and nothing can be darker than 0 black. There is just no way; the scale ends here.

Born to be seen through gamma goggles: Two hundred fifty-six levels are not much to begin with. But it gets even weirder: These levels are not evenly distributed. How come? Let’s see what happens to an image from digital camera capture to display on screen. First of all, digital sensors do not share the logarithmic response of film and our eyes. They just count photons, in a straight linear fashion.

This is what a scene looks like in linear space, just as an image sensor would see it.

This is what a scene looks like in linear space, just as an image sensor would see it: The image looks very dark and high contrast. In this case, it might have a certain appeal, but we can all agree that it looks far from natural. There are large black areas where we expect to see shadow details and several sudden jumps in brightness. It appears like an evening shot seen through sunglasses. But in my memory, this was a broad daylight scene. What happened?

Here are two diagrams to illustrate what our sensor has just captured. They show the identical curve, but with the scene luminance mapped in two different ways: Linear scale is what the physical luminance really is, with a steady increase of light. However, we like to think in terms of exposure, where each increasing EV is doubling the amount of light. This corresponds better to how we perceive light, and so the second diagram is in logarithmic scale. For comparison, the film curve is overlaid in grey here.

What this tells us is that half of the available values are taken by the brightest EV. The remaining half is divided between the second brightest EV and all the lower ones. If you do that repeatedly for a camera that captures 6 EVs in total, and you try to represent them all with a total of 256 values, you end up with only two levels left for the EV containing the darkest shadow details. Most of the image ends up darker than we would see it, crunched together in fewer and fewer values. It’s like cutting half off a birthday cake again and again, until only a tiny crumb is left for the last party guest.
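The repeated halving described above can be sketched in Python (a toy illustration under the text's assumptions: 256 linear code values, one halving per EV):

```python
# In a linear encoding, each EV down halves the number of code
# values available: the brightest EV spans codes 128..255, the
# next 64..127, and so on -- the birthday-cake effect.
def codes_per_ev(total_codes=256, n_evs=6):
    levels = []
    hi = total_codes
    for _ in range(n_evs):
        lo = hi // 2
        levels.append(hi - lo)   # codes left for this stop
        hi = lo
    return levels

print(codes_per_ev())  # [128, 64, 32, 16, 8, 4]
```

Six stops down, a mere handful of code values is left to carry every shadow detail—while a single stop at the top hogs half of the entire scale.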

That wouldn’t work well, would it?

To the rescue comes a little trick called gamma encoding. And this is where the trouble starts. Mathematically, the gamma is applied as a power law function, but that shouldn’t bother us right now. What’s important to remember is that gamma encoding has the effect of applying a very steep tone curve.

It looks approximately like this: The gamma curve pulls the tonal values up, strongest in the middle tones and weakest in the highlights. The result is a much more natural-looking distribution of tonal values. Most noticeably, the darker EVs are back—hurray. And the middle EVs stretch almost across the entire tonal range available.

Note that this is not a creative process; it is done in the background without you noticing it. Gamma encoding is a built-in property of all LDR imagery, and it is necessary to make better use of our 256 digital values. It is a hack that distorts their allocation to light intensities, born out of the limitations of the digital Bronze Age.
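As a rough sketch of that hack, here is the power-law idea in Python. The simple gamma-2.2 power function used here is an approximation of my own choosing; the actual sRGB curve adds a short linear segment near black, which I'm ignoring for clarity:

```python
# Gamma encoding as a plain power law (approximation of sRGB,
# which really has a linear toe near black).
def gamma_encode(linear, gamma=2.2):
    return linear ** (1.0 / gamma)

def gamma_decode(encoded, gamma=2.2):
    return encoded ** gamma

# Middle grey (0.18 linear) gets pulled up to roughly 0.46 --
# the "steep tone curve" effect, strongest in the midtones.
print(round(gamma_encode(0.18), 2))
```

Decoding is simply the inverse power, which is what the display side implicitly performs.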

Gamma encoding.

With gamma encoding, our 8-bit image has been cheated to work nicely on a cathode ray tube (CRT) monitor. I don’t want to bore you with more technical graphs. Let’s just say that they are made for each other. CRT monitors can show all the nuances that 8-bit can deliver, and it’s getting as close to human vision as this technology can get. Traditionally, other output devices are calibrated to this gamma value too. Printers, LCD monitors, television sets—they all expect their input feed to come with a gamma distorted level distribution, and it is their own responsibility to turn that into an output that is pleasing to the eye. That is why we call 8-bit imagery an output-referring standard.

Here comes the catch: As soon as you want to edit a digital image, the pitfalls become quite apparent.

Especially critical are the shadows and middle tones. Brightening an image up will reveal that the gamma encoding has just pretended to preserve detail in those areas, just enough so we don’t see the difference on our monitor. The truth is, the lower EVs did not have many different digital values to begin with. Spreading them out across a larger set of values introduces nasty posterizing effects, most noticeable in the large blue-brown color fills that are supposed to be the shaded side of the mountain. Also, the histogram now looks like it just went through a bar rumble, exposing large gaps all over.
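The posterizing effect is easy to reproduce numerically (a toy sketch, not from the book): stretch a handful of dark 8-bit codes across a wider range and count how many distinct values you actually have.

```python
# Brightening 8-bit values spreads a small set of codes across a
# wider range -- the count of distinct codes cannot grow, so the
# histogram gets gaps (posterization).
def brighten(values, factor):
    return [min(255, round(v * factor)) for v in values]

shadows = list(range(32))        # 32 distinct dark codes
lifted = brighten(shadows, 4.0)  # stretched across 0..124
span = max(lifted) - min(lifted) + 1
print(len(set(lifted)), "distinct codes now cover", span, "levels")
```

Thirty-two codes covering 125 levels means 93 empty histogram slots—exactly the gaps visible after the tone-curve adjustment in the figure.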

Brightening an 8-bit image with a tone curve introduces artifacts.

Now this adjustment was done with a very primitive tone curve. Yes, there are more advanced tools, using more sophisticated algorithms to reconstruct the intermediate tonal values. Such specialty tools might be more or less successful in filling the gaps, but in the end all they can do is guesswork. It’s very unlikely that they will discover new details that have not been there before. No new pebble or bush will emerge from the shadows.

The next thing to consider is that what were the upper stops before are now stacked up on the right side of the histogram. These tonal values have snapped together. Once again, there is no way to separate them now. We even pushed a lot of them to the maximum value of 255. Remember, this is a hard limit. Almost the entire sky is just white now, and the clouds are gone. Now and forever. We dumped the data.

Any subsequent change has to deal with an even smaller set of distinguishable values. For demonstration purposes, let’s try to craft an inverse tone curve to bring the image back to what it was before.

OK, looks like our image can only be declared dead. Bye-bye fluffy clouds. What’s left in the sky is just some cyan fade, where the blue channel got clipped off before red and green. The tree looks like an impressionist painting, entirely made of color splotches. And our histogram just screams in pain. Apparently, we could not get the image back to the original that easily.
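The same failure can be shown with bare numbers (my own toy sketch): apply a curve in 8-bit, then its exact mathematical inverse, and count what survives.

```python
# A tone curve followed by its exact inverse, both in 8-bit:
# values that clipped at 255 or merged by rounding stay lost.
def apply_curve(values, factor):
    return [min(255, round(v * factor)) for v in values]

original = list(range(256))                # all 256 codes
brightened = apply_curve(original, 2.0)    # top half clips to 255
restored = apply_curve(brightened, 0.5)    # "undo" the curve
survivors = set(restored)
print(len(survivors), "of 256 unique values survive the round trip")
```

Roughly half the tonal values are gone for good—the clipped highlights all collapse onto a single code, just as the clouds collapsed into flat white.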

Inverse tone curve applied, in an attempt to bring the image back to where it was.

Original image for comparison.

More bits: Yes, I know what you are thinking. It was very unprofessional to make such harsh adjustments in 8-bit mode. If I am that crazy about histograms, I should have used the higher precision that a 16-bit mode offers. Indeed, here we have more digital values available for each EV, and it helps tremendously when doing such incremental editing steps. Sixteen-bit is actually 15-bit because the first bit is used for something else, but that still leaves us 2^15 = 32,768 different levels to work with. Wouldn’t that be it?

And I say no. It doesn’t give me back the fluffy clouds. It doesn’t even save the sky from getting burned into cyan. Because 16-bit is just more of the same. It has a maximum and a minimum limit, where color values get rudely chopped off. And even if you don’t make such hardcore adjustments as I just did, you never know where a sharpening filter or some other fine detail processing step will throw individual channels out of range.

Sixteen-bit is an output-referring format as well. It carries the gamma-encoded nonlinear level distribution, so most of the extra precision gets added in the highlights, where we need it the least. And it uses a fixed number of discrete values. There might be more of them, but that still means our levels get chopped up into a fixed number of slices. The more you tweak, the more rounding errors you’ll see and the more unique values you’ll lose.

Nondestructive editing: Here is a working solution: Instead of applying multiple editing steps one after another, we could just stack them up and apply them all at once to the original image. That would take out the rounding errors and preserve the best quality possible.

Photoshop’s adjustment layers follow this idea, and other image editors use similar concepts. The drawback of this approach is that the flexibility ends at the point where you actually commit all your changes. But you have to commit your changes when you take an image through a chain of multiple programs. An example would be when you develop a RAW file in your favorite RAW converter, tweak the levels in LightZone because you just love the zone system, and then do additional tweaks through dodging and burning in Photoshop. You can dodge and burn all day, but you will not get details back that have been cut out by the RAW converter. The farther down the chain, the more of the original data you’ll lose.

Splitting the bit.

Splitting the bit: What we really need to do is reconsider the basic premise that digital imaging is built upon. We need to split the bit. We need to use floating point numbers.

See, the hassle started with our limited set of values, ranging from 0 to 255. Everything would be easier if we allowed fractional numbers. Then we would have access to an infinite amount of in-between values just by placing them after the decimal point. While there is nothing between 25 and 26 in LDR, floating point allows 25.5 as a totally legitimate color value and also 25.2 and 25.3, as well as 25.23652412 if we need it. HDRI is that simple yet that revolutionary!

We need no gamma because we can incrementally get finer levels whenever we need them. Our image data can stay in linear space, aligned to the light intensities of the real world. And most importantly, we get rid of the upper and lower boundaries. If we want to declare a pixel to be 10,000 luminous, we can. No pixel deserves to get thrown out of the value space. They may wander out of our histogram view, but they don’t die. We can bring them back at any time because the scale is open on both ends.
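Here is a minimal sketch of that open-ended scale (my own illustration): floating-point pixels survive an exposure change that would annihilate 8-bit values, because nothing is ever clipped.

```python
# Floating-point pixels have no hard ceiling: an over-bright value
# wanders out of the visible range but is never thrown away.
def expose(pixels, evs):
    # Each EV is a doubling/halving of linear light.
    return [p * (2.0 ** evs) for p in pixels]

hdr = [0.05, 0.5, 10_000.0]   # deep shadow, midtone, blazing light
darker = expose(hdr, -4)      # pull exposure down four stops
restored = expose(darker, +4) # and bring it right back
print(restored)               # identical to the original values
```

Multiplying by a power of two only shifts the float's exponent, so the round trip here is exact—contrast that with the 8-bit round trip earlier, which destroyed half the values.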

We’re done with 8-bit anyway. Atari, Amiga, and DOS computers were based on 8-bit. Windows and Mac OS are now making the leap from 32- to 64-bit and so are Intel and AMD processors. Only our images have been stuck in the ‘80s.

So, HDR workflow is better than RAW workflow? Well, they are closely related.

RAW has extended a photographer's capabilities to digitally access the untreated sensor data. Most RAW files are linear by nature and look very much like the first example picture with the tree. By processing such a file yourself, you get manual control over how these linear intensities are mapped to the gamma-distorted color space. There are other hardware-dependent oddities to RAW files, like low-level noise and each pixel representing only one of the primary colors so that the full image has to be interpolated. Having control over all these factors is an essential ingredient for squeezing out the maximum image quality.

However, with RAW you are walking a one-way road. The processing is usually done first, and any subsequent editing relies on the data that is left. Not so with HDR; it preserves everything. You can edit HDR images, take them from one program to another, edit some more. As long as you stick to it, no data will be lost.

RAW is also bound to specific hardware: the capturing sensor. HDRI is not. HDRI is a much more standardized base that is truly hardware independent. You can generate HDR images that exceed what a sensor can capture by far, and you can use them in a much wider field of applications. In the case of CG renderings, there is no physical sensor that would spit out RAW images. Instead, HDR images take the position of RAW files in the CG world.

Think of it like this: HDR imaging is the next generation of a RAW workflow. Right now, they go hand in hand and extend each other. But sooner or later all digital imaging will happen in HDR.

But what is the immediate advantage? When a RAW image represents what a sensor captures, an HDR image contains the scene itself. It has enough room to preserve all light intensities as they are. You can re-expose this “canned scene” digitally as often as you want. By doing so, you take a snapshot of the HDR and develop an LDR print.

This technique is called tone mapping, and it can be as simple as selecting a focused exposure or as complicated as scrambling up all the tonal levels and emphasizing the details in light and shadow areas. You can even simulate the local adaptation mechanism that our eye is capable of. Or you can go to the extreme and find new, painterly styles. Tone mapping puts you back in control. You choose exactly how the tonal values are supposed to appear in the final output-referring file. None of this slap-on gamma anymore. You decide!

Let me give you a rather extreme example: A shot of an interior scene that includes a window to the outside, which is a classic case for HDR photography. In conventional imaging, you wouldn’t even try this. There is just too much dynamic range within the view; it stretches across 17 EVs. Neither analog nor digital can capture this in one shot.

This scene is a typical kitchen for visual effects artists. It is very important that they see the Hollywood sign while they’re pouring fresh coffee.

Kitchen at EdenFX, at three hopeless exposures.

None of the single-shot exposures can capture this scene entirely. But once they are merged into an HDR image, exposure becomes an adjustable value. You can slide it up and down to produce any of the original exposures or set it somewhere in between to generate new exposures.
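That "exposure slider" boils down to a multiply, a clip, and a gamma encode. A minimal sketch in Python (my own simplification—real tone mappers are far more sophisticated):

```python
# Develop an LDR "print" from linear HDR pixels: pick an exposure,
# clip to the displayable range, then gamma-encode to 8-bit.
def develop(hdr_pixels, ev, gamma=2.2):
    out = []
    for p in hdr_pixels:
        p = p * (2.0 ** ev)        # exposure: each EV doubles light
        p = min(1.0, max(0.0, p))  # clip to display range
        out.append(round(255 * p ** (1.0 / gamma)))
    return out

scene = [0.01, 0.2, 1.5, 40.0]   # linear luminances, window at 40
print(develop(scene, 0))         # window blows out to 255
print(develop(scene, -5))        # window readable, shadows sink
```

The HDR data itself never changes; only the `ev` parameter decides which slice of the scene lands in the 0–255 print.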

Or you can selectively re-expose parts of the image. In this case it was quite simple to draw a rectangular mask in Photoshop and pull the exposure down just for the window.

A manually tone-mapped HDR image can show the entire scene.

Impressionist interpretation of the HDR image, emphasizing texture details.

Other methods are more sophisticated and allow you to squash and stretch the available levels to wrangle out the last bit of detail. The second image was treated twice with the Detail Enhancer in Photomatix and selectively blended in Photoshop.

You see, there is an immense creative potential to be explored here. Tone mapping alone opens a new chapter in digital photography. There is really no right or wrong—tone mapping is a creative process, and the result is an artistic expression. Exactly that is the beauty of it. Whether you like it or not is a simple matter of personal taste.

So much for the teaser. The thing to remember is that every output device has a specific tonal range it can handle, which can be fully utilized when the HDRI is broken down. Tone mapping takes an image from a scene-referred to an output-referred state. And how far you have to crunch it down depends on the range limitation of the final output.


Excerpted from The HDRI Handbook: High Dynamic Range Imaging for Photographers and CG Artists by Christian Bloch. Copyright © 2007. Used with permission of Rocky Nook.