|
Computers are very different from our good
buddy film. Computers are more like the IRS.
All that matters are numbers, they operate
based on an outdated book of rules, and if you
make a mistake, they respond with cryptic
messages. Everything you tell them has to
be broken down into bite-sized pieces. That's
what happens to images: pixel by pixel they
get digitized into bits.
Let's talk about bits: A bit is the basic information
unit that all digital data is made
of. It resembles a very simple switch that can
be either on or off, black or white. There is no
grey area in a bit’s world. If we want some finer
differentiation, we need to use more bits.
And this is how it works: With a single bit,
we can count from 0 to 1. To count further, we
have to add another bit, which can be either
on or off again. That makes four different combinations:
00, 01, 10, 11. In human numbers
that would be 0,1,2,3. We have a slight advantage
here because we are not running out of
symbols that fast. Computers have to start
a new bit all the time. For example, it takes
a third bit to write eight different numbers:
0, 1, 10, 11, 100, 101, 110, 111. Whereas we
would just count 0, 1, 2, 3, 4, 5, 6, 7. What looks
like a 3-bit number to a computer, would still
fit within a single digit in our decimal system.
However, the basic way of constructing higher
counts is not that different at all. When we’ve
used up all the symbols (0 through 9), we add
a leading digit for the decades and start over
with cycling through the sequence of number
symbols. It’s just more tedious in binary numbers,
because there are only 0 and 1 as symbols.
But with every new bit we add, we can
count twice as far. Thus, the formula 2^(number of
bits) is a shortcut to find out how many different values
that many bits can represent; the highest number is one
less, because counting starts at 0.
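If you prefer to see the doubling in action, here is a minimal sketch in Python. The function names are made up for illustration; the only thing it relies on is the rule that every added bit doubles the count.

```python
def distinct_values(bits: int) -> int:
    """Number of different values that `bits` binary digits can represent."""
    return 2 ** bits


def highest_value(bits: int) -> int:
    """Largest number that fits, one less than the count because we start at 0."""
    return 2 ** bits - 1


for bits in (1, 2, 3, 8):
    # Write out every bit pattern, padded with leading zeros.
    patterns = [format(n, f"0{bits}b") for n in range(distinct_values(bits))]
    print(bits, "bits:", distinct_values(bits), "values, highest is",
          highest_value(bits), "first patterns:", patterns[:4])
```

Three bits give eight patterns, from 000 to 111, and the highest number among them is 7.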
To sum it up: Bits represent the digits of
computer numbers. The more we have, the
higher the number we can represent and thus
the more unique values we have available.
Great, that’s the math. But how about images?
Traditionally, 24 bits are used to describe
the color of a pixel. This is the general
practice, laid out in the sRGB standard. All
digital devices, from cameras to monitors and
printers, are aware of it and support it. You might be tempted to call this a 24-bit image
format—but wait! In reality, it’s just an 8-bit
format. Those 24 bits are broken down into
three channels, where 8 bits each are used to
describe the intensity of the individual color
channels red, green, and blue. Eight bits allow
2^8 = 256 different values. So the darkest color
possible is black with the RGB values (0,0,0)
and the brightest is (255,255,255). To change
the brightness of a pixel, you have to change
all three channels simultaneously—otherwise,
you also change the color. Effectively, each
pixel can have only 256 different brightness
levels. This is not even close to what our eye
can see.
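To make the channel arithmetic tangible, here is a small sketch that packs three 8-bit channels into one 24-bit number and back. The byte order (red in the top byte) and the function names are assumptions chosen for this illustration; real file formats and graphics APIs differ in how they arrange the channels.

```python
def pack_rgb(r: int, g: int, b: int) -> int:
    """Pack three 8-bit channels into one 24-bit integer (assumed order: R, G, B)."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each channel must fit in 8 bits (0..255)")
    return (r << 16) | (g << 8) | b


def unpack_rgb(pixel: int) -> tuple[int, int, int]:
    """Split a 24-bit integer back into its three 8-bit channels."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


black = pack_rgb(0, 0, 0)
white = pack_rgb(255, 255, 255)

# All color combinations that 24 bits can describe ...
total_colors = 2 ** 24
# ... but only 256 of them are neutral brightness steps from black to white.
grey_levels = [(v, v, v) for v in range(256)]

print(hex(black), hex(white), total_colors, len(grey_levels))
```

The 24 bits buy a lot of different hues, but along the brightness axis there are still just 256 steps.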
Ironically, this format is labeled “truecolor”,
a name that stems from an age when computers
had less processing power than a modern
cell phone and 256 values per channel were
considered high end when compared to a fixed
palette of 32 colors total.
Don’t get confused with the different counting
methods for bits. sRGB is 8-bit when you
look at the individual channels. The other
counting method would be adding all channels
up, and then you would get 3 * 8 = 24 bit.
Some call that cheating, but it makes sense
when you want to indicate that an image has
more than the three color channels. A common
bonus is an alpha channel that describes
a transparency mask. The short name for that
kind of image would be RGBA, and if you sum
up all four channels, you get 32 bits. But it’s
still 8 bits per channel. Still the same old not-so-true color format. To avoid all this confusing
bit counting, I will stick to the per-channel
notation and call this format 8-bit for the rest
of the book, or low dynamic range (LDR).
Also notice that there is a limitation built
right in: upper and lower limits. Nothing can
be brighter than 255 white, and nothing can
be darker than 0 black. There is just no way;
the scale ends here.
Born to be seen through gamma goggles:
Two hundred fifty-six levels are not much to
begin with. But it gets even weirder: These levels
are not evenly distributed. How come?
Let’s see what happens to an image from
digital camera capture to display on screen.
First of all, digital sensors do not share the
logarithmic response of film and our eyes.
They just count photons, in a straight linear
fashion.
 This is what a scene looks like in linear space,
just as an image sensor would see it.
This is what a scene looks like in linear space,
just as an image sensor would see it: The image
looks very dark and high contrast. In this
case, it might have a certain appeal, but we
can all agree that it looks far from natural.
There are large black areas where we expect to
see shadow details and several sudden jumps
in brightness. It appears like an evening shot
seen through sunglasses. But in my memory,
this was a broad daylight scene. What happened?
Here are two diagrams to illustrate what
our sensor has just captured. They show the
identical curve, but with the scene luminance
mapped in two different ways: Linear scale
is what the physical luminance really is, with
a steady increase of light. However, we like
to think in terms of exposure, where each
increasing EV is doubling the amount of light. This corresponds better to how we perceive
light, and so the second diagram is in logarithmic
scale. For comparison, the film curve is
overlaid in grey here.
What this tells us is that half of the available
values are taken by the brightest EV. The
remaining half is divided between the second
brightest EV and all the lower ones. If you do
that repeatedly for a camera that captures 6
EVs in total, and you try to represent them
all with a total of 256 values, you end up with
only two levels left for the EV containing the
darkest shadow details. Most of the image
ends up darker than we would see it, crunched
together in fewer and fewer values. It’s like
cutting half off a birthday cake again and
again, until only a tiny crumb is left for the
last party guest.
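The cake cutting can be sketched in a few lines of Python. This is a simplified model, assuming a perfectly linear sensor whose brightest EV ends exactly at the top code value; the function name is made up for the example, and the exact count for the last stop depends on where you start counting.

```python
def linear_levels_per_ev(total_levels: int = 256, stops: int = 6) -> list[int]:
    """Code values available to each EV in a linear encoding, brightest EV first."""
    levels = []
    upper = total_levels
    for _ in range(stops):
        lower = upper // 2             # one EV down means half the light
        levels.append(upper - lower)   # values that fall between the two
        upper = lower
    return levels


for ev, count in enumerate(linear_levels_per_ev(), start=1):
    print(f"EV {ev} from the top: {count} levels")
```

The brightest EV swallows 128 of the 256 values, and each stop below gets half of what the one above it had. By the time you reach the darkest of the six stops, only a handful of levels are left.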
That wouldn’t work well, would it?
To the rescue comes a little trick called gamma
encoding. And this is where the trouble
starts. Mathematically, the gamma is applied
as a power law function, but that shouldn’t
bother us right now. What’s important to remember
is that gamma encoding has the effect
of applying a very steep tone curve.
It looks approximately like this: The gamma curve
pulls the tonal values up, strongest in the
middle tones and weakest in the highlights.
The result is a much more natural-looking distribution
of tonal values. Most noticeably, the
darker EVs are back—hurray. And the middle
EVs stretch almost across the entire tonal
range available.
Note that this is not a creative process; it is
done in the background without you noticing
it. Gamma encoding is a built-in property of
all LDR imagery, and it is necessary to make
better use of our 256 digital values. It is a hack
that distorts their allocation to light intensities,
born out of the limitations of the digital
Bronze Age.
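For the curious, here is roughly what the power law does to the level count per EV. This is a sketch under assumptions: a pure power function with a gamma of 2.2, which is a common approximation. The actual sRGB curve is defined piecewise and differs slightly, and the function names are invented for this example.

```python
GAMMA = 2.2  # assumed value; a pure power law only approximates sRGB


def encode(linear: float, gamma: float = GAMMA) -> int:
    """Map a linear intensity in 0..1 to a gamma-encoded 8-bit code value."""
    clipped = min(max(linear, 0.0), 1.0)
    return round(255 * clipped ** (1.0 / gamma))


def levels_per_ev(stops: int = 6, gamma: float = GAMMA) -> list[int]:
    """8-bit code values spanned by each EV, brightest EV first."""
    counts = []
    for ev in range(stops):
        top = encode(2.0 ** -ev, gamma)           # upper edge of this EV
        bottom = encode(2.0 ** -(ev + 1), gamma)  # half the light: one EV down
        counts.append(top - bottom)
    return counts


print("with gamma:   ", levels_per_ev())
print("without gamma:", levels_per_ev(gamma=1.0))
```

With gamma set to 1.0 the function reproduces the linear halving pattern. With 2.2, the brightest EV gives up a good part of its share and the darker EVs end up with noticeably more code values each.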
 Gamma encoding.
With gamma encoding, our 8-bit image has
been cheated to work nicely on a cathode ray
tube (CRT) monitor. I don’t want to bore you
with more technical graphs. Let’s just say that they are made for each other. CRT monitors
can show all the nuances that 8-bit can deliver,
and it’s getting as close to human vision as
this technology can get. Traditionally, other
output devices are calibrated to this gamma
value too. Printers, LCD monitors, television
sets—they all expect their input feed to come
with a gamma distorted level distribution, and
it is their own responsibility to turn that into
an output that is pleasing to the eye. That is
why we call 8-bit imagery an output-referred
standard.
Here comes the catch: As soon as you
want to edit a digital image, the pitfalls become
quite apparent.
Especially critical are the shadows and middle
tones. Brightening an image up will reveal
that the gamma encoding has just pretended
to preserve detail in those areas, just enough
so we don’t see the difference on our monitor.
The truth is, the lower EVs did not have
many different digital values to begin with.
Spreading them out across a larger set of values
introduces nasty posterizing effects, most
noticeable in the large blue-brown color fills
that are supposed to be the shaded side of the
mountain. Also, the histogram now looks like
it just went through a bar rumble, exposing
large gaps all over.
 Brightening an 8-bit image with a tone curve
introduces artifacts.
Now this adjustment was done with a very
primitive tone curve. Yes, there are more advanced
tools, using more sophisticated algorithms
to reconstruct the intermediate tonal
values. Such specialty tools might be more or
less successful in filling the gaps, but in the
end all they can do is guesswork. It’s very unlikely
that they will discover new details that
have not been there before. No new pebble or
bush will emerge from the shadows.
The next thing to consider is that what were
the upper stops before are now stacked up on
the right side of the histogram. These tonal
values have snapped together. Once again,
there is no way to separate them now. We even
pushed a lot of them to the maximum value
of 255. Remember, this is a hard limit. Almost the entire sky is just white now, and the clouds
are gone. Now and forever. We dumped the
data.
Any subsequent change has to deal with an
even smaller set of distinguishable values. For
demonstration purposes, let’s try to craft an
inverse tone curve to bring the image back to
what it was before.
OK, looks like our image can only be declared
dead. Bye-bye fluffy clouds. What’s left
in the sky is just some cyan fade, where the
blue channel got clipped off before red and
green. The tree looks like an impressionist
painting, entirely made of color splotches.
And our histogram just screams in pain. Apparently,
we could not get the image back to
the original that easily.
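You can reproduce the damage without any image at all. The sketch below is a simplification: a plain gain of 2 stands in for the tone curve used in the example, and a smooth ramp of all 256 levels stands in for the photo. Function and variable names are made up for the illustration.

```python
def adjust_8bit(values, gain):
    """Scale 8-bit values by `gain`, round to integers, and clip to 0..255."""
    return [min(255, max(0, round(v * gain))) for v in values]


original = list(range(256))              # a smooth ramp, every level used once
brightened = adjust_8bit(original, 2.0)  # harsh brightening
restored = adjust_8bit(brightened, 0.5)  # attempt to undo it

unique_after_brightening = len(set(brightened))
clipped_to_white = sum(1 for v in brightened if v == 255)

print("unique levels after brightening:", unique_after_brightening)
print("pixels pushed to 255:", clipped_to_white)
print("brightest value after restoring:", max(restored))
```

After brightening, the former shadows occupy only every second level, which is exactly what the gaps in the histogram look like. The upper half of the ramp is stacked up at 255, and the inverse adjustment maps that whole pile to one single grey value. The gradient that used to be there is gone.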
 Inverse tone curve applied, in an attempt to
bring the image back to where it was.
 Original image
for comparison.
More bits: Yes, I know what you are thinking.
It was very unprofessional to make such harsh
adjustments in 8-bit mode. If I am that crazy
about histograms, I should have used the
higher precision that a 16-bit mode offers.
Indeed, here we have more digital values
available for each EV, and it helps tremendously
when doing such incremental editing steps.
Sixteen-bit is actually 15-bit because the first
bit is used for something else, but that still
leaves us 2^15 = 32,768 different levels to work
with. Wouldn’t that be it?
And I say no. It doesn’t give me back the
fluffy clouds. It doesn’t even save the sky from
getting burned into cyan. Because 16-bit is
just more of the same. It has a maximum and a
minimum limit, where color values get rudely
chopped off. And even if you don’t make such
hardcore adjustments as I just did, you never
know where a sharpening filter or some other
fine detail processing step will throw individual
channels out of range.
Sixteen-bit is an output-referred format as
well. It carries the gamma-encoded nonlinear
level distribution, so most of the extra precision
gets added in the highlights, where we
need it the least. And it uses a fixed number of
discrete values. There might be more of them,
but that still means our levels get chopped up
into a fixed number of slices. The more you
tweak, the more rounding errors you’ll see and
the more unique values you’ll lose.
Nondestructive editing: Here is a working
solution: Instead of applying multiple editing
steps one after another, we could just
stack them up and apply them all at once to
the original image. That would take out the
rounding errors and preserve the best quality
possible.
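The idea can be sketched like this. The two edits in the stack are hypothetical placeholders, a darkening step and its exact inverse, and this is not a description of how any particular image editor implements its adjustment layers. It only shows why rounding once at the end beats rounding after every step.

```python
def quantize(x: float) -> int:
    """Round to the nearest 8-bit level and clip to 0..255."""
    return min(255, max(0, round(x)))


# Hypothetical edit stack: darken the image, then brighten it back.
edits = [lambda x: x * 0.3, lambda x: x / 0.3]


def destructive(value: int, stack) -> int:
    """Commit every edit right away: the result is rounded after each step."""
    for edit in stack:
        value = quantize(edit(value))
    return value


def nondestructive(value: int, stack) -> int:
    """Keep the original, run the whole stack at full precision, round once."""
    x = float(value)
    for edit in stack:
        x = edit(x)
    return quantize(x)


ramp = range(256)
print("levels left, destructive:   ", len({destructive(v, edits) for v in ramp}))
print("levels left, nondestructive:", len({nondestructive(v, edits) for v in ramp}))
```

Committing after each step collapses the ramp to a fraction of its levels, while the stacked version returns all 256.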
Photoshop’s adjustment layers follow this
idea, and other image editors use similar concepts.
The drawback of this approach is that
the flexibility ends at the point where you actually
commit all your changes. But you have
to commit your changes when you take an image
through a chain of multiple programs. An
example would be when you develop a RAW
file in your favorite RAW converter, tweak the
levels in LightZone because you just love the
zone system, and then do additional tweaks
through dodging and burning in Photoshop.
You can dodge and burn all day, but you will
not get details back that have been cut out
by the RAW converter. The farther down the
chain, the more of the original data you’ll lose.
 Splitting the bit.
Splitting the bit: What we really need to do is
reconsider the basic premise that digital imaging
is built upon. We need to split the bit. We
need to use floating point numbers.
See, the hassle started with our limited set
of values, ranging from 0 to 255. Everything
would be easier if we allowed fractional
numbers. Then we would have access to an
infinite amount of in-between values just by
placing them after the decimal point. While
there is nothing between 25 and 26 in LDR,
floating point allows 25.5 as a totally legitimate color value and also 25.2 and 25.3, as
well as 25.23652412 if we need it. HDRI is
that simple yet that revolutionary!
We need no gamma because we can incrementally
get finer levels whenever we need
them. Our image data can stay in linear space,
aligned to the light intensities of the real
world. And most importantly, we get rid of
the upper and lower boundaries. If we want
to declare a pixel to be 10,000 luminous, we
can. No pixel deserves to get thrown out of
the value space. They may wander out of our
histogram view, but they don’t die. We can
bring them back at any time because the scale
is open on both ends.
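Here is the same kind of brightening round trip once more, this time on floating point values. It is a minimal sketch with made-up scene values; note that the gains are powers of two, which floating point scales exactly. Other gains would leave tiny rounding differences far below anything visible.

```python
def adjust_float(values, gain):
    """Scale linear floating point values: no integer rounding, no clipping."""
    return [v * gain for v in values]


# Hypothetical linear scene values, from deep shadow to a very bright light.
scene = [0.001, 0.18, 1.0, 25.5, 10_000.0]

brightened = adjust_float(scene, 4.0)       # two EVs up
restored = adjust_float(brightened, 0.25)   # two EVs back down

print(brightened)
print(restored == scene)
```

The brightest pixel wanders up to 40,000, far outside anything an 8-bit histogram could show, and comes back untouched.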
We’re done with 8-bit anyway. Atari, Amiga,
and DOS computers were based on 8-bit. Windows
and Mac OS are now making the leap
from 32- to 64-bit and so are Intel and AMD
processors. Only our images have been stuck
in the ’80s.
So, HDR workflow is better than RAW
workflow? Well, they are closely related.
RAW has extended a photographer's capabilities
to digitally access the untreated sensor
data. Most RAW files are linear by nature
and look very much like the first example
picture with the tree. By processing such a
file yourself, you get manual control over how
these linear intensities are mapped to the
gamma-distorted color space. There are other
hardware-dependent oddities to RAW files,
like low-level noise and each pixel representing
only one of the primary colors so that the
full image has to be interpolated. Having control
over all these factors is an essential ingredient
for squeezing out the maximum image quality.
However, with RAW you are walking a one-way
road. The processing is usually done first,
and any subsequent editing relies on the data
that is left. Not so with HDR; it preserves everything.
You can edit HDR images, take them
from one program to another, edit some more.
As long as you stick to it, no data will be lost.
RAW is also bound to specific hardware:
the capturing sensor. HDRI is not. HDRI is
a much more standardized base that is truly
hardware independent. You can generate HDR
images that exceed what a sensor can capture
by far, and you can use them in a much wider
field of applications. In the case of CG renderings,
there is no physical sensor that would
spit out RAW images. Instead, HDR images
take the position of RAW files in the CG world.
Think of it like this: HDR imaging is the next
generation of a RAW workflow. Right now,
they go hand in hand and extend each other.
But sooner or later all digital imaging will happen
in HDR.
But what is the immediate advantage?
Where a RAW image represents what a sensor
captures, an HDR image contains the scene
itself. It has enough room to preserve all light
intensities as they are. You can re-expose this
“canned scene” digitally as often as you want.
By doing so, you take a snapshot of the HDR
and develop an LDR print.
This technique is called tone mapping, and
it can be as simple as selecting a focused exposure
or as complicated as scrambling up all
the tonal levels and emphasizing the details in
light and shadow areas. You can even simulate
the local adaptation mechanism that our eye
is capable of. Or you can go to the extreme and
find new, painterly styles. Tone mapping puts
you back in control. You choose exactly how
the tonal values are supposed to appear in the
final output-referred file. None of this slap-on
gamma anymore. You decide!
Let me give you a rather extreme example:
A shot of an interior scene that includes a
window to the outside, which is a classic case
for HDR photography. In conventional imaging,
you wouldn’t even try this. There is just
too much dynamic range within the view; it
stretches across 17 EVs. Neither analog nor
digital can capture this in one shot.
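Since each EV is a doubling of light, the dynamic range between two luminance values is simply the base-2 logarithm of their ratio. The helper below is a sketch with an invented name, and the numbers are illustrative ratios rather than measurements from this scene.

```python
import math


def dynamic_range_ev(brightest: float, darkest: float) -> float:
    """Number of EVs (doublings of light) between two luminance values."""
    if brightest <= 0 or darkest <= 0:
        raise ValueError("luminance values must be positive")
    return math.log2(brightest / darkest)


# A contrast ratio of 2^17 : 1 corresponds to the 17 EVs mentioned above.
print(dynamic_range_ev(2 ** 17, 1))
# Linear 8-bit code values 1..255 span just under 8 doublings.
print(dynamic_range_ev(255, 1))
```

Seventeen EVs means the brightest spot is more than a hundred thousand times brighter than the darkest one.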
This scene is a typical kitchen for visual effects
artists. It is very important that they
see the Hollywood sign while they’re pouring
fresh coffee.
 Kitchen at EdenFX,
at three hopeless exposures.
None of the single-shot exposures can
capture this scene entirely. But once they
are merged into an HDR image, exposure
becomes an adjustable value. You can slide it
up and down to produce any of the original
exposures or set it somewhere in between to
generate new exposures.
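In its simplest form, such an exposure slider is nothing more than a multiplication before the image is squeezed into 8 bits. The following is a sketch under assumptions: a pure 2.2 power-law gamma for the output, made-up scene values, and a hypothetical function name. It is not how any specific program implements its exposure control.

```python
def expose(hdr_pixels, ev, gamma=2.2):
    """Develop an 8-bit snapshot of linear HDR values at a given exposure offset."""
    gain = 2.0 ** ev                                # each EV doubles or halves the light
    ldr = []
    for value in hdr_pixels:
        scaled = min(max(value * gain, 0.0), 1.0)   # clipping happens only in the snapshot
        ldr.append(round(255 * scaled ** (1.0 / gamma)))
    return ldr


# Hypothetical linear values: dark interior on the left, bright window on the right.
scene = [0.002, 0.02, 0.2, 2.0, 200.0]

for ev in (-8, -4, 0, 4):
    print(f"{ev:+d} EV:", expose(scene, ev))
```

At 0 EV the window values are burned out to 255. At -8 EV they come back into range while the interior sinks into black. The HDR values themselves are never modified, so you can develop as many snapshots as you like.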
Or you can selectively re-expose parts of the
image. In this case it was quite simple to draw
a rectangular mask in Photoshop and pull the
exposure down just for the window.
 Manually tone-mapped HDR image can show the
entire scene.
 Impressionist interpretation of the HDR image,
emphasizing texture details.
Other methods are more sophisticated and
allow you to squash and stretch the available
levels to wrangle out the last bit of detail. The
second image was treated twice with the Detail
Enhancer in Photomatix and selectively
blended in Photoshop.
You see, there is an immense creative potential
to be explored here. Tone mapping alone
opens a new chapter in digital photography. There is really no right
or wrong—tone mapping is a creative process,
and the result is an artistic expression. Exactly
that is the beauty of it. Whether you like it or
not is a simple matter of personal taste.
So much for the teaser. The thing to remember
is that every output device has a specific
tonal range it can handle, which can be fully
utilized when the HDRI is broken down. Tone
mapping takes an image from a scene-referred
to an output-referred state. And how far you
have to crunch it down depends on the range
limitation of the final output.
|