Basic Color Science for Cinema

If you like math and really want the skinny on it, I give you:  Bruce Lindbloom.  In particular, I’d recommend the section on XYZ to RGB and RGB to XYZ transformations.

If those really huge square brackets seem scary, it’s all too tempting to cast off the math and fall back on statements such as:

  • It looks digital.
  • It isn’t organic.
  • It’s too clean.
  • It feels dead.  Not alive, like film.
  • Those are candy colors.
  • It looks too much like video.

It’s extremely tempting to speak from behind the opaque veil of divine-artistic-intuition.  But it’s not good to hide from the truth.  If one tries to pierce the veil of divine-artistic-intuition, one is often labeled a heretic and one’s opinion is cast away by the bourgeois (gab gab gossip gossip gasp glare).  As such, many attempts at straightening the path and getting proper color-science into the mix tend to fail.  If someone with enough creative credentials steps into the conversation and makes an observation of divine-artistic-intuition, they win (golf clap).  Data, math and science don’t override invalid conclusions in modern Hollywood.

But that’s not really the true heritage of Cinematography.  The Cinematographer has classically been the chief technician of the filmmaking process.  It is painful to see modern digital Cinematographers shy away from the math and science of the image.  Rather, they should be consuming all the available knowledge and pushing the technology further, in pursuit of truly provable and superior image quality.  The digital revolution should allow for deeper inspection.  Instead, even among digital advocates, we seem to be preaching plug-in worship and acronym worship.  The worst are those that shun digital technology purely because it’s digital.

For example, there are camps of digital photographers who are pro-OLPF and anti-OLPF.  But how many of them can explain how the work of Harry Nyquist factors into that discussion?  Or what OLPF stands for?  Or how a CMOS sensor interacts with it, versus a CCD?

Another example:  I’ve had both cinematographers and digital compositors try to sell me on using expensive de-focus plugins.  It’s true that gaussian blurs don’t model the circle of confusion correctly.  But a simple convolution kernel will do the job right.  No need to believe in expensive super-secret-sauce and strangely named ingredients with mystical powers.
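
To be concrete, here’s a minimal sketch of that idea in Python (NumPy and SciPy): a flat, normalized disc kernel, which models the circle of confusion far better than a gaussian’s soft falloff.  The radius value and the image-reading helper are placeholders for illustration, not anything from a particular plugin.

import numpy as np
from scipy.signal import fftconvolve

def disc_kernel(radius_px):
    """A flat, normalized disc: every point inside the circle of confusion
    contributes equally, unlike a gaussian's gradual falloff."""
    r = int(np.ceil(radius_px))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = (x**2 + y**2 <= radius_px**2).astype(np.float64)
    return k / k.sum()

def defocus(linear_img, radius_px):
    """Convolve each channel of a linear-light image with the disc kernel."""
    k = disc_kernel(radius_px)
    return np.dstack([fftconvolve(linear_img[..., c], k, mode="same")
                      for c in range(linear_img.shape[-1])])

# Usage (hypothetical image loader and radius):
# img = read_linear_exr("plate.exr")
# out = defocus(img, 8.5)

Note that, as discussed later in this document, this only behaves like real defocus if the image is in linear light when you convolve it.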

A third example:  People who feel DCP (Digital Cinema Package) projections are inferior to film print projections.  A good DCP of an existing photochemical film is likely made from a scan of an inter-negative (IN) or possibly an inter-positive (IP).  A digital cinema projector has provably and significantly higher resolution than a film print.  A digital cinema projector is maintained to tighter tolerances than a lab’s release printing regime and its resulting variances.  In other words: both the color and the resolution of a DCP of an existing photochemical film are provably superior to a film print of the same film (assuming it’s done right).  Yet, I have on multiple occasions observed cinematographers and directors turn their noses up at screenings because they’re DCPs rather than film prints.  This isn’t ignorance anymore.  It’s a system of fascistic belief and rewards at this point.

It’s true that a camera negative is likely in the range of 6k to 10k in resolution.  But a print is nowhere near that quality.  Testing I’ve done multiple times now, shows print stock to be somewhere between 1k – 1.5k.  A digital cinema projector is at least 2k.  So a scan and digital projection of an IP or IN preserves more clarity than a print of that same media.

The most recent attempt I’ve seen at providing a tome of factual knowledge on the subject of color science in cinema, is the VES White-paper on Cinematic Color.  Full disclosure:  I am a member of the VES and I submitted corrections to the first release of the afore linked paper.

However, if you give it a read, you’ll probably skip 3/4 of it and lose your mind.  I did the first time too.  Further, it doesn’t really present the knowledge you need most, in the best way.  It doesn’t focus on explaining how and where color-science should be applied in a digital-cinema workflow.  What are best practices?  Who needs to know what and when?

Since I know a bit about this kind of thing, I thought I’d write a series of such documents.

The first step is some reasonable explanation of the basics of color science.  Ultimately, you’ll want to go back and really understand documents like the VES White-paper.  And further, you may even want to go through Bruce’s site a bit as well.  Also, the Digital Cinema Initiative is always a good site to have bookmarked when you’re looking for some hard numbers with regard to digital projection.

How you see

Most color science documents start this way.  I have to as well.  It’s extremely important.  But I’ll keep it practical.

We’ll get to the easy part first.  You see in RGB.  Forget all that nonsense people spout about CMYK and Lab color.  It’s not what we care about here.  You have S, M and L cones in your eyes.  They stand for Short, Medium and Long wavelengths respectively.  They generally map to Red, Green and Blue respectively.  And there is cross-talk across the whole spectrum, where certain wavelengths of light stimulate two kinds of cones.

Also, SMPTE standards for the brightness of film projections sit comfortably right in the middle of the brightness range where your vision is transitioning from rods (low light, b&w and peripheral vision) to cones (bright, color and detailed vision).  Therefore, you can’t really see color and detail in the cinema as exactly and ideally as a pure study of optimal human vision might suggest.  There are further psychological and perceptual ramifications to this.

Those who are intrigued with the HFR version of The Hobbit should take note here.  The DCP tries to boost the lumens dramatically.  And if an individual theater can accomplish it, the effect is dramatic.  It’s not just high frame rate that’s at play in all screenings of the film.  Doug Trumbull (whom I’ve worked with before) often talks openly about the positive effects of very bright projections in ride films.

Anyhow, back to RGB color.  We tend to think of R, G and B as being orthogonal, in that they have no effect on one another individually.  Now, when it comes to wavelengths of physical light and how they interact with the cones in your eyes, there is crosstalk.  But the wavelengths of light that make up R, G and B are themselves orthogonal.  So we treat RGB light orthogonally when we manipulate it.

A side note here.  There has been some recent research into light-field systems that filter true spectral source imagery.  It’s very cool.  But it’s not how most photography and video have worked for the past century or so.  Also, it doesn’t address creating spectral displays.  You’d still need to convert from spectral images into RGB images for display.  Though it does open up interesting possibilities for doing some of the filter magic that is part of the Kodak film stocks.  You can target the spectrum of human skin this way, and color skin differently than you’d color things that happen to be skin color in RGB.  Those kinds of manipulations are a secret behind “Kodachrome” and the Kodak “gold-spike” in negative stocks.  The purpose being: To make you look like you’ve got a tan and radiant skin, when maybe you don’t.  Something vacationers and film-stars both love.

Now, the harder bit.  When you look at grey tones, what you see as half as bright as white is not actually half as bright as white.  And it’s not just a little off.  It’s hugely off.

I will prove this to you the most direct way I can.

Middle-grey is the shade of grey we all generally agree to be half as bright as white.

If you take a white wall in a dark room, and shine a flashlight on it, your eyes will adjust their irises and your brain will adjust its perception, to make that spot be “white.”

If you turn on a second identical flashlight and shine it directly on-top of the spot created by the first (keep the flashlights lined up), it will be twice as bright as it was.  Light is additive (1 + 1 = 2).  Your iris will adjust and your perception system will now consider that spot to be “white.”

Now here’s the trick.  If you take a third identical flashlight, and shine it next to (not on-top of) the first spot, it is half as bright as the first spot ( 1/2 ).  Two flashlights are twice as bright as one.

Therefore, one would think:  If your eyes are accurate, you should see the second spot as middle grey.  You would still see the first as white as long as you stayed focused on it, and kept it in your field of vision.  When you then looked to the second spot, it should be half as bright.  It should look middle grey.

But it doesn’t.  Close one eye.  Still, it doesn’t.  It looks pretty darn bright.  It’s not a little off.  It’s really nearly as bright (maybe 80%) as the one you know to be exactly twice as bright.

Two flashlights are shining together to make the left spot.  Only one is making the spot on the right.  Focus on the center areas.  Notice the one on the right doesn’t seem half as bright as the one on the left.  It seems brighter than that.  Your display is not calibrated.  My camera is doing a little contrast enhancement.  But this should be close enough to reality to illustrate the point.

Your perception of brightness is not linear.  It’s non-linear.  It is VERY non-linear.

In truth, to get the math about right, you’d need six flashlights.  Five stacked to shine on-top of one another, and one to make a spot next to it.  That second spot, which would only be 20% as bright as the one created by the other five flashlights, would look to be about half as bright to you.  Middle grey-ish.

Five flashlights are shining on top of one another to make the spot on the left. A single flashlight is making the spot on the right. Focus on the center of the spots. Notice that in this case, the spot on the right is much closer to feeling about half as bright as the one on the left.  Though it’s really only one fifth as bright.

Ever wonder why a grey card is called an 18% grey card?  Because it’s reflecting 18%, or 0.18 times, the light that a white card does.  It’s done with dyes rather than light sources, in the case of the card.  In this simple example, 20% is close enough to 18% to get it to work well enough.  I don’t want you blowing your budget on 59 flashlights to get an accurate 18%.  Gaffers of the world, I’m sorry if someone demands to see this done accurately for real.
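
If you’d rather check the arithmetic than buy flashlights, the standard CIE L* lightness formula does it for you.  Here’s a small Python sketch showing that 18% (and our 20% approximation) of white’s luminance lands right around a perceived lightness of 50, where white is 100.

def cie_lstar(Y):
    """CIE L* perceived lightness.  Y is linear luminance relative to white (0.0 - 1.0)."""
    if Y > (6.0 / 29.0) ** 3:
        f = Y ** (1.0 / 3.0)
    else:
        f = Y / (3 * (6.0 / 29.0) ** 2) + 4.0 / 29.0
    return 116.0 * f - 16.0

print(cie_lstar(1.00))  # white card              -> 100.0
print(cie_lstar(0.20))  # one of five flashlights -> ~51.8
print(cie_lstar(0.18))  # 18% grey card           -> ~49.5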

If you light with good old Mole-Richardson lights from time to time, you kind of know this.  Going from a 500 to a 1k doesn’t do so much.  You often need to go straight to a 2k if you need a significant change in contrast.  And you never have a 2k when you need one if you are budget-constrained enough to be using old small Moles.

Most speculation is that our eyes are non-linear in this manner to emphasize our ability to look into shadows in mixed-light-environments, such as you’d find in wooded areas.  Blame the sabertooth tiger and man-eating wolf-packs who hid in shadows.  Or possibly the arthropods, depending on when exactly this adaptation developed.

So I’ll just let you stew on that for a bit.  It’ll probably mess with your head for a few weeks. Everything you see is wrong.  Everything.  You can’t see linearly.  Your entire perceptual system presents the world to you in an extremely biased non-linear way.  And you can’t fix that.

Another thing to consider here:  In Photoshop (or even MS Paint), when you pull a color slider to get a middle-grey-ish tone, it usually comes up at 128/255 or around 0.5.  But we just said that actual middle grey is 0.18 or 46/255.  Therefore, something is up with computers, and it’s making them more perceptual or intuitive, rather than accurate.  The computer (and really, most video works the same here) is in on the big lie somehow.  But we’ll get to that later.  It’s just something else to think about while you curse our now extinct and subjugated natural predators once more for making color-science harder to understand.
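
Here is that suspicion as a few lines of Python: decode a mid-scale 8-bit code value back into linear light, assuming a plain gamma 2.2 display and the standard sRGB curve, and you land near 18% of the light, not 50%.

def srgb_to_linear(v):
    """The standard piecewise sRGB decode."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4

code = 128 / 255.0
print(code ** 2.2)           # ~0.218 linear light, assuming a simple gamma 2.2 display
print(srgb_to_linear(code))  # ~0.216 linear light, via the sRGB curve
print(46 / 255.0)            # ~0.18 -- middle grey expressed as a *linear* code value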

Gamma and Log

So, it’s probably been a week.  And maybe you’ve finally come to realize I’m right about the non-linear thing.  And so you’re willing to read on.  If you didn’t spend a week getting used to it, you may not be ready for this section.  The cognitive dissonance should take some time to settle.

Okay, so we see non-linear.  Great.  And yea, the computer (video) seems to be something that’s catering to that reality.  So, why?  How?  Is that good or bad?

I could go into very detailed history here but I’ll paraphrase and simplify it.

Basically, when the early Tele-Vision (transmitted vision, amazing!) engineers standardized NTSC TV, they factored non-linear vision into the technology of TV.  They had limited bandwidth (analog bandwidth) to jam the video signal into.  So since your eyes are biased to see into the darker tones, they wanted to spend more of their bandwidth in that darker range.  They wanted to spend less bandwidth in the lighter tones where you couldn’t discern so much contrast-detail.

If they didn’t bias the signal this way, then the noise that is inevitably a part of analog media systems would become really apparent to you, and they’d have to use more bandwidth to overcome it.  And then there’d be fewer channels.  And it would need more power.  And well, it would be even more of a mess than it already was.

Further, they were able to use the natural characteristics of the cathode-ray-tube (CRT) to aid in this.  A CRT with phosphors on the front actually has a natural non-linear response to the amplitude of the electron beam that it uses to scan and excite the phosphors to produce light.  That non-linear response is what we call gamma.  It’s a power function.  The gamma variable (isn’t that Greek?) is the exponent, and it’s the important one.  And in the case of a CRT, the natural gamma of a CRT is 2.5.  Which means the inverse function is a gamma function with a gamma of 1/2.5 or 0.4.  Many people will tell you that a CRT is gamma 2.2.  They are wrong.  And this is a widespread enough mistake to be in Wikipedia and many books.

The original natural state of an NTSC display system is an uncorrected 2.5.  The hardware to do a correction in the TV set was expensive.  So gamma correction was handled in the camera as a pre-correction.  To create perfect reciprocity, where the light coming from the TV is the same as the light on set, one would pre-gamma-correct the linear signal coming from the sensor, via a gamma 0.4 function.  It would remain encoded in a gamma 0.4 space in the signal, until it hit the front of the CRT, where it went through a natural gamma 2.5 function and became linear again.  It just so happens that gamma 2.5 is close enough to the non-linear characteristics of our eyes that this system works really well to distribute the system noise perceptually, rather than linearly.  This hides it better.  It works well enough anyway.  And so we get lots of TV channels jammed into the spectrum more easily.
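
As a sketch, the whole round trip is just a pair of power functions.  The numbers below are the idealized 2.5 / 0.4 pair described above, not any particular camera’s actual encoding.

def camera_encode(linear_light):
    return linear_light ** (1.0 / 2.5)    # the gamma 0.4 pre-correction in the camera

def crt_display(signal):
    return signal ** 2.5                  # the tube's natural response at the faceplate

scene = 0.18                              # 18% grey in the scene, in linear light
print(camera_encode(scene))               # ~0.50 -- the transmitted signal
print(crt_display(camera_encode(scene)))  # ~0.18 -- linear light again at the screen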

An aside here.  NTSC TV doesn’t seek to create exactly the same light at the display as was in the scene.  It seeks to increase the contrast a little.  This is done because TV is meant to be viewed in a brightly lit space, like a living-room.  So NTSC cameras encoded gamma 0.5 rather than 0.4, in the beginning.  This resulted in a contrast increase or gamma-adjustment for the whole system.  Note that the overall amount of gamma adjustment here, from scene to display, is 1.25 (0.5 × 2.5).  But we’ll continue the document with the simpler 0.4 assumption.  Even though it’s actually a little off.  Even today in HDTV, there is a mismatch between encoding gamma and display gamma, to increase contrast.  Though in practice, the standard is not always followed.

In the case of an early live sportscast, that’s great.  But what if you need to modify the image after it’s been put into a gamma space by the camera?  Say, to put titles over it.  Or to adjust the color?

Well, theoretically, you should take the signal and gamma 2.5 it, to make it linear.  Then you can do your image manipulation in a physically correct linear light space.  Then you can re-encode the results back through gamma 0.4 and all is well, to send it out to the TV.

Here’s the problem:  No one really has bothered to do it right for the most part.  They tend to leave it gamma’d and do the math in a gamma space.  Again, it just so happens that such a gamma space is pretty well matched to our non-linear vision.  So when you do fades and color manipulations in the gamma’d space, in some ways they work better with our vision than they do in reality.  A signal strength of 0.5 is about middle-grey-ish.  Pulling the fader bar on an old GVG-250 to a half way point and seeing the image half as bright does feel natural (but is not as fun as pulling the fader bar on a GVG-250 and firing the death-star).

But in other ways, it’s completely wrong.  One of the major reasons pulling chroma-keys never really worked so hot on a GVG switcher was that it was doing its math in a gamma’d space, rather than an accurate linear one.  The edges could never be right.  Not even in the ideal situation.  Of course that’s not the only problem with simple chroma keys.
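
Here’s a tiny demonstration of the problem, assuming a simple gamma 2.2 encoding (a simplification of the real transfer functions): a 50/50 dissolve between a white pixel and a black pixel delivers very different amounts of light depending on which space you do the math in.

GAMMA = 2.2

def to_linear(code):   return code ** GAMMA
def to_gamma(linear):  return linear ** (1.0 / GAMMA)

white, black = 1.0, 0.0

# The naive way: average the gamma'd code values directly.
naive = (white + black) / 2.0                         # 0.5 code value
print(to_linear(naive))                               # ~0.22 of the light

# The correct way: go to linear light, average, then re-encode for display.
correct = to_gamma((to_linear(white) + to_linear(black)) / 2.0)
print(to_linear(correct))                             # 0.5 of the light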

Anyhow, enough with the history here.  Computers are descended from TV.  Bandwidth becomes quantization.  But it’s the same problem.  And that’s why Photoshop is wrong (what!?!?).  Seriously, wrong.  When you do Photoshop in 8 bit or 16 bit mode, you’re still operating in a gamma’d space.  Which may be useful for some painting and designing tasks.  But actually, it’s totally wrong.  And modern Visual Effects tools like Nuke actually work in linear-light space by default.  Until recently, even AfterFX was doing it wrong.  And in fact, most AfterFX artists still do it wrong.  Proper revelations and fixes for this fact are so recent that most of the motion picture industry still does it wrong.  High-end VFX facilities have had to get it right first.  And hopefully everyone else will follow.  Perhaps sooner rather than later?  Please?

Anyhow, that’s video. Video is gamma.  But what about film?  After all, video clips the highlights of the image off.  Film preserves them.  You can always print the image up or down a couple to a few stops.  The image is in there.  Film is a high-dynamic-range medium.  It captures an ideal white-point and also brightness values higher or beyond that white-point. When we process, print and project film images, we extract a viewable range from the original high dynamic range.  And we can do so with good precision.

A print-stock does have a built-in S-curve “look” here.  It’s something we tend to like.  It compresses the highlights into a specular roll-off at the shoulder, rather than a hard-clip.  Also, it does a similar crushing of the darker tones at the toe of the image.  But it’s important to note that it is only the print-stock that does this.  It’s not in the negative.  Similarly, when you shoot digital photography, the raw image is the raw sensor data.  But the jpeg has a similar S-curve “look” built into it when it is processed and generated.  When you “develop” a raw image in Photoshop, it also adds that kind of “look.”  Though you likely have some control over it in that case.  Print paper that you use in a darkroom is just like print-stock.  It too has an S-curve “look” to it.

The math for gamma doesn’t do so well outside a 0-1.0 domain.  So it’s not really appropriate for encoding the high-dynamic-range of film negative, where by definition, you have values that are whiter than white, or greater than 1.0.

So we needed something else.  And Kodak provided it.  Or well, they went with it.  For the most part, its modern incarnation is called Cineon.  Basically, they use a Log function (base 10, or a common log), which behaves better outside the 1.0 range.  There’s a lot of confusion about this.  Many people will tell you that film, the substance, is Log.  That’s not right.  Cineon is Log (because of this ANSI doc).  And since the usual mathematical representation of film we see is typically Cineon in nature, we conflate the two.  But that’s not right.
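
For the curious, here’s the commonly used 10-bit Cineon-style log-to-linear conversion as a Python sketch.  The reference black at code 95, reference white at 685, 0.002 density per code value and 0.6 negative gamma are the usual defaults you’ll find in most tools, not anything specific to this document.

REF_BLACK, REF_WHITE, DENSITY_PER_CV, NEG_GAMMA = 95, 685, 0.002, 0.6

def cineon_to_linear(code):
    """10-bit Cineon-style log code value to linear light, with the usual defaults.
    Values above reference white come out greater than 1.0 -- which is the whole point."""
    gain = 1.0 / (1.0 - 10 ** ((REF_BLACK - REF_WHITE) * DENSITY_PER_CV / NEG_GAMMA))
    offset = gain - 1.0
    return gain * 10 ** ((code - REF_WHITE) * DENSITY_PER_CV / NEG_GAMMA) - offset

print(cineon_to_linear(95))    # 0.0   -- reference black
print(cineon_to_linear(685))   # 1.0   -- reference white
print(cineon_to_linear(1023))  # ~13.5 -- headroom above white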

Well, that’s not entirely right.  If you want to get deep into the math and physics, film is technically working logarithmically internally.  The physics of how light affects the silver halide grains involves the natural log.  And in reciprocity, the way the silver blocks (attenuates) light that is projected through the film works with the inverse of the same natural log.  So unless you are inside the emulsion looking at individual crystals, you can’t see the natural log component of the film.  We care about exposing film, and transmitting light through the film.  And those two functions have a linear relationship to one another.  Also, within the scientific community, there is pretty strong agreement that optical density is better expressed as a natural log, rather than a common log.  Because at least then, it’s proportional to the thickness or concentration of a neutral density material.  But that has nothing to do with us making movies.

A densitometer is a device that measures the density of a film.  And a densitometer is… well… linear actually.  It directly measures transmittance.  It’s a lot like your old Sekonic analog light meter.  It shines light out one side and it measures the light that comes through the film and onto a photocell.  And then it converts that linear-light value into something useful… which can be a log value.

The Sekonic’s scale under the needle is designed to produce light measurements in footcandles.  Footcandles are a linear measure of light.  But all the values on the wheel are in a log 2 scale.  Note: that’s not the log 10 of Status-M values.  When you line up the wheel based on a log 2 scale of footcandles, you convert from linear to log.  This way, you can spin the wheel (or look up values on the wheel radially) and keep reciprocity between f-stops, ISO and shutter-speed.  And you can do so with a nice, regular distance between stops.  But actually, when you open up a stop at a time, you are exposing logarithmically, not linearly.  It just doesn’t quite seem that way to our eyes, because our perception system is not linear.  Our eyes work more like a log scale.  And again, intuitively, you know this.  Opening or closing a single stop doesn’t do so much with regard to tone.  You usually need to open or close two stops to make a significant tonal shift, equivalent to about half brightness.  Closing down two stops is cutting the light down to 0.25, which is reasonably close to the 0.18 of middle grey.
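
The wheel’s trick, written as math (a trivial sketch, but it makes the reciprocity obvious): exposure steps are just log base 2 of linear light, so each full stop is a doubling.

import math

def stops_between(a, b):
    """How many stops apart two linear light levels are."""
    return math.log2(b / a)

print(stops_between(1.0, 2.0))    # 1.0  -- one stop up is double the light
print(stops_between(1.0, 0.25))   # -2.0 -- two stops down
print(0.5 ** 2)                   # 0.25 -- which is close to 0.18 middle grey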

Just try and make sense of the afore-linked ANSI doc to figure out exactly why densitometers use Log 10 but photographers use Log 2.  But let’s be clear here.  Negative stocks see real linear light.  They are contact printed to a print stock.  The print stock sees linear light.  And the light that comes from a print stock when it is projected is roughly linear.  Print stocks are engineered to add a “look” and mess with contrast in various pleasing ways, but not so much that they lose a rough linear scene reference.  Otherwise, how would we see the images?  This is why you won’t see many people using Status-A density values.  Status-A values are for prints.  Really, the only time you’ll see them is when checking grey patches on prints.  And that’s just comparing the expected Status-A of that particular print stock, to what actually happened when the film was processed.  It’s quality control.  Not scene reference color.

Status-M densities are the most standardized.  They’re the values as measured on negative, in log base 10.  But again, film itself is not log 10.  It’s just Status-M readings that are run through the log base 10 function.

Putting more light through the negative just makes the near-clear-ish parts brighter, and brings the darker parts into a brighter range.  Reducing the brightness of the printer light does the opposite.

And that’s why you can print film up or down some number of stops (using printer lights or exposure times) without seeing the entire tonal characteristics of the film change.  Because film is actually linear for the most part.

If it were not, then you’d see the contrast shift around dramatically  when you print it up or down a couple of stops.  But anyone who’s messed with this at the lab or in a dark-room knows it’s the grain that gets you here.  The tones don’t go all wonky because you print up or down a stop or two.  When the curve does go all awkward on you, it’s because you’re reaching the limits of the stock’s ability to hold image at all.  It’s not that the film is somehow Log 10 or 2 in nature.

So log is a good way to work with high dynamic range images in a mostly perceptual or mathematical way.  It lets us get away with 10-bits per channel.  But it’s nothing particularly special beyond that.  It doesn’t model anything about the physical substance.  We think it might model our perceptual system better than video gamma does though.

Film, unlike video, has generally always been processed chemically and optically.  So it has not suffered from being processed in a Log mathematical space.  Because, as stated, film actually works in linear.  And when you just work with the substance and with printers and such, you end up working in linear by default.  You want to push the image a stop?  Expose it twice as long, or push twice as much light through it.  You want to dodge and burn the image?  You actually dodge or burn the image, with your hands, under a printer light.

Hey, remember I said photoshop is wrong?  The dodge and burn tools operate in the same wrong gamma space.  That’s why your images get way too saturated when you darken them this way.  Some photographers have noticed this and instead “develop” multiple versions of the image from their raws, and mix them.  This at least generates the darker versions correctly and keeps the saturation from getting out of hand.

But actually, that has changed over the past few decades.  A lot of the world’s most important digital color-timing systems have been doing their color-math in log space, rather than converting to linear first.  And this is wrong.  And it is wide-spread.  And the biggest Hollywood DI facilities have been doing it wrong for a couple of decades now in many cases.  It’s starting to get better with the introduction of the Academy’s ACES standard.  But the fact of the matter is that in linear, doubling the light is a simple multiply by 2.  In log, that same multiply means something entirely different.  Digital color-timing that matches the photochemical color-timing process and lighting itself should be done in linear.  Only when you really want to do something in Log, because it’s a more perceptual adjustment, should you then explicitly go into Log space to do the manipulation.  And even then, you probably want to pick the Log base.  Not just go with Log 10 by default.

In high-end VFX land, we convert log values into linear values to do our work.  Then we convert back to log to send them back the way they came.  There are still a number of facilities that manage to do some VFX work in log values.  And this is wrong.  There are also a lot of VFX facilities that do video work in gamma.  They are also wrong, for the same reason.

Color Gamut

Color gamut is where things start to get fun.  We say that we see in red, green and blue.  But of course, there are many shades of red, green and blue.  So which shades do we see in exactly?  And are they the shades we’re shooting and displaying images in?

CIE 1931 XYZ color space is the absolute color-space in which we can represent any color visible to a human and, oddly, some colors that are not visible to humans.  Basically, it was created by measuring human vision and then extrapolating a master space to contain it.

When we make a camera or a display, and pick a specific red, green and blue primary set to base it on, we are defining an RGB color-space within XYZ space.  And by definition, it can represent all colors within the triangle created by plotting those three primaries in Yxy space.  The big Y is luminance or brightness.  We usually don’t plot it.  The little x and little y are the chrominance values.  They’re what we’re normally looking at to define color.

Bruce and I disagree here slightly.  Our iris is variable.  So Luma or brightness of a color isn’t determined until we view it in context and our iris sets a size and it hits our retina.  Luma is relative for cinema color.  When it comes to cameras, a filter also pays no attention to Luma.  And the eventual white-point isn’t defined until later.  So really in those cases, the shape he shows you in his animation should be swept upward in Luma infinitely.  Of course the better way to do that is just to view it in 2D and ignore the Luma.  When you consider that, the triangle rule holds.  Certain operations do have the issue Bruce brings up however.

An RGB camera or display is incapable of expressing colors outside of the triangle or gamut it is built on.  A color that exists outside that triangle is said to be out-of-gamut.  Usually such a color is captured or displayed, but effectively truncated by the device, and therefore altered by the device.

This is where the de-facto rule for TV comes from: “Never wear red on TV.”  NTSC’s color-gamut has a heck of a time properly representing reds.  Its red primary is just not particularly ideal.  NTSC cameras clip your carefully chosen red clothes into all kinds of other red-like colors.

A movie I worked on had a super-bright cherry-red Ferrari in it.  Digital cinema projectors are able to show it correctly.  And the Adobe RGB compositing displays we were working on were able to reproduce it.  But when we did some cleanup work on a spare Autodesk Flame, which uses an HDTV for its display, it could not represent the Ferrari right at all.  Luckily, the work we were doing in the Flame wasn’t a destructive process.  So it didn’t matter.  But we made it a point to never show the client the work while it was on the Flame.  Because it would look wrong no matter what.

A different vendor on that project showed all of their work on their Flame.  Their work was primarily about the car.  So they ended up being forced to screen in our theater with our digital cinema projector and film emulation LUTs.

Another example:  Have you ever noticed the bright green title card: “this picture has been approved for ___ audiences?”  It often looks dramatically different on different displays and projectors.  You may have to be the right age to get this one.  I remember the first time I saw one on a digital cinema projector.  It was the brightest and purest green I’d ever seen.  It was radiant.  My eyes gasped in awe.  Film had never made that green before.  My TV hadn’t either.

A TV’s green primary isn’t wide enough.  Your computer display may be able to handle that kind of green if it’s an Adobe RGB panel rather than an sRGB one.

Digital cinema projectors have a very specific and wide gamut by design.  If you want a good explanation of everything that went into the original choices for digital cinema projectors, look no further than the book: Color and Mastering for Digital Cinema.

The gamut defined originally by DCI was called P3.  For a while, it was an important gamut.  It has been de-emphasized recently.  Sometimes it’s called the DCI Reference Projector.  Sometimes it’s called the SMPTE Reference Projector (and boy is that one confusing, because it can be confused with a film projector that way).  P3 can still be very useful when you are looking for a gamut that’s very close to film’s natural printed gamut (which is not actually a true gamut, due to it being partially dye based.  But it has a mostly triangular shape… and it is still exposed through filters).  And really, P3 has just changed from a target projection gamut, to a minimum acceptable projection gamut.  The DCI is anticipating laser projectors and other technologies that will allow even larger color gamut projection.  So they widened the spec to allow for arbitrary improvements.

Instead of working in P3, it appears the industry is moving toward working in a space called the Academy Color Encoding System (ACES).  There is a lot to ACES.  But of note here, is the RGB color space it defines.

The ACES colorspace is very wide.  It fully encloses both the Rec 709 (HD TV) color space and the P3 color space.  In theory, it fully encloses any digital cinema camera’s color space as well.  ACES is meant to be a universal digital-negative and working space.

You can often map colors from one gamut to another without altering their appearance.  When you gamut map colors, you make it such that colors meant to be displayed by a device based on a particular gamut will look as they should on a different device based on a second gamut.  Only the colors that are mapped correctly for their respective device will look correct at any one time.  Mismatching colors and devices will look incorrect.

If one gamut completely encloses a second gamut, you can gamut map from the smaller gamut into the larger gamut without any loss.  However, you cannot gamut map from the larger gamut into the smaller gamut without loss.

If gamuts partially overlap one another, then a gamut map will be destructive either way.

If you do this with floating point math, you get negative values when colors go out of gamut.  Depending on what you are doing to the light when it’s negative, a round trip may be possible without loss.  The negative numbers will be back to positive when they get back to their original gamut.  The question is whether the manipulation you made while they were negative had any meaning in a domain less than 0.

This is why the ACES gamut is so large.  It is meant to enclose anything filmic and never clip (within reason).  Once an image goes ACES, it should never need to go to anything with a smaller gamut right up until it’s projected.  And ideally, in time, projectors will be able to handle all visible color within ACES.  In this way, ACES is future-proof.

Gamut mapping is accomplished by building a 3×3 transformation matrix.  In reality, you build two matrices and concatenate them into one.  You first convert from your current RGB space into XYZ colors.  Then, you convert from XYZ colors into your target RGB space.  The actual math for this can be found on Bruce Lindbloom’s site (scary big square brackets and such).

The important thing to take away here, though, is that you can always do an accurate gamut map in linear-light as long as you know your source RGB primaries and your destination RGB primaries.  And really, all it takes is calculating the 9 numbers of a 3×3 matrix.  Proper color management is about knowing what your image is, and where it’s going.
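
Here’s a sketch of that construction in Python, following the RGB -> XYZ -> RGB recipe.  I’m using the published Rec 709 primaries and, to keep the example free of chromatic adaptation, a D65 variant of P3 rather than the DCI white point; the matrix you’d actually deploy depends on your real source and destination.

import numpy as np

def rgb_to_xyz_matrix(rx, ry, gx, gy, bx, by, wx, wy):
    """Build an RGB -> XYZ matrix from CIE xy chromaticities of the primaries and white."""
    # Columns are the XYZ of each primary, before scaling.
    m = np.array([[rx / ry,             gx / gy,             bx / by],
                  [1.0,                 1.0,                 1.0],
                  [(1 - rx - ry) / ry,  (1 - gx - gy) / gy,  (1 - bx - by) / by]])
    white = np.array([wx / wy, 1.0, (1 - wx - wy) / wy])
    scale = np.linalg.solve(m, white)       # make RGB (1,1,1) land on the white point
    return m * scale

REC709 = rgb_to_xyz_matrix(0.640, 0.330, 0.300, 0.600, 0.150, 0.060, 0.3127, 0.3290)
P3_D65 = rgb_to_xyz_matrix(0.680, 0.320, 0.265, 0.690, 0.150, 0.060, 0.3127, 0.3290)

# Concatenate the two halves into one 3x3: Rec 709 -> XYZ -> P3.
M_709_TO_P3 = np.linalg.inv(P3_D65) @ REC709

linear_709_pixel = np.array([1.0, 0.0, 0.0])   # pure Rec 709 red, in linear light
print(M_709_TO_P3 @ linear_709_pixel)          # the same color, expressed in P3 terms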

LUTs

LUT stands for Look Up Table.  It’s a computer science thing in this case.  Gamut maps, as described earlier, are simple transforms.  But some color manipulations are really very complex.  They may not even really be meant to be modeled mathematically.  They may be crafted by hand, using many different tools.  Or they may be natural and messy phenomena.  A LUT is a way to store and apply complex color manipulations.  A LUT doesn’t depend on formulas.  Rather, it tries to store “what happens” to colors when they go through a complex color transform.  Then it makes that information available to do again to other images.

A 1D LUT assumes the color transform is orthogonal.  In other words, it assumes that the R, G and B values don’t affect one another.  A gamma function or more complex contrast curve could be stored in a 1D LUT.  R, G and B can each have their own curve.  So, the differences between the three primary curves can affect more than just contrast.  They can tint (modulate) different overall ranges in different ways.

However, a 1D LUT cannot express a change in hue.  It can’t express crosstalk correctly either.
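
A sketch of what applying a 1D LUT amounts to, in Python: one curve per channel, and the channels never see one another, which is exactly the limitation above.  The 1024-entry gamma curve at the bottom is just an example.

import numpy as np

def apply_1d_lut(img, lut_r, lut_g, lut_b):
    """img is float RGB in 0-1.  Each lut_* is a 1D array of output values
    sampled at evenly spaced input positions (all three the same length here)."""
    xs = np.linspace(0.0, 1.0, len(lut_r))
    out = np.empty_like(img)
    out[..., 0] = np.interp(img[..., 0], xs, lut_r)
    out[..., 1] = np.interp(img[..., 1], xs, lut_g)
    out[..., 2] = np.interp(img[..., 2], xs, lut_b)
    return out

# Example: a gamma 1/2.2 curve stored as a 1024-entry 1D LUT, applied to all channels.
curve = np.linspace(0.0, 1.0, 1024) ** (1.0 / 2.2)
# graded = apply_1d_lut(linear_img, curve, curve, curve)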

To contain even more complex color transforms, one uses a 3D LUT.  A 3D LUT is sometimes referred to as a color-cube.

A 1D LUT can often hold every possible input color and its output color.  A 3D LUT usually cannot.  This is because the memory required to hold a cube of that precision is unrealistically large.  Instead, a 3D LUT stores a sparse lattice of cross points.  When you apply a 3D LUT, you interpolate between the cross points.  Typical “sizes” of 3D LUTs include 17×17×17 and 32×32×32.
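
And a sketch of a 3D LUT lookup with trilinear interpolation over a 17×17×17 lattice.  The identity lattice here is only for illustration; a real LUT would be filled with measured or hand-crafted values.

import numpy as np

N = 17
# An identity lattice: lut[r, g, b] holds the output RGB for that cross point.
grid = np.linspace(0.0, 1.0, N)
lut = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)

def apply_3d_lut(lut, rgb):
    """Trilinear interpolation of one RGB value (0-1 range) through the lattice."""
    n = lut.shape[0]
    pos = np.clip(np.asarray(rgb, dtype=np.float64), 0.0, 1.0) * (n - 1)
    lo = np.minimum(pos.astype(int), n - 2)   # lower cross point index per axis
    f = pos - lo                              # fractional position inside the cell
    out = np.zeros(3)
    for dr in (0, 1):                         # blend the 8 surrounding cross points
        for dg in (0, 1):
            for db in (0, 1):
                w = ((f[0] if dr else 1 - f[0]) *
                     (f[1] if dg else 1 - f[1]) *
                     (f[2] if db else 1 - f[2]))
                out += w * lut[lo[0] + dr, lo[1] + dg, lo[2] + db]
    return out

print(apply_3d_lut(lut, [0.25, 0.5, 0.75]))   # the identity lattice returns the input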

Generally, one characterizes a LUT with the following information.

  • input profile curve
  • input gamut
  • purpose/content
  • output gamut
  • output profile curve

So, for example:

gamma 2.2 -> sRGB -> seq1 -> P3 -> gamma 2.6

Expects sRGB color at gamma 2.2 (a PC).  Executes the coloration for some sequence or another in the movie.  Delivers it as P3 color at gamma 2.6 (a reference digital cinema projector).

or:

filmScan -> log -> seq2 -> P3 -> gamma 2.6

Expects film scanner primaries encoded in log (cineon).  Executes the coloration for a different sequence in the movie.  Delivers it as P3 color at gamma 2.6 (a reference digital cinema projector).


Film Emulation LUTs

Film emulation LUTs are LUTs that seek to contain a complex color transform that mimics a film processing and print run.  Usually, they are built by recording a set of color patches onto a negative, processing that negative, and then printing that negative (optionally going through IP and IN first).  The resulting print then has the color patches read.  Often, that is done with a densitometer.  Though it can also be projected and read with a spectrometer.

If you build a film emulation LUT for a particular print-stock, you can then predict what a recorded and processed image will look like while you are color-timing in a digital system.  This is the principal technology that allows for the Digital Intermediate (DI) process.  Modern film emulation LUTs are exceedingly accurate.  You are far more likely to see print variances (over or under processing) in side-by-side comparison, than actual problems with the LUT, in a modern DI facility.

Film emulation LUTs have classically been treated as super-secret proprietary technology.  When in actuality, they just cost a certain amount of money to get made from one stock or another.  You just need to buy the right hardware and software tools.  This has hindered collaboration in the VFX industry in particular.  The proprietary nature of film LUTs is something everyone groans about, and yet it still remains the norm in many situations.  Luckily, Sony Pictures Imageworks released one under an open license as part of their OpenColorIO project.  I’m not sure they fully meant to.  But they did.  It’s a: film-scan gamut -> log -> Kodak Vision print stock -> P3 -> gamma 2.6 LUT.

On Set Color

There are many ways to approach color on set.

If you are shooting digital, the most important thing is to make sure you are capturing raw sensor data.  And further, that you know the color gamut your sensor is shooting in.  Ideally, your camera is not so much calibrated as it is profiled.  Two cameras of the same make and model can have slightly different color filters on their sensors due to manufacturing variances.  They likely target the same gamut.  But they’re both likely off.  What’s best is if you have actual profiles of exactly what they are, so they can be rectified later via gamut mapping.  More modern digital cinema cameras embed this information in their raw files.

The Academy would like a world where you are capturing ACES format directly.  More likely, you are capturing some kind of proprietary or camera specific raw format.  What is critical here, is that your digital laboratory can get your digital negative into an ACES compliant state without messing up your raw data.

If you shoot film, then it’s a little murky.  But it’s worth considering that the film-scanner that will scan the neg both for VFX and the DI, has RGB filters.  And therefore, a gamut can be extrapolated.  The neg can be put in an ACES container correctly.

It’s really important to recognize that when you are on set, you cannot truly look at the image correctly.  Looks and prints are meant to be seen in a blacked-out environment with a specific brightness.  No matter how well calibrated your display, it’s not in a dark room.  It’s not big.  And it’s not the right overall brightness.  There are ways to adjust the image to try and overcome these issues.  But they’re often more trouble than they’re worth.  Nothing is particularly good at mitigating this factor.  Maybe constructs of duvetyne and gaffer’s tape will do a good enough job.  I’ll leave it to you to decide if that could be worth it.

Also, you need to choose your primary display carefully if you are going to get very picky about color on set.  Standard HD displays and sRGB computer displays do not have nearly wide enough of a gamut to match digital cinema cameras and projectors.  Ideally your on-set display would be a native P3 display.  More likely, you’ll be able to get an Adobe RGB display.  But most realistically, you’ll just go with standard HD (Rec 709) and know that its gamut is not wide enough.  Whatever it is, make absolutely sure your LUTs are targeting the gamut of your primary display with the correct gamut mapping for that display (if you’re going to pay attention to it, that is).  It’s technically possible to profile your exact display panel and target it directly with a LUT.  But it’s often more reasonable to just pick a good standard gamut, and make sure everything targets that gamut.

In an ideal world, we might start using something like x.v.color to help make a transition out of our Rec 709 dependence (in on-set and editing environments) and into a wider color gamut.  But we don’t seem to be in that world.  The tools don’t really support it correctly just yet.

Another option might be to use boxes from companies like Blackmagic Design.  Usually you can find one that takes SDI in, applies a user-supplied 3D LUT, and outputs to DVI or the like.  If you combine one of these boxes with a wide-gamut Adobe RGB display, you can get nice wide-gamut color for relatively little money.  Though it’s hard to get really big screen Adobe RGB displays.

Chromatic Adaptation or Color Temperature

Bruce Lindbloom provides the math to execute very high quality color temperature adjustments in linear XYZ space.  Much like gamut mapping, this takes the form of a 3×3 matrix.

You need to map from your RGB color space, into XYZ first.  Then you execute the color temperature adjustment.  Then you map back into your RGB space.
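
For the curious, here’s what that looks like as a sketch, using the linear Bradford method documented on Bruce Lindbloom’s site.  The Bradford matrix and the D65 / D50 white points are the standard published values; going from D65 to D50 is roughly a 6500K-to-5000K color temperature move.

import numpy as np

BRADFORD = np.array([[ 0.8951,  0.2664, -0.1614],
                     [-0.7502,  1.7135,  0.0367],
                     [ 0.0389, -0.0685,  1.0296]])

def xy_to_XYZ(x, y):
    return np.array([x / y, 1.0, (1 - x - y) / y])

def adaptation_matrix(src_white_xy, dst_white_xy):
    """XYZ -> XYZ chromatic adaptation matrix, linear Bradford style."""
    src_lms = BRADFORD @ xy_to_XYZ(*src_white_xy)   # cone-ish response of each white
    dst_lms = BRADFORD @ xy_to_XYZ(*dst_white_xy)
    scale = np.diag(dst_lms / src_lms)              # von Kries style scaling
    return np.linalg.inv(BRADFORD) @ scale @ BRADFORD

M_D65_TO_D50 = adaptation_matrix((0.3127, 0.3290), (0.3457, 0.3585))

# Usage: XYZ_adapted = M_D65_TO_D50 @ XYZ, then map back into your RGB space.
print(M_D65_TO_D50 @ xy_to_XYZ(0.3127, 0.3290))     # the D65 white lands on the D50 white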

It’s extremely rare to find someone executing this correctly outside of some very specific software.

Camera RAW “developing” utilities tend to get this right.  Such as:

  • Adobe’s camera RAW utilities
  • RED’s RAW tools

I’ve also noticed that Nuke’s color management tool gets it mostly right.  Though it’s a little bugged.

I’ve not seen a color-timing system that did it right in a way that works for all footage however.

So if you’ve had people try to adjust color temperature in the DI for you digitally in the past, it likely wasn’t done right.  We really need to get to the point that we have the discipline to be able to use this kind of color-math correctly.

Work-Print and Editing

Since an AVID system is completely capable of replacing its master clips, you can change your approach to color during the editing process as you wish.

You can start with one-light rushed dailies with a terrible LUT (or no LUT at all).

You can move to best-light dailies run through a film emulation LUT (or no LUT at all).

You can move to best-light dailies run through sequence-look LUTs.

You can use a bunch of SDI LUT boxes to work in a wide gamut if you like.  Just make sure everyone is in on that plan.

It’s all scratch media.  So it’s important to recognize that it’s not final color.  It’s likely not even close.  But if it’s important to get it closer, you can always do a color-timing session and then re-import new master clips that get you closer to where you need to be.

Ideally, we’d move to a world where your AVID media is actually just un-timed raw-ish log images, and you are applying Color Decision Lists (CDLs) to your master clips on the fly and en masse.  You’d be running tracks/sequences/outputs through a small set of sequence LUTs or film emulation LUTs.  That way, you could leave all the shot-for-shot color balancing to an editorial colorist managing the CDLs.  You could apply looks to sequences or sections easily in the timeline.  You could approach color-timing non-linearly alongside non-linear editing in a non-destructive way.  And you’d still be able to pull all that color off of the clips in the final DI, should your colorist want to start from scratch.

Unfortunately, we don’t live in that world with those tools and that workflow just yet.  But it would be nice.  Not only do these tools need to embrace a better workflow.  But they also need to do the math right.  They need to learn to linearize data and apply CDLs in linear.  Color-timing systems need to learn to do this too.  It’s a mess. But it won’t get straightened out until cinematographers demand that it be done right, and learn how to verify that it’s being done right.
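
For reference, here’s a sketch of what “apply CDLs in linear” means mechanically: the ASC CDL slope / offset / power math, run per channel on linearized pixel values.  The grade numbers are made up; the point is where in the pipeline the math happens, not the grade itself.

import numpy as np

def apply_cdl(linear_rgb, slope, offset, power):
    """ASC CDL SOP: out = (in * slope + offset) ** power, per channel, on linear data."""
    rgb = np.asarray(linear_rgb, dtype=np.float64)
    graded = rgb * slope + offset
    graded = np.maximum(graded, 0.0)          # the power function needs values >= 0
    return graded ** power

pixel = np.array([0.18, 0.18, 0.18])          # middle grey, in linear light
print(apply_cdl(pixel, slope=np.array([1.2, 1.0, 0.9]),
                       offset=np.array([0.0, 0.0, 0.01]),
                       power=np.array([1.0, 1.0, 1.0])))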