Converting Y'CbCr to R'G'B'
Charles Poynton
circa 1989 [edited 2005-04-28]
This note describes how to convert digital Y'CbCr to digital R'G'B'.
Y'CbCr is often sampled with non-square pixels. For example, CCIR Rec. 601
specifies 525/59.94/2:1 sampling at 720x484 and 625/50/2:1 sampling at
720x576. If the end result of the decoding is to have square pixels, this
can be accomplished by horizontal resampling, and you might as well do it
now. The picture in both cases has an aspect ratio of 4:3, so the 720
samples of 525/59.94/2:1 need to be turned into about 644; a cheap way to
approximate this is to drop every tenth pixel for 9/10 decimation, which
yields 648. The 720 samples of 625/50/2:1 can be turned into the 768
required for square pixels by repeating every 15th pixel for interpolation
by 16/15. This
cheap-and-dirty resampling will introduce resampling artifacts into
high-quality video, and higher picture quality will be achieved with
better-quality resampling filters.
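As a rough sketch of the cheap-and-dirty approach (Python; the function
names are hypothetical and one scan line is assumed to be a plain list of
samples): dropping one sample in every ten takes 720 samples to 648, and a
16/15 stretch means emitting one extra sample for every fifteen input
samples, which takes 720 to 768.

```python
def drop_every_tenth(row):
    """9/10 decimation: discard the last sample of each group of ten."""
    return [s for i, s in enumerate(row) if i % 10 != 9]


def repeat_every_fifteenth(row):
    """16/15 interpolation: emit the last sample of each group of fifteen twice."""
    out = []
    for i, s in enumerate(row):
        out.append(s)
        if i % 15 == 14:
            out.append(s)  # the repeated sample
    return out
```

Neither function filters anything, so both will show the resampling
artifacts described above.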
Sometimes 4:2:2 coding is loosely called "Y'UV" but this is incorrect. "U"
and "V" refer to B'-Y' and R'-Y' scaled by 0.493 and 0.877 respectively so
as to limit the COMPOSITE NTSC or PAL excursion, after chroma modulation
and the addition of luminance, to the range -33 IRE to +133 IRE. This is
the scaling that applies to decoding R'G'B' from Y'UV components separated
from composite video, but COMPONENT colour difference signals are scaled
differently.
Cb and Cr refer to colour difference components scaled for unity excursion
+-0.5: Cb is (0.5/0.886)*(B'-Y') and Cr is (0.5/0.701)*(R'-Y'). Pb and Pr
are scaled AND OFFSET for unity excursion 0 to +1. In the digital standard
CCIR Rec. 601 and its derivatives, the digital coding for Y' has black at
code 16 and white at code 235; the coding for Cb and Cr is offset binary
with zero at code 128 and the extrema at codes 16 and 240.
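To make the code levels concrete, here is a minimal sketch of the forward
(encoding) direction under the definitions above. The function name
encode_601 is hypothetical, the inputs are assumed to be gamma-corrected
R'G'B' values in the range 0 to 1, and rounding to the nearest code is my
assumption rather than anything the standard is quoted as saying here.

```python
def encode_601(r, g, b):
    """Gamma-corrected R'G'B' in 0..1 -> 8-bit (Y', Cb, Cr) codes."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (0.5 / 0.886) * (b - y)   # excursion +-0.5
    cr = (0.5 / 0.701) * (r - y)   # excursion +-0.5
    # Y' spans codes 16..235 (219 steps); Cb and Cr span 16..240 about 128.
    return (round(16 + 219 * y),
            round(128 + 224 * cb),
            round(128 + 224 * cr))
```

Black and white land on Y' codes 16 and 235 with Cb and Cr at 128, and
fully saturated blue and red drive Cb and Cr respectively to code 240.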
Cb and Cr are commonly subsampled horizontally by a factor of two with
respect to luminance (Y'), that is, you get all the even-numbered samples
but not the odd. This is so-called "4:2:2" coding. You may have so-called
"4:4:4" sampling with as many Cb's and Cr's as Y' in which case no work
need be done here, but if not, you'll have to interpolate the missing
samples. For debugging, or for ordinary VHS-quality video, simple
replication will do; linear interpolation such as
Cb[i] = (Cb[i-1]+Cb[i+1])/2 is better;
but for highest quality you need a suitable FIR filter. Try
    Cb[i] = (160*(Cb[i-1]+Cb[i+1])
             -48*(Cb[i-3]+Cb[i+3])
             +24*(Cb[i-5]+Cb[i+5])
             -12*(Cb[i-7]+Cb[i+7])
              +6*(Cb[i-9]+Cb[i+9])
              -2*(Cb[i-11]+Cb[i+11]))/256;
This is the general idea; I'll leave the optimization to you. Watch out for
signs, scaling, overflow, and the end conditions; it is probably best to
surround the picture data with zero samples (zero colour difference, which
is code 128 in offset binary).
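A minimal, unoptimized sketch of that filter in Python follows. It rests on
some assumptions of mine: the function name upsample_chroma is
hypothetical, the input list holds one line of Cb (or Cr) at the
even-numbered luma positions only, the samples are already signed (the 128
offset removed) so that padding with zero means "no colour", and the result
is left as a float rather than rounded.

```python
# One coefficient per tap pair, nearest pair first.  The six values sum
# to 128, so the twelve taps sum to 256 and a flat input is preserved.
TAPS = (160, -48, 24, -12, 6, -2)


def upsample_chroma(c):
    """4:2:2 -> 4:4:4 for one line: keep the even samples, compute the odd."""
    n = len(c)

    def at(k):
        # End condition: treat everything outside the line as zero.
        return c[k] if 0 <= k < n else 0

    out = []
    for k in range(n):
        out.append(c[k])  # even position 2k, passed through unchanged
        # Odd position 2k+1 sits between c[k] and c[k+1]; the j-th tap
        # pair reaches j samples further out on each side.
        acc = sum(t * (at(k - j) + at(k + 1 + j)) for j, t in enumerate(TAPS))
        out.append(acc / 256)
    return out
```

Away from the ends of the line, a constant input comes back unchanged and a
linear ramp is interpolated exactly at its midpoints, which makes a quick
sanity check on signs and scaling.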
Second, you need to scale "Cb and Cr" to real B'-Y' and R'-Y'. The
standard terminology says that Cb is (0.5/0.886)*(B'-Y') and Cr is
(0.5/0.701)*(R'-Y'), which makes Cb and Cr excurse +-0.5, but sometimes
people play a little loose with the nomenclature and scale Cb and Cr
differently. To check, just digitize 100% colour bars; Cb and Cr will
reach their positive/negative maxima on blue/yellow and red/cyan
respectively; theoretically this should be +-0.5 so that the inverse of
the above scaling results in B'-Y' having extrema +-0.886 and R'-Y' having
extrema +-0.701. In any case, un-scale to put the B'-Y' and R'-Y' maxima
where you want them. For example, if you are coding into R'G'B' between 0
and 255 then you'll need to scale B'-Y' into the range +-0.886*255 or
+-226, and R'-Y' into the range +-0.701*255 or +-179.
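As a sketch of this un-scaling, assuming the input really is 8-bit Cb and
Cr with zero at code 128 and extrema at codes 16 and 240, and that the
target is R'G'B' between 0 and 255 (the function name and the out_range
parameter are hypothetical):

```python
KB = 0.886  # maximum of B'-Y' for unit-range R'G'B'
KR = 0.701  # maximum of R'-Y' for unit-range R'G'B'


def colour_differences(cb_code, cr_code, out_range=255.0):
    """8-bit Cb, Cr codes -> (B'-Y', R'-Y') in output code units."""
    cb = (cb_code - 128) / 224.0   # remove offset, normalize to +-0.5
    cr = (cr_code - 128) / 224.0
    b_minus_y = cb * (KB / 0.5) * out_range
    r_minus_y = cr * (KR / 0.5) * out_range
    return b_minus_y, r_minus_y
```

With these assumptions the two scale factors work out to roughly 2.017 per
Cb code and 1.596 per Cr code. If your colour bars show Cb and Cr reaching
something other than codes 16 and 240, adjust the 224 accordingly.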
Third, you may or may not need to scale luminance. If Y' comes in with the
same excursion that you want for R'G'B' out, then no work is necessary.
However if Y' comes in with black at 16 and white at 235, and you want
R'G'B' between 0 and 255 out, then you'll need to scale Y' up accordingly.
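For the black-at-16, white-at-235 case this is a one-line stretch. A small
sketch, with a hypothetical function name and the levels exposed as
parameters in case your source uses different ones:

```python
def scale_luma(y_code, black=16, white=235, out_max=255.0):
    """Map Y' codes so that black -> 0 and white -> out_max."""
    return (y_code - black) * out_max / (white - black)
```

Codes below 16 or above 235 come out negative or above 255, so the result
is deliberately left unclamped at this stage.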
Fourth, now you've got Y', B'-Y', and R'-Y' so you can do the obvious to
get B' and R'. If you solve for G' in terms of what the encoder did, that
is Y'=0.299*R'+0.587*G'+0.114*B', you find that
G'=(Y'-.114*B'-.299*R')/.587
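Put together, this step is only a few lines. The sketch below assumes Y',
B'-Y', and R'-Y' have already been brought to the same scale (for example,
all in 0 to 255 output units); the function name is hypothetical.

```python
def decode_rgb(y, b_minus_y, r_minus_y):
    """Y' plus the two colour differences -> (R', G', B'), same scale."""
    b = y + b_minus_y
    r = y + r_minus_y
    # Invert the encoder's Y' = 0.299*R' + 0.587*G' + 0.114*B'.
    g = (y - 0.114 * b - 0.299 * r) / 0.587
    return r, g, b
```

The result is not clamped or rounded here.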
Fifth, in television [and in computer graphics!] the cathode of the CRT
performs light=voltage^gamma (where 'gamma' has the value 2.2 or so).
Therefore the first thing the Y'CbCr ENcoder did was to raise each of
linear-light R, G, and B to the 1/2.2 power to form R', G', and B'; that
is, something between a square root and a cube root. You have four cases
in decoding:
(a) If you just want to place R'G'B' into a frame buffer for display on an
RGB monitor, then use the decoded gamma-corrected R'G'B's directly with a
unity (linear) lookup table; the monitor will un-gamma for you.
(b) If your lookup table is already set up to take linear light and
gamma-correct for the monitor, then raise to the 2.2 power to undo the
gamma correction that has already taken place, but understand that the
cascaded gamma-ungamma-gamma-ungamma operations will erode your
signal-to-noise ratio really seriously.
(d) With any other colour map, all bets are off. For an eight-bit
colour-mapped system, you'll have to code into the available colourmap,
derive a suitable colourmap for this picture, and possibly dither in some
manner.
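For case (b), undoing the gamma correction is most easily done with a
table. A minimal sketch, assuming 8-bit codes in and out and a pure 2.2
power law (the function name and its parameters are hypothetical):

```python
def degamma_lut(gamma=2.2, size=256):
    """Table mapping gamma-corrected codes to linear-light codes."""
    top = size - 1
    return [round(top * (i / top) ** gamma) for i in range(size)]
```

Building this table at eight bits shows the signal-to-noise problem
directly: the darkest dozen or so input codes all collapse to output
code 0.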
You should confirm that no step can overflow any intermediate result. In
addition, it is prudent to clamp the final R'G'B's between 0 and 255,
because there are combinations in which each of Y', Cb, and Cr is within
its own limits but which nevertheless decode to illegal R'G'B' values.
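A sketch of the final clamp, assuming the decoded components are floats in
0 to 255 output units (the name clamp8 is hypothetical):

```python
def clamp8(x):
    """Round to the nearest integer and limit to the range 0..255."""
    return max(0, min(255, round(x)))
```

For example, Y' at white together with Cb at its positive extreme is legal
component by component, yet it decodes to a B' of about 255 + 226 = 481,
which clamp8 brings back to 255.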
--
Charles Poynton
vox: +1 416 486 3271
fax: +1 416 486 3657
poynton@poynton.com [preferred, Mac Eudora, MIME, BinHqx]
--