What is 4K HDR Tone Mapping?
It's like trying to fit 10 litres of water into a 5 litre container
The arrival of High Dynamic Range has created a problem for the current generation of HDR TVs: many can't fully reproduce HDR content, and even the best can't handle some of it. Why is this? HDR content is currently graded using the wider DCI-P3 colour gamut and a peak brightness of either 1,000 or 4,000 nits. A few TVs can now reach nearly 100% of DCI-P3 but many have colour gamuts that are considerably smaller. In addition, although some TVs can reach a peak brightness of over 1,000 nits, most HDR TVs can't go that high and none can get anywhere near 4,000 nits. This means that a method had to be found to represent HDR material on a TV in a way that retains the content creator's artistic intentions but also recognises the limitations of the display itself.
If you combine the colour gamut and dynamic range (the difference between black and peak white) used for a piece of HDR content you create what is called the colour volume – which is essentially a three dimensional representation of the HDR content in question. You can also create a colour volume for any TV by combining the colour gamut and dynamic range of that display but, in the case of most TVs, the colour volume of the HDR content is larger than that of the actual display. The image above shows the colour volume of some HDR content on the left and the colour volume of the corresponding HDR display on the right and, as you can see, the colour volume of the HDR content is considerably larger in this particular example.
To use an analogy, if the colour volumes in the example above were jugs of water then the HDR jug would contain ten litres but the TV jug would only be able to hold five litres of water. So the problem is, how do you fit ten litres of water into a container that's only big enough for five? Obviously the simple answer is that you throw away five litres but whilst that might work for water, it isn't as easy when it comes to a TV image. A solution needs to be found whereby five litres can be thrown away, thus allowing the remaining five litres to fit into the TV jug, but at the same time, retaining the essence of that original ten litre jug. That solution is generally referred to as tone mapping.
When a TV reproduces an HDR signal it is applying an algorithm that transposes, or maps, the larger colour volume of the HDR content to the display's smaller native colour volume. The example above shows the larger colour volume of the HDR content being mapped to the smaller colour volume of the HDR display. How that mapping is applied will depend on the nature of the HDR content and the capabilities of the display itself. For example, if a TV can deliver 100% of DCI-P3 and 1,000 nits of peak brightness, then for content graded at 1,000 nits the TV will take a one-to-one approach and simply show the content as it was created. However if that same TV wants to show content graded at 4,000 nits, it will have to map those brighter peak highlights to its 1,000 nits peak luminance.
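To make the 4,000-to-1,000 nits case concrete, here is a minimal sketch of one possible luminance mapping. It is an assumption for illustration only: the function name `tone_map_nits`, the `knee_fraction` parameter and the shape of the curve are hypothetical and are not taken from any manufacturer's algorithm. It also ignores colour gamut entirely and works directly in nits, whereas a real TV would typically work on the encoded signal.

```python
def tone_map_nits(luminance, content_peak, display_peak, knee_fraction=0.75):
    """Map a luminance in nits from graded content onto a dimmer display.

    Hypothetical knee curve for illustration, not any manufacturer's method.
    """
    if not 0.0 < knee_fraction < 1.0:
        raise ValueError("knee_fraction must be between 0 and 1 (exclusive)")

    if content_peak <= display_peak:
        # The display covers the graded range: show the content one-to-one.
        return min(luminance, display_peak)

    knee = knee_fraction * display_peak
    if luminance <= knee:
        # Below the knee nothing is changed, so most of the image is untouched.
        return luminance

    # Above the knee, squeeze [knee, content_peak] into [knee, display_peak].
    # The curve leaves the knee with a slope of 1 and reaches display_peak
    # exactly when the input reaches content_peak.
    headroom = display_peak - knee
    excess_range = content_peak - knee
    u = min(luminance, content_peak) - knee
    b = 1.0 / headroom - 1.0 / excess_range
    return knee + u / (1.0 + b * u)


if __name__ == "__main__":
    for nits in (100.0, 500.0, 1000.0, 2000.0, 4000.0):
        mapped = tone_map_nits(nits, content_peak=4000.0, display_peak=1000.0)
        print(f"{nits:6.0f} nits in the grade -> {mapped:6.1f} nits on screen")
```

With these assumed settings, anything up to 750 nits is shown exactly as graded, a 1,000 nits highlight lands at about 880 nits, and only the 4,000 nits peak reaches the display's full 1,000 nits. Moving the knee lower preserves more highlight detail at the cost of altering more of the picture.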
You may be wondering how the TV even knows what colour gamut or peak brightness was used when the HDR content was graded. The answer to that question brings us conveniently on to metadata. When HDR content is graded by a colourist, using a professional monitor that is capable of 100% of DCI-P3 and either 1,000 or 4,000 nits of peak brightness, this information is included as metadata, which can be expressed in one of two ways: as either static or dynamic metadata. In the case of static metadata there are two key numbers, the black level (usually 0 or 0.05 nits) and the peak luminance (usually 1,000 or 4,000 nits), which apply to the entire film. However, when it comes to dynamic metadata, the black level and peak brightness can be changed on a scene by scene and even frame by frame basis.
That is why there has been a lot of publicity surrounding dynamic metadata: the less capable an HDR display, the harder it is to reproduce a good HDR image using static metadata. Tone mapping HDR content using static metadata is much harder because the TV must take the maximum colour volume that applies to that material and fit it into the TV's own colour volume for the entire length of the content. Although some scenes will have very high peak luminance highlights, the majority won't, and so the tone mapping ends up over-compressing this much larger number of less demanding scenes. The result is that HDR content can often appear rather dim overall, because the tone mapping is transposing less bright scenes based on a peak brightness number that only applies to a small number of genuinely bright scenes. The tone mapping is essentially over-compensating for the peak luminance number included in the static metadata, as in the image above.
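The dimming effect described above can be shown with some simple arithmetic. The sketch below is deliberately crude: it assumes the TV applies a single linear scale factor derived from the peak luminance in the metadata, which exaggerates what a real TV's gentler curve would do but makes the difference between static and dynamic metadata easy to see. The `Scene` class, the scene names and their peak values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    name: str
    peak_nits: float  # brightest highlight in this scene (made-up values)


def scale_factor(reference_peak, display_peak):
    """Crude global scale that fits reference_peak inside display_peak."""
    return min(1.0, display_peak / reference_peak)


def compare_metadata(scenes, display_peak, static_peak):
    """Return (name, static result, dynamic result) for each scene's peak."""
    results = []
    for scene in scenes:
        # Static metadata: one peak value governs the whole film.
        static_out = scene.peak_nits * scale_factor(static_peak, display_peak)
        # Dynamic metadata: each scene carries its own peak value.
        dynamic_out = scene.peak_nits * scale_factor(scene.peak_nits, display_peak)
        results.append((scene.name, static_out, dynamic_out))
    return results


film = [
    Scene("night interior", 200.0),
    Scene("overcast street", 600.0),
    Scene("sun glint", 4000.0),
]

if __name__ == "__main__":
    for name, static_out, dynamic_out in compare_metadata(
        film, display_peak=1000.0, static_peak=4000.0
    ):
        print(f"{name}: static {static_out:.0f} nits, dynamic {dynamic_out:.0f} nits")
```

Under these assumptions the two dimmer scenes are shown at a quarter of their graded brightness with static metadata, even though a 1,000 nits display could have shown them untouched, while with per-scene values only the sun glint scene needs any compression.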
Whilst there are industry standards when it comes to the colour gamut (DCI-P3 and Rec. 2020) and PQ EOTF (SMPTE ST2084) that are used for HDR content, the same is not true for tone mapping. The reality is that each manufacturer is free to tone map the HDR content on their HDR TVs in any way that they feel is appropriate. Some manufacturers tone map precisely against the PQ EOTF up to the limit of the display's capabilities. As a result the content creator's artistic intention is retained, but some scenes can appear overly dark and peak highlights can appear clipped (lose detail). The alternative approach is to deviate from the PQ EOTF and perhaps boost the mid-range of the curve in order to lighten the majority of scenes. Whilst this can result in an image that is less dark overall and avoids clipping, it's also unfaithful to the original intentions of the content creators.
These problems associated with static metadata are the main reason for the recent move towards dynamic metadata, using either Dolby Vision or HDR10+. The benefit of dynamic metadata is that it can be adjusted from scene to scene, thus reflecting the fact that certain scenes have a much lower dynamic range, one that the TV in question is perfectly capable of displaying. This means that more of the content is being displayed on a one-to-one basis, as in the image above, and only a few scenes require actual tone mapping. The result will be a superior HDR experience, especially with less capable TVs.
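For readers curious about the first approach mentioned above, tracking the PQ EOTF up to the display's limit, the sketch below shows what that amounts to. The constants and formula in `pq_eotf` are those published in SMPTE ST 2084; the `track_pq_then_clip` function and its hard clip are a simplifying assumption used here to show where highlight detail is lost, not a description of how any particular TV behaves.

```python
# PQ constants as defined in SMPTE ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_eotf(signal):
    """Convert a normalised PQ signal (0.0 to 1.0) to luminance in nits."""
    e = max(0.0, min(1.0, signal)) ** (1 / M2)
    return 10000.0 * (max(e - C1, 0.0) / (C2 - C3 * e)) ** (1 / M1)


def track_pq_then_clip(signal, display_peak):
    """Follow the PQ curve exactly, then clip at the display's peak.

    Hypothetical helper: every signal level above the display's peak
    collapses to the same output, which is the loss of highlight detail.
    """
    return min(pq_eotf(signal), display_peak)


if __name__ == "__main__":
    for signal in (0.25, 0.50, 0.75, 0.90, 1.00):
        graded = pq_eotf(signal)
        shown = track_pq_then_clip(signal, display_peak=1000.0)
        print(f"signal {signal:.2f}: graded {graded:8.1f} nits, shown {shown:7.1f} nits")
```

The PQ curve is absolute, so a given signal level always means the same number of nits regardless of the display. That is why a TV that follows it faithfully has nowhere to put highlights above its own peak, and why manufacturers that want to keep that detail must bend the curve instead.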
Ultimately tone mapping is a necessary and important solution to the problem of modern video content far exceeding the display capabilities of the current generation of TVs. It will be a long time before consumer TVs can fully display all HDR material exactly as it was graded by the content creators but in the meantime tone mapping provides the best way of ensuring that an HDR TV can display this content to the best of its abilities.
All images courtesy of Dolby.