Convert AVIF images from Identity to YUV BT2020 Matrix Coefficients #2717

cromefire · 2025-03-29T14:44:50Z

I have a bunch of images using CICP values of MC=0/Identity (so RGB stored in YUV), but Windows doesn't seem to handle that well at all (it's not supposed to be just pure red):

It's fine in a browser (I can't upload it to GitHub because of file extension restrictions, but here's an SDR screenshot):

So I've tried to get it converted to YUV BT2020 (so HDR10, CP=9/TC=16/MC=9). I got as far as being able to dump it as y4m and back, but I've not been able to convert the raw pixels from RGB to YUV BT2020. I have tried ffmpeg, but it can apparently convert everything but identity. I've also tried to get an LLM to help me write code for converting those pixels from RGB to YUV BT2020, but the result is not working, as expected...

Is there any way to easily do this conversion? I don't seem to be able to do it manually with my very limited understanding of color science...

For reference, this is the last code the LLM came up with:

convert.py

import numpy as np


def process_yuv_planes(y_plane, u_plane, v_plane, bit_depth, chroma_subsampling):
    """
    Performs BGR (in YUV) to BT.2020 NCL Y'CbCr color conversion with proper level offsets and scaling.
    Input YUV planes are interpreted as B, G, R channels respectively (BGR in YUV), assumed to be in the full range [0, (1<<bit_depth)-1].
    Output is BT.2020 NCL Y'CbCr in the limited range [0, (1<<bit_depth)-1] as typically used in video.

    This version incorporates level offsets and proper scaling for standard YUV ranges.
    Gamma handling is still deferred but is a VERY likely next step if this doesn't resolve the issue.
    """
    print(f"Processing YUV planes: BGR (in YUV) to BT.2020 NCL Y'CbCr. Bit depth: {bit_depth}, Chroma subsampling: {chroma_subsampling}")

    # 1. Conversion matrix from BT.2020 RGB to BT.2020 Y'CbCr (Rec.2100-1 coefficients - verified and standard)
    #    These coefficients are for *normalized* RGB in the range [0, 1] to normalized Y'CbCr in the range [0, 1] (approximately, luma might slightly exceed [0,1])
    conversion_matrix_rgb_to_yuv = np.array([
        [0.6274, 0.3293, 0.0433],
        [-0.3726, -0.6207, 1.0000], # Cb Row (U) - Note different coefficients, these are correct for BT.2020
        [1.0000, -0.9414, -0.0586]  # Cr Row (V) - Note different coefficients, these are correct for BT.2020
    ])

    # Debug: Input plane shapes
    print("Debug process_yuv_planes - Input Y plane shape:", y_plane.shape)
    print("Debug process_yuv_planes - Input U plane shape:", u_plane.shape)
    print("Debug process_yuv_planes - Input V plane shape:", v_plane.shape)


    # 2. INPUT IS BGR in YUV planes: Y=B, U=G, V=R
    # 3. REARRANGE INPUT PLANES to BGR order and NORMALIZE to 0-1 range
    bgr_planes = np.stack([y_plane, u_plane, v_plane], axis=-1).astype(np.float64) # Stack B, G, R, and convert to float64
    print("Debug process_yuv_planes - BGR planes shape (after stack):", bgr_planes.shape) # DEBUG
    dtype_max_value = (1 << bit_depth) - 1
    bgr_normalized = bgr_planes / dtype_max_value  # Normalize BGR to [0, 1] range
    print("Debug process_yuv_planes - BGR normalized shape:", bgr_normalized.shape) # DEBUG


    # Reshape to (N, 3) for matrix multiplication, where N = H * W (number of pixels)
    original_shape = bgr_normalized.shape[:-1] # Store original shape (H, W)
    print("Debug process_yuv_planes - Original shape:", original_shape) # DEBUG
    bgr_pixels = bgr_normalized.reshape(-1, 3) # Reshape to (H*W, 3)
    print("Debug process_yuv_planes - BGR pixels shape:", bgr_pixels.shape) # DEBUG


    # 4. APPLY BT.2020 RGB to Y'CbCr CONVERSION MATRIX.
    yuv_pixels = np.dot(bgr_pixels, conversion_matrix_rgb_to_yuv.T) # Transpose matrix for correct multiplication
    print("Debug process_yuv_planes - YUV pixels shape (after matrix multiply):", yuv_pixels.shape) # DEBUG


    # 5.  Y'CbCr LEVEL SCALING and OFFSET (Crucial step for standard YUV ranges!)
    #     BT.2020 NCL Y'CbCr is typically stored in a "limited range" (also called "video range" or "TV range").
    #     Luma (Y') range is typically [0, 1] which maps to [16/255, 235/255] in 8-bit, and proportionally for higher bit depths.
    #     Chroma (Cb, Cr) range is typically centered around 0.5, mapping to [16/255, 240/255] in 8-bit, centered at 128.

    #    However, for simplicity and common practice in digital video (especially with higher bit depths),
    #    let's try a simpler scaling to the full bit-depth range [0, dtype_max_value], but with an offset for Cb, Cr.

    luma_offset = 0  # No offset for Luma for now, range [0, 1] -> [0, dtype_max_value]
    chroma_offset = 0.5 # Chroma is often centered around 0.5 in normalized range. Let's offset by 0.5 before scaling.


    corrected_y_plane = yuv_pixels[:, 0] # Luma is first column - CORRECTED INDEXING - removed second ":"
    corrected_u_plane = yuv_pixels[:, 1] + chroma_offset # Cb with offset - CORRECTED INDEXING
    corrected_v_plane = yuv_pixels[:, 2] + chroma_offset # Cr with offset - CORRECTED INDEXING

    print("Debug process_yuv_planes - Corrected Y plane shape (before reshape):", corrected_y_plane.shape) # DEBUG


    # 6. *** CLAMP and SCALE back to output bit depth (important for valid YUV) ***
    corrected_y_plane = np.clip(corrected_y_plane * dtype_max_value, 0, dtype_max_value).astype(np.uint16 if bit_depth > 8 else np.uint8) # Clip and cast
    corrected_u_plane = np.clip(corrected_u_plane * dtype_max_value, 0, dtype_max_value).astype(np.uint16 if bit_depth > 8 else np.uint8)
    corrected_v_plane = np.clip(corrected_v_plane * dtype_max_value, 0, dtype_max_value).astype(np.uint16 if bit_depth > 8 else np.uint8)

    print("Debug process_yuv_planes - Corrected Y plane shape (after clip/cast):", corrected_y_plane.shape) # DEBUG


    return corrected_y_plane, corrected_u_plane, corrected_v_plane

That code already got the gamma working again I think, but resulted in an image with Green / Purple hue instead of red... (which is slightly better than just replacing the metadata which just results in Green / Purple and destroyed gamma)

wantehchang · 2025-03-30T00:35:20Z

Maintainer

It should be possible to write a C or C++ program to perform this conversion by calling the avifImageYUVToRGB() and avifImageRGBToYUV() functions.

avifImageYUVToRGB() can be used to convert the "RGB stored in YUV" to RGB.

Then avifImageRGBToYUV() can be used to convert RGB to YUV with BT.2020 matrix coefficients.

The comment in your convert.py script says "Input YUV planes are interpreted as B, G, R channels respectively (BGR in YUV)". But for MC=0/Identity, input YUV planes should be interpreted as G, B, R channels respectively. Are you sure the B, G, R channel order is correct?
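
To make the channel-order point concrete, here is a minimal per-sample sketch in plain Python. It is not libavif's implementation and not the code from the linked repository; the function name `identity_to_bt2020_ycbcr` and its parameters are made up for illustration. It assumes the Identity input is full range, reads the Y, U, V planes as G, B, R, and applies the BT.2020 non-constant-luminance luma coefficients (Kr = 0.2627, Kb = 0.0593) to the non-linear R'G'B' values, leaving the transfer function alone.

```python
import math

# BT.2020 luma coefficients (ITU-R BT.2020, non-constant luminance).
KR, KB = 0.2627, 0.0593
KG = 1.0 - KR - KB


def identity_to_bt2020_ycbcr(y, u, v, bit_depth=10, full_range=True):
    """Map one Identity-coded (MC=0) sample triple to BT.2020 NCL Y'CbCr codes."""
    max_code = (1 << bit_depth) - 1
    half = 1 << (bit_depth - 1)

    # MC=0 (Identity): the Y, U and V planes carry G, B and R respectively.
    # Assumption: the input is full range, so dividing by max_code normalizes it.
    g, b, r = y / max_code, u / max_code, v / max_code

    # The matrix is applied to the non-linear R'G'B' values as they are,
    # so the transfer function (e.g. PQ) is left untouched.
    luma = KR * r + KG * g + KB * b
    cb = (b - luma) / (2.0 * (1.0 - KB))  # in [-0.5, 0.5]
    cr = (r - luma) / (2.0 * (1.0 - KR))  # in [-0.5, 0.5]

    if full_range:
        codes = (luma * max_code, cb * max_code + half, cr * max_code + half)
    else:
        # Limited ("video") range: 16..235 luma and 16..240 chroma at 8 bits,
        # scaled by 2^(bit_depth - 8) for deeper samples.
        scale = 1 << (bit_depth - 8)
        codes = ((219 * luma + 16) * scale,
                 (224 * cb + 128) * scale,
                 (224 * cr + 128) * scale)

    # Round to nearest and clamp to the valid code range.
    return tuple(min(max(math.floor(c + 0.5), 0), max_code) for c in codes)
```

For example, a pure red 10-bit sample is stored in Identity as Y=0, U=0, V=1023, and `identity_to_bt2020_ycbcr(0, 0, 1023)` returns `(269, 369, 1023)`. Reading the same planes as B, G, R instead would treat that sample as something other than red, which is one way to end up with a wrong hue.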

7 replies

cromefire Mar 30, 2025
Author

Okay, just setting those as properties on the target image seems to work I think? (at least I don't see any visual differences anymore)

Thanks a lot. For reference (and my own convenience), should anyone ever need it, I put my code here: https://gitlab.com/cromefire/avif-identity-converter. It may leak memory, but it should work.

wantehchang Mar 31, 2025
Maintainer

@cromefire Glad to hear you made it work. I took a quick look at your code. Here are some comments. (Note that I am not very familiar with Rust.)

main.rs:

There is a copy-and-paste error in the error message below (change "Speed" to "Quality"):

    if args.quality > 100 {
        eprintln!("Speed must not be greater than 100");
        return ExitCode::FAILURE;
    }

helpers.rs:

I did not check this file.

convert.rs:

I assume AvifRgbImage::new_from is based on avifRGBImageSetDefaults(). avifRGBImageSetDefaults() sets rgb->format to AVIF_RGB_FORMAT_RGBA. If you know the input AVIF image does not have alpha, we should change rgb->format to AVIF_RGB_FORMAT_RGB after the avifRGBImageSetDefaults() call.

You are right that you must set the colorPrimaries, transferCharacteristics, matrixCoefficients fields of *converted_image.image appropriately before calling avifImageRGBToYUV(). Those fields are used by the avifImageRGBToYUV() function. The clli field is not used by the avifImageRGBToYUV() function, so you can set the clli field either before or after calling avifImageRGBToYUV().

We recommend setting the autoTiling encoder option to AVIF_TRUE. The default of autoTiling is AVIF_FALSE for backward compatibility. We are looking into changing the default.

Re: the yuvFormat argument passed to the AvifImage::create() call: image.yuv_format() is 4:4:4 because 4:4:4 is the only YUV format that makes sense for the Identity matrix coefficients. For converted_image, you can consider using 4:2:0 instead. But it is fine to leave it as 4:4:4.

I did not check whether all resources are destroyed after use.
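
As a rough illustration of the 4:4:4 versus 4:2:0 remark above: in 4:2:0 each chroma plane is stored at half the width and half the height of the luma plane. The sketch below uses a simple 2x2 box average in plain Python. The function name `subsample_420` is hypothetical, and this is only an assumption about one possible filter, not a description of how libavif downsamples chroma. It also hints at why Identity needs 4:4:4: with MC=0 the U and V planes hold B and R, so shrinking them would discard colour detail directly.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane given as a list of rows of ints."""
    height, width = len(plane), len(plane[0])
    out = []
    for row in range(0, height, 2):
        out_row = []
        for col in range(0, width, 2):
            # Clamp at the edges so odd dimensions reuse the last row/column.
            r1 = min(row + 1, height - 1)
            c1 = min(col + 1, width - 1)
            total = (plane[row][col] + plane[row][c1]
                     + plane[r1][col] + plane[r1][c1])
            out_row.append((total + 2) // 4)  # rounded integer mean
        out.append(out_row)
    return out
```

The luma plane would be left at full resolution; only the Cb and Cr planes would go through a step like this.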

wantehchang Mar 31, 2025
Maintainer

> I tried the C (/ Rust) way, but now it all results in 2/2/2, do I have to specify my target CICP values somewhere?

The 2/2/2 come from avifImageCreate(). Yes, you need to specify your target CICP values after calling avifImageCreate(). You are doing that in convert.rs now.
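
For readers unfamiliar with the shorthand, the triples in this thread are CICP code points in the order colour primaries / transfer characteristics / matrix coefficients. The small lookup below is a sketch with a deliberately partial table; the names follow ITU-T H.273 as far as I know, and the helper `describe_cicp` is hypothetical rather than part of libavif.

```python
# Partial CICP tables; only the code points relevant to this thread and a
# few common neighbours are listed.
COLOUR_PRIMARIES = {1: "BT.709", 2: "Unspecified", 9: "BT.2020"}
TRANSFER_CHARACTERISTICS = {
    1: "BT.709",
    2: "Unspecified",
    13: "sRGB",
    16: "PQ (SMPTE ST 2084)",
    18: "HLG",
}
MATRIX_COEFFICIENTS = {
    0: "Identity",
    1: "BT.709",
    2: "Unspecified",
    9: "BT.2020 NCL",
}


def describe_cicp(cp, tc, mc):
    """Return human-readable names for a CP/TC/MC triple."""
    return (
        COLOUR_PRIMARIES.get(cp, "Unknown ({})".format(cp)),
        TRANSFER_CHARACTERISTICS.get(tc, "Unknown ({})".format(tc)),
        MATRIX_COEFFICIENTS.get(mc, "Unknown ({})".format(mc)),
    )
```

With this table, `describe_cicp(2, 2, 2)` gives three "Unspecified" entries, which matches the 2/2/2 seen on a freshly created image, while `describe_cicp(9, 16, 9)` names the HDR10-style target: BT.2020 primaries, PQ transfer and BT.2020 NCL matrix.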

cromefire Mar 31, 2025
Author

Thanks a lot for double-checking. I'm not that well versed in C, so it felt weird to me just setting the values before storing anything in it, but hey if it works, it works.

> There is a copy-and-paste error in the error message below (change "Speed" to "Quality")

Oh yeah, thanks.

> I assume AvifRgbImage::new_from is based on avifRGBImageSetDefaults(). avifRGBImageSetDefaults() sets rgb->format to AVIF_RGB_FORMAT_RGBA. If you know the input AVIF image does not have alpha, we should change rgb->format to AVIF_RGB_FORMAT_RGB after the avifRGBImageSetDefaults() call.

Good to know. Yes, new_from is based on avifRGBImageSetDefaults(); I just encapsulated everything into Rust structs with a Drop trait, so memory is automatically handled by the Rust compiler. The final image seems to lack an alpha channel anyway, so I assume it'd just help with memory size / speed, which I don't care much about for this simple helper tool.

> You are right that you must set the colorPrimaries, transferCharacteristics, matrixCoefficients fields of *converted_image.image appropriately before calling avifImageRGBToYUV(). Those fields are used by the avifImageRGBToYUV() function. The clli field is not used by the avifImageRGBToYUV() function, so you can set the clli field either before or after calling avifImageRGBToYUV().

> The 2/2/2 come from avifImageCreate(). Yes, you need to specify your target CICP values after calling avifImageCreate(). You are doing that in convert.rs now.

Good to know my intuition was correct.

> We recommend setting the autoTiling encoder option to AVIF_TRUE. The default of autoTiling is AVIF_FALSE for backward compatibility. We are looking into changing the default.

Yeah, I also set that as the default in the CLI. I just put the option in because I think in theory it could give a slightly higher quality output at the expense of being more single-threaded? But then again, as a first step I should probably try to upgrade libavif and libaom anyway, as the versions that libavif-sys ships with are horribly outdated... Maybe an official Rust binding for libavif one day 😊? Not having to deal with all the memory things manually would for sure be great.

> Re: the yuvFormat argument passed to the AvifImage::create() call: image.yuv_format() is 4:4:4 because 4:4:4 is the only YUV format that makes sense for the Identity matrix coefficients. For converted_image, you can consider using 4:2:0 instead. But it is fine to leave it as 4:4:4.

Yeah I could make it an option, but as I'm (probably) the only one using it anyways and I like it to still be 4:4:4 to get maximum quality, I left it at that intentionally. The images shouldn't have much text, but the file size isn't a huge concern anyways, so why not. Just the recompression alone seems to result in a lower file size in the end (probably because I can take a few seconds to compress it where Steam would probably like to save it rather quickly and/or the new MC being more compressible), even with lossless.

Thanks again for all the help, I would never have found any of that without it.

wantehchang Mar 31, 2025
Maintainer

We can document better which fields of the avifImage struct are used as input to the avifImageRGBToYUV() function.

Re: the alpha channel: If the AVIF input doesn't have an alpha channel, I confirm that the converted AVIF output also doesn't have an alpha channel, even though convert.rs causes opaque alpha samples to be used in the intermediate variables. Since you don't care about memory size / speed, I won't describe the details.

My suggestion of using AVIF_RGB_FORMAT_RGB is more about making it clear that the AVIF input doesn't have an alpha channel.

You are right that turning off tiling could potentially improve the compression ratio and maybe also quality. So you can certainly offer autoTiling as an option.
