Unable to decode some frames with ESP32-p4 (AUD-6336) #5

@digitalLumberjack

Description

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate
  • Read the documentation to confirm the issue is not addressed there and your configuration is set correctly
  • Tested with the latest version to ensure the issue hasn't been fixed

How often does this bug occur?

always

Expected behavior

Hello. I'm trying to decode H264 video with ESP32-P4.

The H264 decoder should decode every frame of the stream.

Actual behavior (suspected bug)

I encode the video this way:

ffmpeg -y -i gundam.original.mp4 -c:v libx264 -profile:v baseline -preset veryslow -tune fastdecode -vf "format=yuv420p, scale=320:240" -x264opts slices=1 -an gundam.h264

This gives the following output:

[libx264 @ 0x5d78f9973a40] profile Constrained Baseline, level 2.2, 4:2:0, 8-bit
Output #0, h264, to 'gundam.h264':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    encoder         : Lavf61.1.100
  Stream #0:0(und): Video: h264, yuv420p(tv, bt709, progressive), 320x240 [SAR 4:3 DAR 16:9], q=2-31, 23.98 fps, 23.98 tbn (default)
      Metadata:
        creation_time   : 2020-02-05T18:42:37.000000Z
        handler_name    : ISO Media file produced by Google Inc. Created on: 02/05/2020.
        vendor_id       : [0][0][0][0]
        encoder         : Lavc61.3.100 libx264
      Side data:
        cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
[out#0/h264 @ 0x5d78f9975140] video:2724KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.000000%
frame= 2158 fps=326 q=-1.0 Lsize=    2724KiB time=00:01:30.00 bitrate= 247.9kbits/s speed=13.6x 

The output shows that the video is Stream #0:0(und): Video: h264, yuv420p(tv, bt709, progressive), 320x240 [SAR 4:3 DAR 16:9], q=2-31, 23.98 fps, 23.98 tbn (default), with the Constrained Baseline profile. This seems to be what the H264 decoder is expecting.

I decode the H264 stream on the ESP32-P4 with a fairly simple project based on the testing app:

esp_h264_err_t H264Engine::playVideo(uint8_t * file, uint16_t size) {

    esp_h264_dec_in_frame_t in_frame;
    esp_h264_dec_out_frame_t out_frame;
    esp_h264_err_t ret = ESP_H264_ERR_FAIL;
    esp_h264_dec_handle_t dec = nullptr;
    esp_h264_dec_cfg_sw_t cfg;
    cfg.pic_type = ESP_H264_RAW_FMT_I420;
    int frames_decoded = 0;

    ret = esp_h264_dec_sw_new(&cfg, &dec);
    if (ret != ESP_H264_ERR_OK) {
        printf("new failed. line %d \n", __LINE__);
        goto _single_exit_;
    }

    ret = esp_h264_dec_open(dec);
    ESP_LOGI("H264", "esp_h264_dec_open");

    if (ret != ESP_H264_ERR_OK) {
        printf("open failed .line %d \n", __LINE__);
        goto _single_exit_;
    }
    in_frame.raw_data.buffer = file;
    in_frame.raw_data.len = size;

    esp_h264_dec_param_sw_handle_t dec_param;
    esp_h264_resolution_t res;

    while (1) {
        if (in_frame.raw_data.len <= 0) {
            break;
        }
        ESP_LOGI("H264", "decoding");
        ret = esp_h264_dec_process(dec, &in_frame, &out_frame);

        esp_h264_dec_sw_get_param_hd(dec, &dec_param);
        esp_h264_dec_get_resolution(dec_param, &res);
        ESP_LOGI("H264", "res = %dx%d", res.width, res.height);

        in_frame.raw_data.buffer += in_frame.consume;
        in_frame.raw_data.len -= in_frame.consume;
        ESP_LOGI("H264", "consumed %d, now at %d",  in_frame.consume, in_frame.raw_data.buffer);
        ESP_LOGI("H264", "input frame dts: %d, pts: %d", in_frame.dts, in_frame.pts);
        ESP_LOGI("H264", "decoded frames %d", frames_decoded++);

        if (ret == ESP_H264_ERR_OK) {
            ESP_LOGI("H264", "frame decoded size: %d, dts: %d, pts: %d", out_frame.out_size, out_frame.dts, out_frame.pts);
        } else {
            printf("process failed. ret %d line %d \n", ret, __LINE__);
            goto _single_exit_;
        }
        vTaskDelay(1);
    }
_single_exit_:
    ESP_LOGI("H264", "exiting");

    esp_h264_dec_close(dec);
    esp_h264_dec_del(dec);
    return ret;
}
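
One detail in the signature above may be worth ruling out: size is a uint16_t, which cannot hold a value above 65535, while ffmpeg reports a 2724KiB stream. If the whole file is passed in a single call (an assumption, since the calling code is not shown), the length would be silently truncated and the decoder would run out of data in the middle of a frame. A minimal sketch of the effect, using 2724 * 1024 bytes as an illustrative size and hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative size only: ffmpeg prints "2724KiB", the exact byte count is not shown.
constexpr std::size_t kFileSize = 2724u * 1024u;

// Mimics passing a large length through a uint16_t parameter:
// only the low 16 bits survive the conversion.
constexpr std::uint16_t truncate_to_u16(std::size_t n) {
    return static_cast<std::uint16_t>(n);
}

// Hypothetical checked alternative: fails loudly instead of truncating.
inline std::uint16_t narrow_u16_checked(std::size_t n) {
    assert(n <= UINT16_MAX && "length does not fit in uint16_t");
    return static_cast<std::uint16_t>(n);
}

// 2789376 bytes seen through a uint16_t becomes 36864 bytes.
static_assert(truncate_to_u16(kFileSize) == 36864, "low 16 bits of 2789376");
```

Declaring the parameter as size_t (or uint32_t) avoids the problem entirely. Whether this is what happens here depends on how playVideo is called.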

Error logs or terminal output

I can decode up to 40 frames, but then I get:

I (2596) H264: decoded 38
I (2598) H264: frame decoded size: 115200, dts: 1341241836, pts: 1073952316
I (2607) H264: decoding
I (2619) H264: res = 320x240
I (2619) H264: consumed 551, now at 1074014023
I (2619) H264: input frame dts: 1341241836, pts: 1073952316
I (2620) H264: decoded 39
I (2623) H264: frame decoded size: 115200, dts: 1341241836, pts: 1073952316
I (2631) H264: decoding

E (2640) H264_DEC: Decode macroblock layer error(210).
E (2640) H264_DEC: Decode slice data error.
E (2643) H264_DEC.SW: Error in decoding

I (2646) H264: res = 320x240
I (2649) H264: consumed 377, now at 1074014400
I (2653) H264: input frame dts: 1341241836, pts: 1073952316
I (2659) H264: decoded 40
process failed. ret -1 line 63 
I (2664) H264: exiting
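
The dts and pts values in this log are the same on every line. In the function shown earlier, in_frame is declared without an initializer and only raw_data.buffer and raw_data.len are assigned, so in_frame.dts and in_frame.pts are indeterminate when they are logged. This is probably unrelated to the macroblock error, but zero-initializing the structs removes one unknown. A minimal sketch, using hypothetical stand-in types (DemoRawData, DemoInFrame) rather than the real esp_h264 structs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; NOT the real esp_h264 types.
struct DemoRawData {
    std::uint8_t *buffer;
    std::uint32_t len;
};

struct DemoInFrame {
    DemoRawData raw_data;
    std::uint32_t consume;
    std::uint32_t dts;
    std::uint32_t pts;
};

// Builds an input frame whose unused fields are guaranteed to be zero.
inline DemoInFrame make_in_frame(std::uint8_t *data, std::uint32_t len) {
    DemoInFrame f{};  // value-initialization: every member starts at 0 / nullptr
    f.raw_data.buffer = data;
    f.raw_data.len = len;
    return f;
}
```

In the original function the equivalent change would be to declare the structs with empty braces, for example in_frame{}, out_frame{} and cfg{}, before filling in the fields that are needed.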

Steps to reproduce the behavior

I tried many different videos and many encoder options, but I cannot get an entire stream to decode.

Is there a standard ffmpeg configuration you can provide, in case the encoding configuration is causing the issue?
Any other ideas?
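
One way to narrow this down on the host, before involving the decoder, is to list the NAL units in gundam.h264 and compare their offsets with the "consumed" values in the log. The sketch below is a hypothetical helper (scan_annexb and NalUnit are not part of esp_h264). It assumes the file is a raw Annex B byte stream, which is what ffmpeg writes for a .h264 output, and it only looks at start codes and the NAL header byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct NalUnit {
    std::size_t offset;  // offset of the NAL header byte (first byte after the start code)
    std::uint8_t type;   // nal_unit_type: low 5 bits of the header byte
};

// Scans an Annex B byte stream for 00 00 01 start codes.
// A 4-byte start code (00 00 00 01) ends with the same 3-byte pattern,
// so it is found by the same test.
inline std::vector<NalUnit> scan_annexb(const std::uint8_t *data, std::size_t len) {
    std::vector<NalUnit> units;
    for (std::size_t i = 0; i + 3 < len; ++i) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            units.push_back({i + 3, static_cast<std::uint8_t>(data[i + 3] & 0x1F)});
            i += 2;  // skip the rest of the start code
        }
    }
    return units;
}
```

Types 7 and 8 are the SPS and PPS, 5 is an IDR slice and 1 is a non-IDR slice. If the offset where decoding fails does not line up with the start of a NAL unit, or the buffer ends before the next start code, the input handed to the decoder is more likely the problem than the encoder settings.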

Project release version

component 1.1.1 on idf master 1c468f68259065ef51afd114605d9122f13d9d72

System architecture

Intel/AMD 64-bit (modern PC, older Mac)

Operating system

Linux

Operating system version

Linux ubuntu 24.10

Shell

ZSH

Additional context

No response
