Skip to content

Conversation

lkang-nuro
Copy link
Contributor

Description

Add vpx decoder (VP8)

Copy link

codecov bot commented Sep 4, 2025

Codecov Report

❌ Patch coverage is 90.16393% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.27%. Comparing base (6047a32) to head (383e71c).
⚠️ Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
pkg/codec/vpx/vpx_decoder.go 90.16% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #648      +/-   ##
==========================================
+ Coverage   44.76%   45.27%   +0.51%     
==========================================
  Files          81       82       +1     
  Lines        5605     5661      +56     
==========================================
+ Hits         2509     2563      +54     
  Misses       2921     2921              
- Partials      175      177       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

status := C.decodeFrame(d.codec, (*C.uint8_t)(&data[0]), C.uint(len(data)))
if status != C.VPX_CODEC_OK {
fmt.Println("Decode failed", status)
panic("Decode failed")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider return values instead of panic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Thanks for the catch. Added an error code.

"github.com/pion/mediadevices/pkg/prop"
)

type Decoder struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All packages under mediadevices/pkg/codec implements the unified encoder interface which reads video/audio from mediadevices/pkg/io/audio.Reader(which reads mediadevices/pkg/wave.Audio)/mediadevices/pkg/io/video.Reader(which reads image.Image) and returns encoded bytes.

// AudioEncoderBuilder is the interface that wraps basic operations that are
// necessary to build the audio encoder.
//
// This interface is for codec implementors to provide codec specific params,
// but still giving generality for the users.
type AudioEncoderBuilder interface {
// RTPCodec represents the codec metadata
RTPCodec() *RTPCodec
// BuildAudioEncoder builds audio encoder by given media params and audio input
BuildAudioEncoder(r audio.Reader, p prop.Media) (ReadCloser, error)
}
// VideoEncoderBuilder is the interface that wraps basic operations that are
// necessary to build the video encoder.
//
// This interface is for codec implementors to provide codec specific params,
// but still giving generality for the users.
type VideoEncoderBuilder interface {
// RTPCodec represents the codec metadata
RTPCodec() *RTPCodec
// BuildVideoEncoder builds video encoder by given media params and video input
BuildVideoEncoder(r video.Reader, p prop.Media) (ReadCloser, error)
}

But this Decoder returns images in the codec specific format and the interface can't be shared among the codecs.
In my opinion, decoder interface should be defined at mediadevices/pkg/codec and the interface should be able to be shared among the codecs.

If this decoder really need to return the data in codec specific format, the name should be something like RawDecoder to leave Decoder for unified API available.

Comment on lines 89 to 107
func (d *Decoder) Decode(data []byte) error {
if len(data) == 0 {
return fmt.Errorf("empty data")
}
status := C.decodeFrame(d.codec, (*C.uint8_t)(&data[0]), C.uint(len(data)))
if status != C.VPX_CODEC_OK {
return fmt.Errorf("Decode failed: %v", status)
}
return nil
}

func (d *Decoder) GetFrame() *VpxImage {
var iter C.vpx_codec_iter_t = nil // initialize to NULL to start iteration
img := C.getFrame(d.codec, &iter)
if img == nil {
return nil
}
return &VpxImage{img: img}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the reason having Decode/GetFrame is that the number of the input byte stream and the output frames aren't always matched.
In this case, it's better to have a similar (symmetric) API with the encoder:
Something like

type VideoDecoderBuilder interface {
  BuildVideoDecoder(r io.Reader, p prop.Media) (video.Reader, error)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, I made the change in the followup commit.

import "C"
import "unsafe"

type VpxImage struct {
Copy link
Member

@at-wat at-wat Sep 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users don't have a safe way to release the data.
Also to make the Decoder interface unified among the codecs, the decoder should return image.Image.

mediadevices/io/video.Reader returns release function along with image.Image.
So, the image implementing image.Image may have C.vpx_image_t internally and have release function written in CGO.

BTW, in Go's naming convention, this would be VPXImage. https://go.dev/wiki/CodeReviewComments#initialisms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @at-wat , as you suggested, I changed the interface to use image.Image, so removed this file.

@lkang-nuro lkang-nuro requested a review from at-wat September 12, 2025 22:35
Copy link
Member

@at-wat at-wat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface looks good! Added a suggestion for handling multiple frames for one decode call without dropping.

Comment on lines 102 to 105
n, err := d.reader.Read(d.buf)
if err != nil {
return nil, nil, err
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

io.Reader is discouraged to return zero byte without a non-nil error. https://pkg.go.dev/io#Reader
So, basically io.Reader blocks if there are no remained data.

vpx_codec_decode may decode multiple frames for one vpx_codec_decode call, but the current code reads only one frame for each Read and remained frames will be buffered.

The current code seems to read the all buffered frames when the reader returns n=0, err=nil.
However, the reader usually doesn't return n=0, err=nil and blocks until next data arrival, remained frames never read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bc7653c
I think this should fix reading on such case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @at-wat , good catch.

@lkang-nuro
Copy link
Contributor Author

The interface looks good! Added a suggestion for handling multiple frames for one decode call without dropping.

Thanks @at-wat , all comments are super helpful and quite to the point. It seems all your comments are addressed, so I will go ahead and merge since we have other repo depends on this PR. Meanwhile, if you have any further comments, I am happy to open another PR to address. Thanks for understanding.

@lkang-nuro lkang-nuro merged commit cb394eb into master Sep 17, 2025
7 checks passed
@lkang-nuro lkang-nuro deleted the lkang/vpx_decoder branch September 17, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants