Releases: superuser404notfound/AetherEngine

AetherEngine 1.0.0

13 May 15:32

First stable release. A video player engine for Apple platforms — drop the package in, hand it a file, get pixels on screen. Built on Swift 6 strict concurrency, LGPL 3.0 with App Store exception.

The engine handles the hard parts (HDR, Dolby Vision, Dolby Atmos, container coverage, codec coverage) and exposes a single render surface plus a handful of async methods. No AVPlayerViewController. No opinionated controls. No analytics. You ship the UI.

Architecture

Two playback pipelines coexist, picked once at load(url:) by the source's video codec and the device's decode capabilities. Hosts see a unified @Published state surface either way.

Native AVPlayer pipeline (default). Demux with libavformat, re-mux on the fly into HLS-fMP4, serve from a local HTTP loopback, point AVPlayer at the playlist. Apple's stack does all decode, HDR / Dolby Vision signaling, audio routing.

Source URL → Demuxer → HLSSegmentProducer → SegmentCache → HLSLocalServer
                                                                  ↓
                                                              AVPlayer
                                                                  ├→ VideoToolbox (HW)
                                                                  └→ AVR (Atmos via MAT 2.0)

Used for HEVC / H.264 in all cases, and for AV1 on devices with HW AV1 decoders (M3+ Mac, iPhone 15 Pro+, future Apple TV chips). Atmos passthrough, Dolby Vision HDMI handshake, HDR10 / HDR10+ / HLG all live on this path.

Software decoder pipeline (gap-filler). Demux, run video through libavcodec (dav1d for AV1, FFmpeg's native VP9 decoder for VP9) into CVPixelBuffers, run audio through libavcodec into CMSampleBuffers, render via AVSampleBufferDisplayLayer + AVSampleBufferAudioRenderer with AVSampleBufferRenderSynchronizer as the master clock.

Source URL → Demuxer ┬→ SoftwareVideoDecoder (dav1d / VP9) → SampleBufferRenderer → AVSampleBufferDisplayLayer
                     └→ AudioDecoder → AudioOutput → AVSampleBufferRenderSynchronizer (drives sync)

Used for codecs AVPlayer's HLS-fMP4 pipeline doesn't accept:

  • AV1 on devices without HW AV1 (all current Apple TV chips, M1/M2 Macs, pre-A17-Pro iPhones). Apple ships dav1d on macOS 14+ / iOS 17+ but it's only reachable via AVPlayer's HLS-fMP4 pipeline when the chip also has HW AV1 — verified empirically.
  • VP9 unconditionally. AVPlayer's HLS manifest parser silently rejects the vp09 CODECS attribute (verified via aetherctl: master.m3u8 + media.m3u8 fetched, then no further requests, item.status stays .unknown). VideoToolbox HW-decodes VP9 fine on A12+, but only outside the HLS pipeline.
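The dispatch rule described above can be sketched in a few lines. This is a minimal illustration, not the engine's code: `VideoCodec`, `Backend`, and `chooseBackend` are hypothetical names, and the `hasHardwareAV1` flag stands in for whatever the engine's capability probe reports.

```swift
// Hypothetical types for illustration; not the engine's actual API.
enum VideoCodec { case h264, hevc, av1, vp9 }
enum Backend { case native, software }

/// Picks a pipeline once per load, following the rules in the notes above.
func chooseBackend(for codec: VideoCodec, hasHardwareAV1: Bool) -> Backend {
    switch codec {
    case .h264, .hevc:
        // Always accepted by AVPlayer's HLS-fMP4 pipeline.
        return .native
    case .av1:
        // AV1 is only reachable through AVPlayer when the chip has HW AV1.
        return hasHardwareAV1 ? .native : .software
    case .vp9:
        // The vp09 CODECS attribute is rejected by the HLS manifest parser.
        return .software
    }
}
```

Because the choice depends only on the codec and one capability flag, it can be made once at load time and left alone for the rest of the session.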

Public API

  • AetherPlayerView (UIKit / AppKit) + AetherPlayerSurface (SwiftUI) — single render surface the host embeds. Polymorphic: hosts either AVPlayerLayer (native) or AVSampleBufferDisplayLayer (SW) per session, swapped automatically.
  • engine.bind(view:) / engine.unbind(view:) — engine attaches its active layer to the view automatically.
  • engine.load(url:options:) — single async entry point. Dispatches by codec internally. LoadOptions controls diagnostics-only toggles.
  • Transport: play(), pause(), togglePlayPause(), seek(to:) (async), setRate(_:), stop(), volume.
  • Lifecycle: reloadAtCurrentPosition() rebuilds the pipeline after background suspension.
  • Audio tracks: selectAudioTrack(index:) — mid-playback switch with backend-aware reload, audio-source-stream override propagates through both pipelines.
  • Subtitles: selectSubtitleTrack(index:) (embedded, runs a side demuxer at the playhead), selectSidecarSubtitle(url:) (sidecar SRT / ASS / VTT), clearSubtitle(). Text + bitmap unified via SubtitleCue (body = .text(String) or .image(SubtitleImage)).
  • Capabilities: AetherEngine.displayCapabilities — static snapshot of HDR / DV / HLG support.
  • @Published state: state, currentTime, duration, progress, audioTracks, subtitleTracks, activeAudioTrackIndex, videoFormat, playbackBackend (.native / .software / .none), subtitleCues, isLoadingSubtitles, isSubtitleActive.

Codec coverage

| Codec | Native path | SW path |
|---|---|---|
| H.264 / AVC (SDR, HDR10) | universal | |
| HEVC / H.265 Main / Main10 (SDR / HLG / HDR10 / HDR10+) | Apple TV 2017+ / iOS A9+ / Apple silicon | |
| HEVC + Dolby Vision (P5 / P8.1 / P8.4) | same hardware as HEVC; dvh1 / hvc1 track type + dvcC box | |
| AV1 Main / High (P0 / P1, SDR / HDR10 / HDR10+) | M3+ Mac, iPhone 15 Pro+ (HW AV1) | Apple TV (all generations), older Mac / iPhone without HW AV1 |
| AV1 + Dolby Vision (P10.0 / P10.1 / P10.4) | same hardware as plain AV1; dav1 / av01 track type + dvvC box per Apple HLS Authoring Spec | DV not engaged on the SW path (rare in real content; AV1+DV almost always pairs with HW-AV1 hosts) |
| VP9 Profile 0/2 (8/10-bit) | | all platforms (libavcodec native) |
| AV1 + DV P7 / DV P8.2 / DV P10.2 (SDR-base) / Profile 11+ | refused (unsupportedDVProfile) | refused |

The native path's AV1 acceptance is gated on VTCapabilityProbe.av1Available (strict VTIsHardwareDecodeSupported). The engine's load(url:) dispatch resolves the path per source; hosts don't see the distinction.
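The Dolby Vision rows of the table amount to an allow-list of profile / compatibility pairs. A minimal sketch of such a gate follows; `DVProfile`, `DVError`, and `validate` are hypothetical names chosen for illustration, and only the error case name `unsupportedDVProfile` comes from the table.

```swift
// Hypothetical model of a Dolby Vision profile, e.g. 8.1 is (profile: 8, compatibilityID: 1).
struct DVProfile { let profile: Int; let compatibilityID: Int }

enum DVError: Error { case unsupportedDVProfile }

/// Accepts the profiles the table lists as supported and refuses everything else.
func validate(_ dv: DVProfile) throws {
    switch (dv.profile, dv.compatibilityID) {
    case (5, _):
        // The notes list P5 without a sub-profile, so the compatibility ID is ignored here.
        return
    case (8, 1), (8, 4), (10, 0), (10, 1), (10, 4):
        return
    default:
        // Covers P7, P8.2, P10.2 and Profile 11+.
        throw DVError.unsupportedDVProfile
    }
}
```

An allow-list fails closed: a profile the sketch has never seen is refused rather than passed through with the wrong signaling.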

HDR pipeline

  • HDR10 / HDR10+ / HLG / Dolby Vision (HEVC P5, P8.1, P8.4 + AV1 P10.0, P10.1, P10.4) all engage the HDMI HDR-mode handshake on the native path.
  • AVDisplayCriteria built from real demuxer-probed format + r_frame_rate / avg_frame_rate (snapped to standard rates: 23.976 / 24 / 25 / 29.97 / 30 / 48 / 50 / 59.94 / 60).
  • Match Content / Match Frame Rate user settings honored.
  • HDR10+ ST 2094-40 metadata stream-copied as user-data-registered ITU-T T.35 SEI NALs; AVPlayer forwards the SEI to the system compositor unchanged.
  • Dolby Vision: codec tag promoted (hvc1 → dvh1 for HEVC P5 / P8.1, av01 → dav1 for AV1 P10.0 / P10.1) with the source's dvcC / dvvC box preserved. P8.4 / P10.4 keep the base codec tag and signal DV via SUPPLEMENTAL-CODECS so non-DV displays present the HLG base layer.
  • Dolby Vision dual-layer (P7), SDR-base (P8.2 / P10.2) explicitly refused.
  • HDR-to-SDR mapping handled by AVPlayer and the system compositor; no host-side tonemap.
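The frame-rate snapping mentioned in the list above could look roughly like this. It is a sketch under assumptions: the function name, the 0.05 fps tolerance, and the choice to return nil when nothing is close enough are all illustrative and not taken from the engine.

```swift
// Standard rates from the list above; the NTSC-style rates are written as exact fractions.
let standardRates: [Double] = [
    24000.0 / 1001.0, 24, 25, 30000.0 / 1001.0, 30, 48, 50, 60000.0 / 1001.0, 60
]

/// Snaps a probed frame rate to the nearest standard rate.
/// Returns nil when no standard rate lies within `tolerance` (an assumed policy).
func snapFrameRate(_ probed: Double, tolerance: Double = 0.05) -> Double? {
    guard let nearest = standardRates.min(by: { abs($0 - probed) < abs($1 - probed) }) else {
        return nil
    }
    return abs(nearest - probed) <= tolerance ? nearest : nil
}
```

Picking the nearest candidate first and applying the tolerance afterwards keeps 23.976 and 24 apart even though they differ by less than the tolerance.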

Audio

  • Stream-copy into fMP4 for legal codecs that AVPlayer accepts (AAC, AC3, EAC3 incl. JOC Atmos, FLAC, ALAC, MP3) — bit-exact, no transcode CPU overhead.
  • AudioBridge FLAC fallback for codecs that aren't legal in fMP4 (TrueHD, DTS, DTS-HD MA, PCM, MP2, Vorbis) or that AVPlayer rejects in HLS-fMP4 despite spec-legality (Opus). Decode → S16 PCM → FLAC re-encode. Lossless bed channels; for TrueHD-MAT and DTS-X Atmos sources the object metadata doesn't survive the PCM intermediate.
  • Atmos passthrough preserved via EAC3-JOC stream-copy. The engine emits explicit diagnostics on both success (stream-copy engaged, MAT 2.0 passthrough intact) and on the theoretical downgrade path (WARNING: Atmos downgrade — EAC3+JOC stream-copy rejected by mp4 muxer ...) so silent quality regressions are loud in the log.
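As a rough illustration of the routing in the list above, the codec-to-path decision can be written as a single switch. `AudioCodec`, `AudioRoute`, and `route(for:)` are hypothetical names; the codec groupings are the ones the list states.

```swift
// Hypothetical enums for illustration; not the engine's actual types.
enum AudioCodec {
    case aac, ac3, eac3, flac, alac, mp3            // accepted by AVPlayer in fMP4
    case trueHD, dts, dtsHDMA, pcm, mp2, vorbis     // not legal in fMP4
    case opus                                       // spec-legal, but rejected in HLS-fMP4
}

enum AudioRoute { case streamCopy, flacBridge }

/// Chooses between bit-exact stream-copy and the decode → PCM → FLAC bridge.
func route(for codec: AudioCodec) -> AudioRoute {
    switch codec {
    case .aac, .ac3, .eac3, .flac, .alac, .mp3:
        return .streamCopy
    case .trueHD, .dts, .dtsHDMA, .pcm, .mp2, .vorbis, .opus:
        return .flacBridge
    }
}
```

EAC3 lands on the stream-copy side, which is why JOC Atmos survives, while anything routed through the bridge loses object metadata at the PCM intermediate.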

Subtitles

Subtitle packets are routed through a side demuxer running at the playhead, decoded inline through avcodec_decode_subtitle2. Results land in a single [SubtitleCue] published list:

  • Text codecs (SubRip / ASS / SSA / WebVTT / mov_text) → SubtitleCue.body = .text(String). ASS override blocks stripped; \N becomes a real newline.
  • Bitmap codecs (PGS / HDMV PGS / DVB / DVD) → .image(SubtitleImage). Indexed pixel plane walked through its palette, premultiplied against alpha, wrapped as CGImage. Position normalised in [0..1] against the source frame so the host scales to any on-screen rect.
  • Sidecar files (separate .srt / .ass / .vtt URL) → selectSidecarSubtitle(url:) opens its own short-lived AVFormatContext, decodes the whole file once, atomically swaps the result into subtitleCues.
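The text clean-up described for ASS cues (override blocks stripped, \N turned into a newline) can be sketched as below. The function name is hypothetical, and the regular expression simply removes every brace-delimited block, which is an assumption about how aggressive the stripping is.

```swift
import Foundation

/// Strips ASS override blocks such as {\i1} or {\pos(10,20)} and converts
/// the two-character sequence "\N" into a real line break.
func cleanASSText(_ raw: String) -> String {
    // Remove everything between "{" and the next "}".
    let stripped = raw.replacingOccurrences(
        of: "\\{[^}]*\\}",
        with: "",
        options: .regularExpression
    )
    // Plain (non-regex) replacement of backslash + N with a newline.
    return stripped.replacingOccurrences(of: "\\N", with: "\n")
}
```

For example, `cleanASSText("{\\i1}Hello{\\i0}\\NWorld")` yields "Hello" and "World" on two lines.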

The host paints the cues with whatever style and animation it wants.
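On the host side, the normalised bitmap position has to be mapped into whatever rectangle the video occupies on screen. A minimal sketch of that mapping follows, assuming the cue position is expressed as an origin plus size in [0..1]; the `Rect` type and `place` function are hypothetical and the layout of the engine's SubtitleImage is not specified in these notes.

```swift
// Hypothetical rectangle type; a real host would likely use CGRect.
struct Rect: Equatable { var x, y, width, height: Double }

/// Scales a rect normalised against the source frame into the on-screen video rect.
func place(_ normalized: Rect, in target: Rect) -> Rect {
    Rect(
        x: target.x + normalized.x * target.width,
        y: target.y + normalized.y * target.height,
        width: normalized.width * target.width,
        height: normalized.height * target.height
    )
}
```

Because the input is resolution-independent, the same cue can be placed in a full-screen 1920×1080 rect or a picture-in-picture window without touching the bitmap's own coordinates.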

Seek

  • Native path: AVPlayer's own seek.
  • SW path: pause demux loop → flush decoders + renderer + audio renderer → seek demuxer → set skipUntilPTS so frames between keyframe-before-target and the target are dropped → jump the synchronizer clock atomically via AudioOutput.seekClock(to:rate:) so PTS-stamped samples decoded post-seek align with the master clock.
  • Backward / far-forward scrubs on the native path tear down the HLSSegmentProducer and restart it at the new segment base. Short-range forward scrubs ride the cached segment window without restart.
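The skipUntilPTS step on the SW path can be illustrated with a small filter. Only the name `skipUntilPTS` comes from the notes above; `DecodedFrame`, `SeekFilter`, and the use of seconds as the PTS unit are assumptions made for the sketch.

```swift
// Hypothetical decoded frame carrying only its presentation timestamp (seconds).
struct DecodedFrame { let pts: Double }

/// Drops frames decoded between the keyframe before the seek target and the target itself.
struct SeekFilter {
    var skipUntilPTS: Double? = nil

    mutating func shouldRender(_ frame: DecodedFrame) -> Bool {
        guard let target = skipUntilPTS else { return true }   // no seek pending
        if frame.pts < target { return false }                 // still before the target: drop
        skipUntilPTS = nil                                     // target reached: resume rendering
        return true
    }
}
```

Usage: set `skipUntilPTS` right after seeking the demuxer, then run every decoded frame through `shouldRender` before enqueueing it. The decoder still has to decode the dropped frames, since later frames reference them.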

Streaming & resilience

  • HTTP Range + chunked delegate reads via URLSession. No third-party networking layer; TLS / HTTP-3 / proxies / MDM rules ride for free.
  • Exponential backoff on transient network errors.
  • Background pause / display-link aware lifecycle.
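For readers unfamiliar with the retry policy named above, a capped exponential backoff can be computed as follows. The base delay, the cap, and the function name are illustrative assumptions; the notes do not state the engine's actual values.

```swift
import Foundation

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `cap`.
/// The defaults are assumed values, not the engine's.
func backoffDelay(attempt: Int, base: TimeInterval = 0.5, cap: TimeInterval = 30) -> TimeInterval {
    let exponent = Double(max(0, attempt))   // treat negative attempts as the first try
    return min(cap, base * pow(2.0, exponent))
}
```

With these defaults the delays run 0.5 s, 1 s, 2 s, 4 s and so on until they hit the 30 s cap.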

Dependencies

| Package | License | Purpose |
|---|---|---|
| FFmpegBuild | LGPL-3.0 | Slim FFmpeg 7.1 (avcodec / avformat / avutil / swresample / swscale): demux + HLS-fMP4 mux + AudioBridge FLAC encode + SW-path dav1d / VP9 decode + sws_scale YUV → NV12 / P010 |
| VideoToolbox | System | Native-path video decode (HW where available) |
| AVFoundation | System | AVPlayer + AVDisplayManager (native); AVSampleBufferDisplayLayer + AVSampleBufferRenderSynchronizer (SW) |
| CoreMedia | System | Sample descriptions, format-description tagging, CMTimebase |

Non-goals

  • No built-in UI, controls, transport bar, HUD.
  • No analytics, telemetry, session reporting. Wire your own to the @Published state.
  • No playlist / queue management. Call load(url:) when you want the next one.
  • No subtitle overlay. The engine emits SubtitleCue; the host paints.
  • No Metal shaders. Everything renders through Apple's native display stack.
  • No third-party networking.

Requirements

| | Min |
|---|-...

Diagnostic bundle for issue #2 (2026-05-09)

09 May 05:45

Standalone HLS-fMP4 capture from aetherctl against a Jellyfin direct-play DV Profile 8.1 source. See README.md inside the zip for what each file is and how to test on iPhone Safari.

Not a code release. Pure diagnostic artifacts. Will probably be deleted once the underlying issue is resolved.