
Standardize TFLM Reference Kernels to Single Rounding Requantization #3252

@veblush

Issue

TFLM's Reference Kernels (and TFLite's) use a legacy "Double Rounding" requantization algorithm, resulting in an off-by-one error rate of roughly 1.8%. This diverges from the mathematically correct "Single Rounding" used by TFLite's Optimized kernels, primarily XNNPack and ruy, which serve as the "Golden Reference." Context: Single Rounding was introduced in TFLite in 2021 by tensorflow/tensorflow#50290 to address tensorflow/tensorflow#25087, but it wasn't enabled by default due to concerns about regressions.
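To make the difference concrete, here is a minimal sketch of the two requantization paths. It is modeled on the gemmlowp-style fixed-point helpers and on the single-rounding variant of `MultiplyByQuantizedMultiplier`, but the function names `RequantizeDoubleRounding` and `RequantizeSingleRounding` are hypothetical and chosen only for this illustration; the actual kernel code may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Legacy step 1: rounds (a * b) / 2^31 to nearest, ties away from zero.
inline int32_t SaturatingRoundingDoublingHighMul(int32_t a, int32_t b) {
  const bool overflow = (a == b) && (a == std::numeric_limits<int32_t>::min());
  const int64_t ab = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  const int64_t nudge = ab >= 0 ? (int64_t{1} << 30) : (1 - (int64_t{1} << 30));
  const int32_t high = static_cast<int32_t>((ab + nudge) / (int64_t{1} << 31));
  return overflow ? std::numeric_limits<int32_t>::max() : high;
}

// Legacy step 2: rounds x / 2^exponent to nearest. This is the second rounding.
inline int32_t RoundingDivideByPOT(int32_t x, int exponent) {
  assert(exponent >= 0 && exponent <= 31);
  const int32_t mask = static_cast<int32_t>((int64_t{1} << exponent) - 1);
  const int32_t remainder = x & mask;
  const int32_t threshold = (mask >> 1) + (x < 0 ? 1 : 0);
  return (x >> exponent) + (remainder > threshold ? 1 : 0);
}

// Hypothetical name: double rounding (round after the multiply, then round
// again after the right shift). The caller must ensure the left shift does
// not overflow.
inline int32_t RequantizeDoubleRounding(int32_t x, int32_t quantized_multiplier,
                                        int shift) {
  const int left_shift = shift > 0 ? shift : 0;
  const int right_shift = shift > 0 ? 0 : -shift;
  return RoundingDivideByPOT(
      SaturatingRoundingDoublingHighMul(x * (1 << left_shift),
                                        quantized_multiplier),
      right_shift);
}

// Hypothetical name: single rounding (one 64-bit product, one rounding step).
inline int32_t RequantizeSingleRounding(int32_t x, int32_t quantized_multiplier,
                                        int shift) {
  assert(quantized_multiplier >= 0);
  assert(shift >= -31 && shift <= 30);
  const int64_t total_shift = 31 - shift;
  const int64_t round = int64_t{1} << (total_shift - 1);
  const int64_t result =
      x * static_cast<int64_t>(quantized_multiplier) + round;
  return static_cast<int32_t>(result >> total_shift);
}
```

For example, with `x = 5`, `quantized_multiplier = 1 << 30` (0.5 in Q31) and `shift = -1`, the effective scale is 0.25 and the exact result is 1.25. The double-rounding path first rounds 2.5 up to 3 and then rounds 1.5 up to 2, while the single-rounding path returns 1.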

This inconsistency creates a critical "Validation Trap":

  • Validation: Developers validate models using TFLite in Python (correct Single Rounding).
  • Deployment: Deploying to TFLM (incorrect Double Rounding) causes output mismatches, leading to unexpected accuracy drops and engineering churn.
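One way to detect this kind of mismatch is to compare the quantized outputs of the two runtimes element by element. The helper below is only a sketch: `MismatchStats` and `CompareQuantizedOutputs` are hypothetical names, and it assumes both outputs have already been collected as `int8_t` buffers of equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical summary of how two quantized output buffers differ.
struct MismatchStats {
  std::size_t exact = 0;       // identical values
  std::size_t off_by_one = 0;  // |difference| == 1, typical of rounding drift
  std::size_t larger = 0;      // |difference| > 1, likely a different problem
};

// Compares, for example, a TFLite output (expected) with a TFLM output (actual).
inline MismatchStats CompareQuantizedOutputs(
    const std::vector<int8_t>& expected, const std::vector<int8_t>& actual) {
  assert(expected.size() == actual.size());
  MismatchStats stats;
  for (std::size_t i = 0; i < expected.size(); ++i) {
    // Widen to int before subtracting so the difference cannot overflow int8_t.
    const int diff =
        std::abs(static_cast<int>(expected[i]) - static_cast<int>(actual[i]));
    if (diff == 0) {
      ++stats.exact;
    } else if (diff == 1) {
      ++stats.off_by_one;
    } else {
      ++stats.larger;
    }
  }
  return stats;
}
```

A high `off_by_one` count with `larger == 0` would be consistent with a rounding-mode difference rather than a conversion or kernel bug.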

Proposal: Standardize Reference Kernels to Single Rounding

I propose standardizing the Reference Kernel behavior to Single Rounding, ensuring TFLM matches the TFLite behavior.

Action Plan:

  1. Enable TFLITE_SINGLE_ROUNDING by default in the Bazel and Makefile builds. (Optionally, we can consider introducing a new switch to opt back into double rounding.)
  2. Vendor Alignment and Coordination: Propagate this standard to optimized TFLM vendor kernels (e.g., ARM CMSIS-NN, Cadence Xtensa). Coordination is essential to ensure these paths align with the new mathematical standard, maintaining ecosystem consistency.
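As a rough illustration of step 1, a default-on switch could look like the header fragment below. This is a sketch under assumptions: only the `TFLITE_SINGLE_ROUNDING` macro name comes from this issue, while the `UsesSingleRounding` helper is hypothetical and the actual build changes in Bazel and the Makefile may be structured differently.

```cpp
// Proposed default: single rounding, unless the build overrides the macro,
// for example by passing -DTFLITE_SINGLE_ROUNDING=0 to the compiler.
#ifndef TFLITE_SINGLE_ROUNDING
#define TFLITE_SINGLE_ROUNDING 1
#endif

// Hypothetical helper so that kernels and tests can query the active mode.
constexpr bool UsesSingleRounding() { return TFLITE_SINGLE_ROUNDING != 0; }
```

Keeping the override at the macro level means that builds which need the legacy behavior can opt out without changing source code.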
