Restore ResampledTSDF intermediate object on v0.2-integration #458

@R7L208

Context

The resample refactor (#428, commit d7e2446) removed the _ResampledTSDF class and instead stored resample metadata (resample_freq, resample_func) as mutable attributes on the regular TSDF class. This creates a corruption vector: __withTransformedDF() copies that metadata unchanged onto every derived TSDF, so operations chained after resample(), such as .filter(), .union(), or .withColumn(), can silently invalidate the resample context while the stale metadata is still carried forward to .interpolate().

A proposal for a proper ResampledTSDF intermediate object was written (commit 437e3b1) but never implemented. This issue tracks restoring that pattern.

The Problem

Current state on v0.2-integration:

# TSDF.__init__() accepts resample metadata (tsdf.py:127-128)
resample_freq: Optional[str] = None,
resample_func: Optional[Union[Callable, str]] = None,

# resample() returns a regular TSDF with metadata attached (resample.py:452-457)
return TSDF(
    enriched_df,
    ...
    resample_freq=freq,
    resample_func=func,
)

# __withTransformedDF() blindly propagates metadata to all derived TSDFs (tsdf.py:163-164)
resample_freq=self.resample_freq,
resample_func=self.resample_func,

This means the following produces silently wrong results:

# Metadata propagates through filter — interpolate trusts stale metadata
tsdf.resample(freq="min", func="mean").filter(...).interpolate(method="linear")
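
The failure mode can be reproduced without Spark in a toy model. The sketch below is not Tempo code: ToyTSDF, its tuple-of-rows storage, and the _with_rows helper are hypothetical stand-ins that only mimic the blind metadata copy described above.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple

Row = Tuple[int, float]  # (minute offset, value); a stand-in for a Spark row


@dataclass(frozen=True)
class ToyTSDF:
    """Hypothetical stand-in for TSDF that carries resample metadata."""

    rows: Tuple[Row, ...]
    resample_freq: Optional[str] = None
    resample_func: Optional[str] = None

    def _with_rows(self, rows) -> "ToyTSDF":
        # Mimics the blind propagation: only the rows change,
        # the resample metadata is copied over untouched.
        return replace(self, rows=tuple(rows))

    def resample(self, freq: str, func: str) -> "ToyTSDF":
        # Aggregation is elided; assume the rows are already one per minute.
        return replace(self, resample_freq=freq, resample_func=func)

    def filter(self, predicate: Callable[[Row], bool]) -> "ToyTSDF":
        return self._with_rows(r for r in self.rows if predicate(r))


regular = ToyTSDF(((0, 1.0), (1, 2.0), (2, 3.0), (3, 4.0))).resample("min", "mean")
gappy = regular.filter(lambda r: r[0] != 1)

print([minute for minute, _ in gappy.rows])  # minute 1 is no longer present
print(gappy.resample_freq)                   # metadata still claims "min"
```

After the filter, the rows are no longer evenly spaced, yet any downstream step that trusts resample_freq would treat them as a regular one-minute series.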

Proposed Solution

Introduce a ResampledTSDF class that acts as a restricted intermediate object, following the same pattern as Apache Spark's GroupedData:

| Spark Pattern | Tempo Pattern |
|---|---|
| df.groupBy("key") → GroupedData | tsdf.resample(freq, func) → ResampledTSDF |
| GroupedData.agg(...) → DataFrame | ResampledTSDF.interpolate(...) → TSDF |
| GroupedData.filter(...) → AttributeError | ResampledTSDF.filter(...) → AttributeError |

Key Changes

  1. Create ResampledTSDF class — restricted wrapper exposing only valid post-resample operations (interpolate(), as_tsdf(), show())
  2. Update TSDF.resample() — return ResampledTSDF instead of TSDF
  3. Remove resample_freq/resample_func from TSDF.__init__() — metadata lives only on ResampledTSDF, never on TSDF
  4. Remove metadata propagation from __withTransformedDF() — no more stale state
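
The four changes above might fit together roughly as follows. This is a minimal sketch under stated assumptions, not the proposed implementation: the list-backed df attribute, the private field names, and the pass-through body of TSDF.interpolate() are hypothetical, and the real classes would wrap a Spark DataFrame.

```python
from typing import Any, Callable, List


class TSDF:
    """Toy stand-in for TSDF; note there is no resample metadata here."""

    def __init__(self, df: List[Any]):
        self.df = list(df)

    def filter(self, predicate: Callable[[Any], bool]) -> "TSDF":
        return TSDF([row for row in self.df if predicate(row)])

    def resample(self, freq: str, func: str) -> "ResampledTSDF":
        # Real aggregation is elided; only the return type matters here.
        return ResampledTSDF(self, freq, func)

    def interpolate(self, *, freq: str, func: str, method: str) -> "TSDF":
        # Placeholder: data passes through unchanged. The point is that
        # freq and func must be passed explicitly by the caller.
        return TSDF(self.df)


class ResampledTSDF:
    """Restricted intermediate: the only place resample metadata lives."""

    def __init__(self, tsdf: TSDF, freq: str, func: str):
        self._tsdf = tsdf
        self._freq = freq
        self._func = func

    def interpolate(self, method: str) -> TSDF:
        # Forward the metadata captured at resample() time.
        return self._tsdf.interpolate(
            freq=self._freq, func=self._func, method=method
        )

    def as_tsdf(self) -> TSDF:
        # Explicit finalization: the metadata is dropped on the way out.
        return self._tsdf

    def show(self) -> None:
        print(self._tsdf.df)

    # Deliberately no __getattr__ delegation, so filter(), withColumn(),
    # and similar calls fail with AttributeError.
```

Composition is used here instead of subclassing TSDF so that unrelated TSDF methods are not inherited by accident.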

Valid Usage

# Chain resample → interpolate (primary use case)
result = tsdf.resample(freq="min", func="mean").interpolate(method="linear")

# Get resampled data without interpolation
resampled = tsdf.resample(freq="min", func="mean").as_tsdf()

# Inspect before interpolating
resampled = tsdf.resample(freq="min", func="mean")
resampled.show()
result = resampled.interpolate(method="linear")

Invalid Usage (Now Prevented)

# AttributeError — operations not available on ResampledTSDF
tsdf.resample(freq="min", func="mean").filter(...)
tsdf.resample(freq="min", func="mean").withColumn(...)

# If you need those operations, finalize first (explicit opt-out of safety)
tsdf.resample(freq="min", func="mean").as_tsdf().filter(...)

Why This Matters

  • Prevents silent data corruption — invalid operation chains fail loudly instead of producing wrong results
  • Type safety — IDE autocompletion only shows valid operations after resample()
  • Self-documenting — the class name and restricted API indicate the expected workflow
  • Precedent — this is exactly how Spark handles GroupedData and for the same reasons

Git History Reference

| Commit | Description |
|---|---|
| 437e3b1 | Proposal document for ResampledTSDF intermediate object pattern |
| d7e2446 | Resample refactor (#428) — removed _ResampledTSDF, added metadata attrs to TSDF |
| ec4fe38 | Original refactor that removed _ResampledTSDF class |

Implementation Checklist

  • Create ResampledTSDF class (in tempo/resampled.py or tempo/resample.py)
  • Update TSDF.resample() return type to ResampledTSDF
  • Remove resample_freq / resample_func from TSDF.__init__() and __withTransformedDF()
  • Update TSDF.interpolate() to require explicit freq/func args (called internally by ResampledTSDF.interpolate())
  • Add tests for ResampledTSDF (valid chains, invalid chains, as_tsdf() escape hatch)
  • Update existing resample/interpolation tests
  • Update documentation and migration guide
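
The test item in the checklist could start from something like the following unittest sketch. The TSDF and ResampledTSDF classes here are hypothetical list-backed stubs written only so the tests run on their own; in the repository the tests would import the real classes and use Spark fixtures.

```python
import unittest


class TSDF:
    """Hypothetical stub; the real class wraps a Spark DataFrame."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, predicate):
        return TSDF(r for r in self.rows if predicate(r))

    def resample(self, freq, func):
        return ResampledTSDF(self, freq, func)

    def interpolate(self, *, freq, func, method):
        # Stub: the data passes through unchanged.
        return TSDF(self.rows)


class ResampledTSDF:
    """Hypothetical stub of the restricted intermediate object."""

    def __init__(self, tsdf, freq, func):
        self._tsdf, self._freq, self._func = tsdf, freq, func

    def as_tsdf(self):
        return self._tsdf

    def interpolate(self, method):
        return self._tsdf.interpolate(
            freq=self._freq, func=self._func, method=method
        )


class ResampledTSDFTest(unittest.TestCase):
    def setUp(self):
        self.tsdf = TSDF([1.0, 2.0, 3.0])

    def test_valid_chain_returns_tsdf(self):
        resampled = self.tsdf.resample(freq="min", func="mean")
        self.assertIsInstance(resampled.interpolate(method="linear"), TSDF)

    def test_invalid_chain_raises(self):
        resampled = self.tsdf.resample(freq="min", func="mean")
        with self.assertRaises(AttributeError):
            resampled.filter(lambda r: r > 1.0)

    def test_as_tsdf_escape_hatch(self):
        finalized = self.tsdf.resample(freq="min", func="mean").as_tsdf()
        self.assertEqual(finalized.filter(lambda r: r > 1.0).rows, [2.0, 3.0])

    def test_tsdf_interpolate_requires_explicit_args(self):
        # freq and func are keyword-only and have no defaults in the stub.
        with self.assertRaises(TypeError):
            self.tsdf.interpolate(method="linear")
```

Saved as a module, the cases can be run with python -m unittest.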

Related

  • Proposal doc: commit 437e3b1


Labels: enhancement (New feature or request)
