Fix CSV export causing prod restarts by streaming response#707

Open
Saurabhsing21 wants to merge 2 commits into jlab-sensing:main from Saurabhsing21:export-csv

Conversation

@Saurabhsing21
Contributor

Fixes: #668

Summary

Fixes the production “Export to CSV” crash/restart by replacing the old in-memory CSV generation flow with a true streaming CSV response from the backend. The frontend now downloads the streamed file directly as a blob, with no Celery status polling.

Problem

Clicking Export to CSV in production caused the site to become unresponsive and the backend to restart (Gunicorn worker timeout / SIGKILL, likely OOM). The previous implementation built large datasets/CSVs in memory.

Solution

  • Backend GET /api/cell/datas now streams CSV rows incrementally (constant memory footprint).
  • Heavy lifting moved to the database (SQL date_trunc + avg for resampling) instead of pandas/in-memory merges.
  • DB results are iterated in chunks (yield_per + streaming execution options) to avoid buffering large result sets.
  • Frontend downloads the CSV directly with responseType: blob and uses the Content-Disposition filename.
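
For readers unfamiliar with the streaming pattern in the first bullet, here is a minimal sketch of a generator that emits CSV text one row at a time. The function name `iter_csv` and the sample filename are assumptions for illustration, not code from this PR. The Flask wiring appears only as a comment because it needs a running application.

```python
import csv
import io
from typing import Iterable, Iterator, Sequence


def iter_csv(headers: Sequence[str], rows: Iterable[Sequence[object]]) -> Iterator[str]:
    """Yield CSV text one line at a time instead of building the whole file."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)

    def flush() -> str:
        # Return what the writer produced, then reset the buffer so memory
        # use stays bounded by a single row.
        text = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return text

    writer.writerow(headers)
    yield flush()
    for row in rows:
        writer.writerow(row)
        yield flush()


# In a Flask view the generator would be wrapped roughly like this
# (not executed here; the filename is a placeholder):
#
#   from flask import Response, stream_with_context
#   return Response(
#       stream_with_context(iter_csv(headers, rows)),
#       mimetype="text/csv",
#       headers={"Content-Disposition": 'attachment; filename="export.csv"'},
#   )
```

Because `rows` can itself be a lazy iterator over database results, neither the result set nor the CSV text has to be held in memory at once.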

Key Changes

Backend

  • Reworked CSV export path in backend/api/resources/cell_data.py to:
    • stream rows (Response(stream_with_context(...)))
    • merge TEROS + power + phytos31(voltage) streams by timestamp
    • fill missing values as "void"
  • Added backward-compat fields for TEROS object units:
    • vwc_unit="%", raw_vwc_unit="raw" in backend/api/models/teros_data.py
  • Test harness improvements for external Postgres:
    • backend/tests/conftest.py supports TEST_SQLALCHEMY_DATABASE_URI
    • session cleanup to prevent “idle in transaction” locking between tests
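
The merge-by-timestamp and "void"-filling behaviour listed above can be sketched as follows. This is an illustration under assumptions, not the code in `cell_data.py`: it assumes each stream is already sorted by timestamp and yields `(timestamp, fields)` pairs, and the name `merge_by_timestamp` is hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[str, Dict[str, object]]  # (timestamp, {column: value})


def merge_by_timestamp(
    streams: Sequence[Iterable[Row]],
    columns: Sequence[str],
    fill: str = "void",
) -> Iterator[List[object]]:
    """Merge timestamp-sorted streams into one output row per timestamp."""
    # heapq.merge pulls from each input lazily, so no stream is fully buffered.
    merged = heapq.merge(*streams, key=itemgetter(0))
    for timestamp, group in groupby(merged, key=itemgetter(0)):
        values: Dict[str, object] = {}
        for _, fields in group:
            values.update(fields)
        # Columns with no reading at this timestamp get the fill marker.
        yield [timestamp] + [values.get(col, fill) for col in columns]
```

Usage with two small in-memory streams (real inputs would be database iterators):

```python
teros = [("2026-01-01T00:00", {"vwc": 0.3}), ("2026-01-01T01:00", {"vwc": 0.4})]
power = [("2026-01-01T01:00", {"v": 1.2}), ("2026-01-01T02:00", {"v": 1.1})]
for row in merge_by_timestamp([teros, power], ["vwc", "v"]):
    print(row)
```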

Frontend

  • frontend/src/services/cell.js: request /api/cell/datas as blob, parse filename from Content-Disposition
  • frontend/src/pages/dashboard/components/DownloadBtn.jsx: simplified export flow to a single streamed download

Tests

Automated

  • Backend: python3 -m pytest -q backend/tests
    • Result: 57 passed, 7 skipped
    • Note: backend/tests/test_ttn.py is skipped unless real TTN_API_KEY / TTN_APP_ID are provided (integration tests hit the real TTN API).
  • Frontend:
    • npm run lint (pass)
    • npm test -- --run (pass)
    • npm run build (pass)

Manual / Runtime

  • Large CSV export regression test:
    • Exported ~15MB (~95k rows) across many cells
    • Response streamed (Transfer-Encoding: chunked)
    • Completed without Gunicorn WORKER TIMEOUT / SIGKILL / restart

- Replace pandas/in-memory CSV generation with streaming generator in /api/cell/datas
- Push resampling/aggregation into SQL (date_trunc + avg) and iterate DB results in chunks
- Update frontend to download CSV as a blob and stop polling /api/status for export results
- Add/expand tests to cover multi-cell export, void-filling, and hour resampling

Signed-off-by: Saurabhsing21 <saurabhsingh881888@gmail.com>
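
To illustrate what pushing the resampling into SQL means, the sketch below computes hourly averages inside the database rather than in pandas. It uses the standard-library `sqlite3` module so it runs anywhere. SQLite has no `date_trunc`, so `strftime` stands in for it here, and the table and column names are hypothetical.

```python
import sqlite3

# Postgres form described in the PR (table/column names are hypothetical):
#   SELECT date_trunc('hour', ts) AS bucket, avg(vwc)
#   FROM teros_data WHERE cell_id = :cell_id
#   GROUP BY bucket ORDER BY bucket;
# SQLite stand-in: strftime truncates the timestamp to the hour.
HOURLY_AVG_SQL = """
    SELECT strftime('%Y-%m-%d %H:00:00', ts) AS bucket, avg(vwc) AS vwc
    FROM teros_data
    WHERE cell_id = ?
    GROUP BY bucket
    ORDER BY bucket
"""


def hourly_average(conn: sqlite3.Connection, cell_id: int) -> sqlite3.Cursor:
    """Return a cursor over (hour_bucket, average) rows for one cell."""
    # The cursor is iterated lazily by the caller; nothing is materialised here.
    return conn.execute(HOURLY_AVG_SQL, (cell_id,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teros_data (cell_id INTEGER, ts TEXT, vwc REAL)")
conn.executemany(
    "INSERT INTO teros_data VALUES (?, ?, ?)",
    [
        (1, "2026-01-01 00:10:00", 10.0),
        (1, "2026-01-01 00:50:00", 20.0),
        (1, "2026-01-01 01:05:00", 30.0),
        (2, "2026-01-01 00:30:00", 99.0),
    ],
)
hourly = list(hourly_average(conn, 1))
```

The application only ever sees one row per hour bucket, which is what keeps the export's memory use independent of the raw sample count.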
@Saurabhsing21
Contributor Author

Hey @aleclevy can you review this when you are free

@CODEAbhinav-art
Contributor

Hey @Saurabhsing21, just wanted to inform you so you don't waste time debugging the CI logs: those failures aren't from your code!

There's a global lint error in main right now, and the backend tests are deadlocking on everyone's PRs. I've got a hotfix open for the linter and flagged the hanging tests to the maintainers.

@jmadden173
Contributor

At a quick glance the approach seems reasonable. I put both Alec and myself on as reviewers. It's currently finals week for us, so we are both bandwidth-limited atm. Expect a review from one of us by Tuesday next week.

@CODEAbhinav-art
Contributor

Hey @Saurabhsing21, since the maintainers are busy with finals next week, I extracted your conftest.py fix into a standalone hotfix PR to unblock the CI pipeline for your PR and everyone else's. I made sure to credit you for the fix. Once they merge it, your backend tests should finally turn green!
Thanks!

@Saurabhsing21
Contributor Author

> At a quick glance the approach seems reasonable. I put both me and Alec on as reviewers. Its currently finals week for us so we are both bandwidth limited atm. Expect a review early next week by Tuesday from one of us.

Sure, thanks.

@Saurabhsing21
Contributor Author

> Hey @Saurabhsing21, as the maintainers are busy with finals next week, so to get the CI pipeline unblocked for your PR and everyone else's, I extracted your conftest.py fix into a standalone hotfix PR! I made sure to credit you for the fix. Once they merge it, your backend tests should finally turn green! Thanks!

OK, thanks.

jmadden173 added a commit that referenced this pull request Mar 21, 2026
hotfix: resolve pytest db deadlock (extracted from Saurabhsing21 PR #707
@jmadden173
Contributor

/deploy

@github-actions

PR deployed to development server

Commit: 707
SHA: 3a90c16

Triggered by: @jmadden173

Contributor

@jmadden173 jmadden173 left a comment

The implementation is sound and has been checked on our dev server. A couple of comments and TODOs before merging. Let me know if anything needs clarification.

TODO:

  • Remove this banner before we merge, since the issue it announces will be fixed:
    <Box
      sx={{
        backgroundColor: '#d32f2f',
        color: 'white',
        px: 2,
        py: 1.5,
        textAlign: 'center',
        fontFamily: 'sans-serif',
      }}
    >
      CSV export is currently non-functional. See the issue for updates:{' '}
      <a
        href='https://github.com/jlab-sensing/ENTS-backend/issues/668'
        target='_blank'
        rel='noreferrer'
        style={{ color: '#ffffff', textDecoration: 'underline', fontWeight: 'bold' }}
      >
        GitHub Issue #668
      </a>
    </Box>
  • Update the changelog with a link to the issue
  • Merge from upstream to fix conflicts and linting issue.

Comment on lines +164 to +168
sensor = Sensor.query.filter_by(
    name="phytos31",
    measurement="voltage",
    cell_id=cell_id,
).first()
Contributor

Instead of the single "phytos31" sensor, it should query the available sensors and data from the table and add the necessary columns.

Comment on lines +20 to +35
CSV_HEADERS = [
    "cell_id",
    "cell_name",
    "timestamp",
    "vwc",
    "temp",
    "ec",
    "raw_vwc",
    "v",
    "i",
    "p",
    "data",
    "measurement",
    "unit",
    "type",
]
Contributor

The CSV headers are not fixed. Based on how many "sensors" there are it will dynamically change the number of columns.
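
One possible shape for the dynamic header logic requested in these two review comments is sketched below. It assumes the caller has already collected `(name, measurement)` pairs for every sensor attached to the requested cells, for example from a query over the `Sensor` table. The helper name `build_headers` and the `name_measurement` column naming are hypothetical and may not match the project's existing CSV layout.

```python
from typing import Iterable, List, Tuple

# Columns assumed to be present in every export regardless of sensors.
BASE_HEADERS = ["cell_id", "cell_name", "timestamp"]


def build_headers(sensors: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the base columns plus one column per distinct sensor measurement."""
    seen = set()
    dynamic = []
    for name, measurement in sensors:
        key = (name, measurement)
        if key not in seen:  # several cells may report the same sensor type
            seen.add(key)
            dynamic.append(f"{name}_{measurement}")
    # Sorting keeps the column order stable between exports.
    return BASE_HEADERS + sorted(dynamic)
```

The header row is computed once before streaming starts, so every data row can then be filled against the same column list.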


# Ensure large exports don't get fully buffered in memory by the DB driver.
result = db.session.execute(stmt.execution_options(stream_results=True)).yield_per(
    1000
Contributor

Let's make the 1000 a top-level variable in the module that can be tuned if we run into issues in the future.
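
A minimal sketch of that suggestion, using the standard-library `sqlite3` cursor in place of the SQLAlchemy result so it runs standalone. The constant name `EXPORT_YIELD_PER` and the helper `iter_rows` are hypothetical. In the PR's code the constant would simply replace the literal passed to `.yield_per(...)`.

```python
import sqlite3
from typing import Iterator, Tuple

# Number of rows fetched from the database per batch. Kept at module level so
# it can be tuned in one place (hypothetical name).
EXPORT_YIELD_PER = 1000


def iter_rows(
    cursor: sqlite3.Cursor, chunk_size: int = EXPORT_YIELD_PER
) -> Iterator[Tuple]:
    """Yield rows one by one while fetching them in fixed-size batches."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            return
        yield from chunk


# SQLAlchemy equivalent of the reviewed line, shown for orientation only:
#   result = db.session.execute(
#       stmt.execution_options(stream_results=True)
#   ).yield_per(EXPORT_YIELD_PER)
```

A smaller value lowers peak memory per request at the cost of more round trips, while a larger value does the opposite.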

jmadden173 added the awaiting-changes label (For PRs requiring additional changes.) Mar 21, 2026
@aleclevy
Contributor

aleclevy commented Apr 9, 2026

@Saurabhsing21 Any way you could take a look at these?

Labels

awaiting-changes For PRs requiring additional changes.

Development

Successfully merging this pull request may close these issues.

Production website hard restarting from export to csv

4 participants