Skip to content

Conversation

@Arctic-beaver
Copy link

Prevent unhandled stream errors on media upload - ERR_STREAM_DESTROYED error

Summary

Under heavy load (100+ concurrent proxy’d connections) media uploads could crash with unhandled stream errors.
When the remote side closed the HTTP response mid-transfer, our readable ended while the file Writable still had in-flight fs.write ops. We called destroy() on the writable, and those pending writes later completed against a destroyed stream → ERR_STREAM_DESTROYED surfaced as an unhandled error.

This PR makes streaming robust by:

  • handling backpressure (drain) on every write,
  • awaiting finish/close before teardown,
  • using safe, idempotent cleanup,
  • and converting the race into a well-defined, awaited shutdown.

Context / Symptoms

Happened during bursts of image sends while proxies were flapping.

Representative logs:

TypeError: terminated: other side closed
  at ... undici ...
caused by: SocketError: other side closed

Followed by a low-level write error:

Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed
  at node:internal/fs/streams:426:23

Full errores:

"type":"Error","message":"Cannot call write after a stream was destroyed","stack":"Error [ERR_STREAM_DESTROYED]: Cannot call write after a stream was destroyed\n at node:internal/fs/streams:426:23\n at FSReqCallback.wrapper [as oncomplete] (node:fs:824:5)\n at FSReqCallback.callbackTrampoline (node:internal/async_hooks:130:17)","code":"ERR_STREAM_DESTROYED"},"msg":"Cannot call write after a stream was destroyed"

15-10 10:43:25.324: {"level":30,"time":1760514205269,"pid":19132,"hostname":"Ghostly","req":{"method":"POST","url":"/messages/send/file?instanceId=112501","query":{"instanceId":"112501"},"params":{},"headers":{"host":"localhost:20001","traceparent":"00-2a7caada3da1b345a5d33642e9125055-4f0bf49785953fef-00","content-type":"application/json; charset=utf-8","content-length":"1512"}}}

Root cause

  • undici aborts the fetch when the remote side closes: we correctly hit catch.
  • Our error path immediately called fileWriteStream.destroy().
  • But previous fileWriteStream.write() calls are non-blocking: they schedule fs.write via libuv. Those callbacks can still fire after destroy().
  • Without an error listener and without awaiting finish/close, the completion of those writes throws on a destroyed stream → unhandled rejection → process crash.

This is a classic race between readable failure and writable pending I/O.


What I changed (high level)

  1. Backpressure-aware writes

    • Wrap writes into async writeChunk(); if .write() returns false, await 'drain' before continuing.
  2. Reliable stream finalization

    • Introduce waitForCloseOrFinishStream(stream) and await it where we previously did .end()/.destroy().
    • Replace imperative destroy() + hope with Promise.allSettled([ wait(encFile), wait(origFile) ]).
  3. Graceful error path

    • On error: try .destroy() (guarded), then await finish/close on involved streams, then unlink temp files.
    • Make cleanup idempotent and exception-safe (try/catch around every step).
  4. Hashing & return semantics

    • Ensure hashing/finalization only after write stream has fully finished (post-finish/close).

Why this fixes it

  • Pending fs.write completions now resolve against a stream that we explicitly let finish (or at least reach close) instead of a just-destroyed instance.
  • All writes respect backpressure; we never overrun the writable’s internal buffer during spikes.
  • Teardown paths are awaited, idempotent, and cannot leak unhandled errors to the global handler.

Load conditions where it used to fail

  • 100+ concurrent sends through proxies
  • Rapid proxy disconnects (“other side closed”) during fetch body consumption
  • Concurrent readable abort while writable still flushing to disk

Risk & compatibility

  • Low behavioral risk; logic is more conservative about awaiting I/O completion.
  • No external API changes.
  • Tested locally under forced aborts and with parallel uploads; now yields clean, handled errors and no process-level crashes.

@whiskeysockets-bot
Copy link
Contributor

whiskeysockets-bot commented Oct 23, 2025

Thanks for opening this pull request and contributing to the project!

The next step is for the maintainers to review your changes. If everything looks good, it will be approved and merged into the main branch.

In the meantime, anyone in the community is encouraged to test this pull request and provide feedback.

✅ How to confirm it works

If you’ve tested this PR, please comment below with:

Tested and working ✅

This helps us speed up the review and merge process.

📦 To test this PR locally:

# NPM
npm install @whiskeysockets/baileys@Arctic-beaver/Baileys#fix/images-stream-writer-error

# Yarn (v2+)
yarn add @whiskeysockets/baileys@Arctic-beaver/Baileys#fix/images-stream-writer-error

# PNPM
pnpm add @whiskeysockets/baileys@Arctic-beaver/Baileys#fix/images-stream-writer-error

If you encounter any issues or have feedback, feel free to comment as well.

Salientekill pushed a commit to Salientekill/Baileys that referenced this pull request Nov 3, 2025
…R_STREAM_DESTROYED

- Add proper stream cleanup with waitForCloseOrFinishStream
- Use Promise.allSettled for safe stream closing
- Fix ERR_STREAM_DESTROYED errors in media processing
- Improve stream lifecycle management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
if (!creds.me) {
node = generateRegistrationNode(creds, config)
logger.info({ node }, 'not logged in, attempting registration...')
//logger.info({ node }, 'not logged in, attempting registration...')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you commenting this


closed = true
logger.info({ trace: error?.stack }, error ? 'connection errored' : 'connection closed')
//logger.info({ trace: error?.stack }, error ? 'connection errored' : 'connection closed')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Comment on lines -945 to -997
let didStartBuffer = false
process.nextTick(() => {
if (creds.me?.id) {
// start buffering important events
// if we're logged in
ev.buffer()
didStartBuffer = true
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove event buffering?

Comment on lines -963 to -1010
if (didStartBuffer) {
ev.flush()
logger.trace('flushed events for initial buffer')
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

export function makeCacheableSignalKeyStore(
store: SignalKeyStore,
logger?: ILogger,
prefix?: string,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need a prefix?


function getUniqueId(type: string, id: string) {
return `${type}.${id}`
return prefix ? `${prefix}-${type}.${id}` : `${type}.${id}`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Comment on lines +492 to +495
try { aes.destroy() } catch { }
try { hmac.destroy() } catch { }
try { sha256Plain.destroy() } catch { }
try { sha256Enc.destroy() } catch { }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't look very safe, we should know if any of these are erroring.

Comment on lines +222 to +230
//logger?.info(
// {
// histNotification,
// process,
// id: message.key.id,
// isLatest
// },
// 'got history notification'
//)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Salientekill pushed a commit to Salientekill/Baileys that referenced this pull request Nov 19, 2025
…ets#1969 to latest Baileys

Applied PRs:
- WhiskeySockets#2067: libsignal wasm
- WhiskeySockets#2057: emit setting events
- WhiskeySockets#1969: improve retry logic

Note: PRs WhiskeySockets#1991, WhiskeySockets#1981, WhiskeySockets#1906, WhiskeySockets#1892 have conflicts with latest Baileys version and were skipped.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants