Skip to content

Comments

feat(security): Secure audio file uploads with magic byte validation#76

Open
Krosebrook wants to merge 1 commit intomainfrom
sentinel-secure-file-upload-9125669330596242450
Open

feat(security): Secure audio file uploads with magic byte validation#76
Krosebrook wants to merge 1 commit intomainfrom
sentinel-secure-file-upload-9125669330596242450

Conversation

@Krosebrook
Copy link
Owner

@Krosebrook Krosebrook commented Feb 21, 2026

Implemented magic byte detection for audio file uploads to prevent spoofing and ensure only valid audio files are processed. This mitigates Insecure File Upload risks.


PR created automatically by Jules for task 9125669330596242450 started by @Krosebrook


Summary by cubic

Add magic-byte validation to audio uploads in /api/audio/generate-with-reference to block spoofed MIME types and only process real audio files. Builds data URLs with a trusted, detected MIME type to mitigate insecure file upload risks.

  • Bug Fixes
    • Validate uploads by inspecting file buffers with detectAudioFormat; reject unknown/unsupported formats.
    • Use trusted MIME types (wav, mp3, webm, mp4, ogg) instead of user-provided file.mimetype when creating the data URL.

Written for commit f487daa. Summary will update on new commits.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced audio format validation for reference audio uploads in the audio generation endpoint. Unsupported audio formats are now properly detected and rejected with an error, preventing processing of invalid files.

Implement magic byte validation for file uploads in `/api/audio/generate-with-reference`.
Previously, the endpoint relied on user-provided `file.mimetype` which could be spoofed.
Now, `detectAudioFormat` is used to inspect the file buffer and verify the actual content type.
The endpoint rejects unknown formats and maps detected formats to trusted MIME types.

Fixes: Insecure File Upload vulnerability
Verification: Verified with `reproduce_upload_vuln.ts` script.

Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 21, 2026 05:41
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Krosebrook, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the security posture of audio file uploads by introducing a robust validation mechanism. It leverages magic byte detection to accurately identify the format of uploaded audio files, mitigating risks associated with insecure file uploads and ensuring that only valid and expected audio content is processed by the system.

Highlights

  • Enhanced Audio Upload Security: Implemented magic byte detection for audio file uploads to validate their true format, preventing spoofing and ensuring only legitimate audio files are processed.
  • Robust File Validation: Introduced a new detectAudioFormat utility to inspect file buffers and determine the actual audio format, improving the reliability of file type identification.
Changelog
  • server/routes.ts
    • Imported the detectAudioFormat utility for audio file validation.
    • Added a new validation step for audio file uploads, using magic bytes to determine the file's true format.
    • Implemented a mapping from detected audio formats (e.g., WAV, MP3) to their standard MIME types.
    • Modified the dataUrl generation to use the validated MIME type, enhancing security and reliability.
Activity
  • PR created automatically by Jules for task 9125669330596242450 started by @Krosebrook.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 21, 2026

📝 Walkthrough

Walkthrough

The pull request adds audio format validation to the /api/audio/generate-with-reference endpoint in server/routes.ts. It detects the audio format from the uploaded reference audio buffer, maps the format to a MIME type via a local map, and rejects unsupported formats with a 400 error before processing.

Changes

Cohort / File(s) Summary
Audio Format Validation
server/routes.ts
Introduces format detection using detectAudioFormat() for reference audio uploads, adds MIME type mapping via mimeMap, replaces file.mimetype with computed mimeType in data URL construction, and adds early validation errors for unsupported formats.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 Hops of joy for audio so bright,
Formats detected, MIME types set right,
Unsupported sounds? A 400 we send,
Reference audio, now validated end-to-end! 🎵

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the main change: implementing magic byte validation for audio file uploads to enhance security.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sentinel-secure-file-upload-9125669330596242450

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f487daa0d3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1152 to +1154
const format = detectAudioFormat(file.buffer);
if (format === "unknown") {
return res.status(400).json({ message: "Invalid or unsupported audio format" });

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve AAC/FLAC support in magic-byte validation

The new detectAudioFormat check rejects any format it cannot detect, and this now blocks AAC/FLAC uploads even though the route still explicitly allows audio/aac and audio/flac in the multer fileFilter. In this endpoint those files used to pass through with file.mimetype, but after this commit they always hit the unknown branch and return 400, so users sending AAC/FLAC references lose previously working behavior.

Useful? React with 👍 / 👎.

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="server/routes.ts">

<violation number="1" location="server/routes.ts:1152">
P2: Magic-byte validation now rejects FLAC/AAC files that are still allowed by the fileFilter, causing valid uploads to fail. Either extend detectAudioFormat/mimeMap to cover FLAC/AAC or remove those mimetypes from the allowed list to keep validation consistent.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

}

// Sentinel: Validate audio format using magic bytes
const format = detectAudioFormat(file.buffer);
Copy link

@cubic-dev-ai cubic-dev-ai bot Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Magic-byte validation now rejects FLAC/AAC files that are still allowed by the fileFilter, causing valid uploads to fail. Either extend detectAudioFormat/mimeMap to cover FLAC/AAC or remove those mimetypes from the allowed list to keep validation consistent.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At server/routes.ts, line 1152:

<comment>Magic-byte validation now rejects FLAC/AAC files that are still allowed by the fileFilter, causing valid uploads to fail. Either extend detectAudioFormat/mimeMap to cover FLAC/AAC or remove those mimetypes from the allowed list to keep validation consistent.</comment>

<file context>
@@ -1147,13 +1148,34 @@ Also suggest a fitting title for the song.`;
       }
 
+      // Sentinel: Validate audio format using magic bytes
+      const format = detectAudioFormat(file.buffer);
+      if (format === "unknown") {
+        return res.status(400).json({ message: "Invalid or unsupported audio format" });
</file context>
Fix with Cubic

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces magic byte validation for audio file uploads in the music generation route, which is a significant security improvement over relying on user-provided MIME types. The implementation correctly uses a helper function to detect the format and maps it to a safe MIME type for further processing. I have suggested a minor refactor to simplify the validation logic and improve maintainability. Additionally, I noticed that the underlying magic byte detection for WAV files in the helper library could be more robust (checking for the 'WAVE' marker in addition to 'RIFF'), which would further enhance security.

Comment on lines +1152 to +1170
const format = detectAudioFormat(file.buffer);
if (format === "unknown") {
return res.status(400).json({ message: "Invalid or unsupported audio format" });
}

// Map format to mime type
const mimeMap: Record<string, string> = {
wav: "audio/wav",
mp3: "audio/mpeg",
webm: "audio/webm",
mp4: "audio/mp4",
ogg: "audio/ogg",
};
const mimeType = mimeMap[format];

// Should not happen if detectAudioFormat only returns the above types + unknown (handled above)
if (!mimeType) {
return res.status(400).json({ message: "Unsupported audio format detected" });
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The validation logic can be simplified by combining the 'unknown' check and the MIME type mapping. This reduces redundancy and makes the code more readable. Using a single check for the mapped MIME type covers both the 'unknown' case and any potential unmapped formats.

Suggested change
const format = detectAudioFormat(file.buffer);
if (format === "unknown") {
return res.status(400).json({ message: "Invalid or unsupported audio format" });
}
// Map format to mime type
const mimeMap: Record<string, string> = {
wav: "audio/wav",
mp3: "audio/mpeg",
webm: "audio/webm",
mp4: "audio/mp4",
ogg: "audio/ogg",
};
const mimeType = mimeMap[format];
// Should not happen if detectAudioFormat only returns the above types + unknown (handled above)
if (!mimeType) {
return res.status(400).json({ message: "Unsupported audio format detected" });
}
const format = detectAudioFormat(file.buffer);
const mimeMap: Record<string, string> = {
wav: "audio/wav",
mp3: "audio/mpeg",
webm: "audio/webm",
mp4: "audio/mp4",
ogg: "audio/ogg",
};
const mimeType = mimeMap[format];
if (!mimeType) {
return res.status(400).json({ message: "Invalid or unsupported audio format" });
}

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds server-side magic-byte validation for the /api/audio/generate-with-reference upload flow so the generated data URL uses a detected container type (rather than trusting the client-provided MIME type), reducing audio upload spoofing risk.

Changes:

  • Import and apply detectAudioFormat(file.buffer) to reject unknown/unsupported uploads.
  • Map detected formats to a server-chosen MIME type for the outbound data: URL.
  • Stop using file.mimetype when constructing the dataUrl.

return res.status(400).json({ message: "Reference audio file is required" });
}

// Sentinel: Validate audio format using magic bytes
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment reintroduces a branded "Sentinel:" prefix, which the codebase has been removing for consistency. Please drop the prefix and keep the comment descriptive (or remove the comment entirely if it’s no longer needed).

Suggested change
// Sentinel: Validate audio format using magic bytes
// Validate audio format using magic bytes

Copilot uses AI. Check for mistakes.
import { registerChatRoutes } from "./replit_integrations/chat";
import { registerImageRoutes } from "./replit_integrations/image";
import { aiRateLimiter, writeRateLimiter } from "./middleware";
import { detectAudioFormat } from "./replit_integrations/audio/client";
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing detectAudioFormat from server/replit_integrations/audio/client.ts pulls in module-level side effects (e.g., OpenAI client initialization) just to validate uploads. Consider moving/re-exporting detectAudioFormat from a side-effect-free utility module so route initialization doesn’t depend on audio integration setup.

Copilot uses AI. Check for mistakes.
Comment on lines +1151 to +1154
// Sentinel: Validate audio format using magic bytes
const format = detectAudioFormat(file.buffer);
if (format === "unknown") {
return res.status(400).json({ message: "Invalid or unsupported audio format" });
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation relies on detectAudioFormat(), but its current WAV detection returns "wav" for any RIFF file (it doesn’t verify the "WAVE" marker). That weakens the intended magic-byte protection. Tighten detectAudioFormat to validate the full WAV signature (RIFF + WAVE) (and similarly ensure other formats are unambiguous).

Copilot uses AI. Check for mistakes.
Comment on lines +1157 to +1165
// Map format to mime type
const mimeMap: Record<string, string> = {
wav: "audio/wav",
mp3: "audio/mpeg",
webm: "audio/webm",
mp4: "audio/mp4",
ogg: "audio/ogg",
};
const mimeType = mimeMap[format];
Copy link

Copilot AI Feb 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new magic-byte allowlist only maps wav/mp3/webm/mp4/ogg, but the multer fileFilter above still allows FLAC and AAC. As a result, FLAC/AAC uploads will now pass the MIME filter but always fail magic-byte validation. Either add FLAC/AAC detection + mimeMap entries, or remove them from the MIME allowlist to keep behavior consistent.

Copilot uses AI. Check for mistakes.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
server/routes.ts (1)

1158-1170: Record<string, string> makes mimeType always typed as string, turning the defensive !mimeType guard into dead code from TypeScript's perspective.

noUncheckedIndexedAccess is not part of the standard --strict family and must be opted in explicitly. Without it, indexing Record<string, string> always yields string (never string | undefined), so TypeScript considers if (!mimeType) unreachable. The guard is genuinely useful for future extensibility (a new format added to detectAudioFormat but missing from mimeMap), but its intent must be reflected in the type.

♻️ Proposed fix – use `Partial` to make the guard type-accurate
-      const mimeMap: Record<string, string> = {
+      const mimeMap: Partial<Record<string, string>> = {
         wav: "audio/wav",
         mp3: "audio/mpeg",
         webm: "audio/webm",
         mp4: "audio/mp4",
         ogg: "audio/ogg",
       };
       const mimeType = mimeMap[format];
-      // Should not happen if detectAudioFormat only returns the above types + unknown (handled above)
+      // Guards against a new format being added to detectAudioFormat without updating mimeMap
       if (!mimeType) {
         return res.status(400).json({ message: "Unsupported audio format detected" });
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/routes.ts` around lines 1158 - 1170, The mimeMap is currently typed as
Record<string,string>, which makes mimeType always a string and renders the
defensive if (!mimeType) unreachable; change the map's type to reflect possible
absence (e.g., use Partial<Record<string,string>> or a union-keyed Record of the
known formats) so mimeType becomes string | undefined, preserving the runtime
guard, and leave the existing check/error return in place; update the
declaration of mimeMap and the inferred type of mimeType (references: mimeMap,
mimeType, detectAudioFormat).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/routes.ts`:
- Around line 1151-1165: The fileFilter allowlist and detectAudioFormat/mimeMap
are out of sync: add support for FLAC and AAC magic-byte detection in
detectAudioFormat (so it returns "flac" and "aac" instead of "unknown" for valid
uploads) and/or remove those MIME types from the multer fileFilter; also add
"audio/webm" to the fileFilter allowlist so the "webm" branch in mimeMap is
reachable. Update detectAudioFormat (the function that checks magic bytes) to
recognize FLAC/AAC signatures and ensure mimeMap keys (wav, mp3, webm, mp4, ogg,
flac, aac) exactly match the fileFilter MIME strings to avoid the misleading 400
response after multer accepts a file.

---

Nitpick comments:
In `@server/routes.ts`:
- Around line 1158-1170: The mimeMap is currently typed as
Record<string,string>, which makes mimeType always a string and renders the
defensive if (!mimeType) unreachable; change the map's type to reflect possible
absence (e.g., use Partial<Record<string,string>> or a union-keyed Record of the
known formats) so mimeType becomes string | undefined, preserving the runtime
guard, and leave the existing check/error return in place; update the
declaration of mimeMap and the inferred type of mimeType (references: mimeMap,
mimeType, detectAudioFormat).

Comment on lines +1151 to +1165
// Sentinel: Validate audio format using magic bytes
const format = detectAudioFormat(file.buffer);
if (format === "unknown") {
return res.status(400).json({ message: "Invalid or unsupported audio format" });
}

// Map format to mime type
const mimeMap: Record<string, string> = {
wav: "audio/wav",
mp3: "audio/mpeg",
webm: "audio/webm",
mp4: "audio/mp4",
ogg: "audio/ogg",
};
const mimeType = mimeMap[format];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Misalignment between fileFilter allowlist and detectAudioFormat creates a functional regression and a dead code path.

Two concrete problems introduced by this change:

  1. FLAC/AAC regressionaudio/flac and audio/aac are accepted by the pre-existing fileFilter (line 1135), but detectAudioFormat has no magic-byte patterns for either and returns "unknown", causing all valid FLAC/AAC uploads to be rejected with a misleading 400 after passing the multer check.

  2. WebM dead-enddetectAudioFormat and mimeMap both support "webm", but "audio/webm" is absent from the fileFilter allowlist. Legitimate browser-recorded WebM audio (e.g., from MediaRecorder) is rejected by multer before ever reaching the magic-byte check, making the webm branch in mimeMap unreachable via a correctly labeled upload.

The fix is to align the two lists:

🛠️ Proposed fix – sync fileFilter with detectAudioFormat's supported formats
-      const allowed = ["audio/mpeg", "audio/wav", "audio/mp3", "audio/x-wav", "audio/ogg", "audio/flac", "audio/mp4", "audio/aac"];
+      // Allow only MIME types that map to formats detectAudioFormat can verify via magic bytes
+      const allowed = ["audio/mpeg", "audio/wav", "audio/mp3", "audio/x-wav", "audio/ogg", "audio/mp4", "audio/webm", "audio/webm;codecs=opus"];
       if (allowed.includes(file.mimetype)) {
         cb(null, true);
       } else {
-        cb(new Error("Only audio files are allowed (MP3, WAV, OGG, FLAC, AAC)"));
+        cb(new Error("Only audio files are allowed (MP3, WAV, OGG, MP4/M4A, WebM)"));
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/routes.ts` around lines 1151 - 1165, The fileFilter allowlist and
detectAudioFormat/mimeMap are out of sync: add support for FLAC and AAC
magic-byte detection in detectAudioFormat (so it returns "flac" and "aac"
instead of "unknown" for valid uploads) and/or remove those MIME types from the
multer fileFilter; also add "audio/webm" to the fileFilter allowlist so the
"webm" branch in mimeMap is reachable. Update detectAudioFormat (the function
that checks magic bytes) to recognize FLAC/AAC signatures and ensure mimeMap
keys (wav, mp3, webm, mp4, ogg, flac, aac) exactly match the fileFilter MIME
strings to avoid the misleading 400 response after multer accepts a file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant