
[K9VULN-12862] Skip crawling of non-UTF-8 paths #868

Merged
jasonforal merged 4 commits into main from jf/K9VULN-12862 on Mar 31, 2026

Conversation

@jasonforal
Collaborator

@jasonforal jasonforal commented Mar 30, 2026

What problem are you trying to solve?

We currently panic when encountering a path with non-UTF-8 characters while crawling the directory for files to scan.

What is your solution?

  • Skip non-UTF-8 paths
  • If any other error occurs, exit with an error instead of panicking.
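
A minimal sketch of the skip behavior, using only the standard library. The helper name `keep_utf8_paths` and its signature are hypothetical and are not taken from the PR; the real `get_files()` may be structured differently. It keeps paths that convert cleanly to `&str` and drops the rest with a warning rather than failing.

```rust
use std::path::PathBuf;

/// Hypothetical helper (not the PR's actual code): keep only the paths that
/// are valid UTF-8 and skip the others instead of returning an error.
fn keep_utf8_paths<I>(candidates: I) -> Vec<String>
where
    I: IntoIterator<Item = PathBuf>,
{
    let mut kept = Vec::new();
    for path in candidates {
        // `Path::to_str` returns `None` when the path is not valid UTF-8.
        match path.to_str() {
            Some(utf8) => kept.push(utf8.to_owned()),
            None => {
                // `display()` renders the path lossily, which is fine for a log line.
                eprintln!("skipping non-UTF-8 path: {}", path.display());
            }
        }
    }
    kept
}
```

Matching on `to_str()` keeps the decision local to each entry, so one bad file name does not abort the whole crawl.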

Alternatives considered

What the reviewer should know

  • This is consistent with how we drop lines containing non-UTF-8 paths from the .gitignore.
  • I updated a diff-aware function to use &Path instead of &str (i.e. it no longer tries to convert to UTF-8). In practice, a non-UTF-8 PathBuf is impossible there because the data is populated from a JSON response from the backend.
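
The Path-based comparison could look roughly like the sketch below. The function `is_changed_file` and its parameters are hypothetical stand-ins, not the signature of `filter_files_by_diff_aware_info()`. It assumes the set of changed files holds paths relative to a base directory.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a diff-aware filter (names are assumptions).
/// Compares paths as `Path` values, so no UTF-8 conversion is required.
fn is_changed_file(base_dir: &Path, file: &Path, changed: &HashSet<PathBuf>) -> bool {
    match file.strip_prefix(base_dir) {
        // The file lives under `base_dir`: look up its relative path.
        Ok(relative) => changed.contains(relative),
        // The file is outside `base_dir`: treat it as unchanged instead of panicking.
        Err(_) => false,
    }
}
```

Because `PathBuf` borrows as `Path`, the set lookup works directly on the stripped path without allocating a string.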

Copilot AI review requested due to automatic review settings March 30, 2026 22:10
@jasonforal jasonforal requested a review from a team as a code owner March 30, 2026 22:10
@datadog-datadog-prod-us1

datadog-datadog-prod-us1 bot commented Mar 30, 2026

🎯 Code Coverage (details)
Patch Coverage: 90.48%
Overall Coverage: 85.01% (-0.00%)

🔗 Commit SHA: 47a9544

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a panic that occurs when encountering paths with non-UTF-8 characters during directory crawling. The solution gracefully skips such paths and uses proper error handling patterns (context with the ? operator) instead of expect() or unwrap().

Changes:

  • Modified get_files() to skip non-UTF-8 paths using pattern matching instead of returning errors
  • Modified filter_files_by_diff_aware_info() to handle path stripping failures gracefully by returning false instead of panicking
  • Updated error handling in the binary to use .context() with the ? operator for proper error propagation
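
To illustrate the difference between panicking and propagating, here is a standard-library-only sketch. The PR itself uses `.context()` with `?`; the `CrawlError` type and the `list_entries` and `exit_status` functions below are hypothetical and only approximate that pattern by attaching a message to the underlying I/O error.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical error type: an I/O error plus a human-readable context message.
#[derive(Debug)]
struct CrawlError {
    context: String,
    source: io::Error,
}

impl fmt::Display for CrawlError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.context, self.source)
    }
}

impl Error for CrawlError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Wraps an I/O error with the directory that was being read.
fn with_context(dir: &Path, source: io::Error) -> CrawlError {
    CrawlError {
        context: format!("unable to read {}", dir.display()),
        source,
    }
}

/// Lists a directory, propagating failures with `?` instead of `expect()`.
fn list_entries(dir: &Path) -> Result<Vec<PathBuf>, CrawlError> {
    let mut entries = Vec::new();
    for entry in fs::read_dir(dir).map_err(|e| with_context(dir, e))? {
        entries.push(entry.map_err(|e| with_context(dir, e))?.path());
    }
    Ok(entries)
}

/// Maps the result to a process exit status: 0 on success, 1 on error.
fn exit_status(dir: &Path) -> i32 {
    match list_entries(dir) {
        Ok(entries) => {
            println!("found {} entries", entries.len());
            0
        }
        Err(err) => {
            eprintln!("error: {err}");
            1
        }
    }
}
```

With this shape, a caller such as `main` can pass the returned status to `std::process::exit`, so a failed crawl produces an error message and a non-zero exit rather than a panic.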

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

File | Description
crates/cli/src/file_utils.rs | Changed error handling to skip non-UTF-8 paths in get_files() and changed path comparison in filter_files_by_diff_aware_info() from string-based to Path-based
crates/bins/src/bin/datadog-static-analyzer.rs | Updated error handling from .expect() to .context() with the ? operator for better error propagation


@jasonforal jasonforal merged commit e05d44d into main Mar 31, 2026
90 checks passed
@jasonforal jasonforal deleted the jf/K9VULN-12862 branch March 31, 2026 13:39

3 participants