fix: prevent response file overwrite when -sd flag is used #2226
base: dev
Conversation
Walkthrough

Adds explicit duplicate detection and counting for input targets, propagates duplicates when deduplication is disabled, refactors per-target processing into a repeatable runner that executes each target its counted number of times, changes response-saving to exclusive-create with incremental suffixes to avoid overwrites, and updates tests to assert the new duplicate error behavior.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Runner
    participant FS as FileSystem
    Note over Runner: Input parsing & counting
    User->>Runner: supply targets (may include duplicates) + options
    Runner->>Runner: countTargetFromRawTarget(raw)
    alt target exists
        Runner-->>Runner: returns duplicateTargetErr
        alt SkipDedupe enabled
            Runner->>Runner: increment stored count for target
        else
            Runner->>Runner: ignore duplicate (no count increase)
        end
    else new target
        Runner->>Runner: store target with count=1
    end
    Note over Runner,FS: Processing phase (per-target repeats)
    loop i = 1..count
        Runner->>Runner: runProcess(1) - perform enumeration for target
        Runner->>FS: attempt create response file (O_EXCL)
        alt creation collision
            FS->>FS: compute filename with suffix (_1, _2, ...)
            FS->>Runner: return newly created file
        else success
            FS->>Runner: file created
        end
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 1
🧹 Nitpick comments (1)
runner/runner.go (1)
2152-2176: Consider adding a safety limit to prevent potential infinite loops.

While the file creation logic correctly handles duplicates, there's no upper limit on the suffix counter. In edge cases or under concurrent load, this could theoretically loop indefinitely.
Consider adding a reasonable upper limit:
```diff
 finalPath := responsePath
 idx := 0
+const maxSuffixAttempts = 1000
 for {
+	if idx >= maxSuffixAttempts {
+		gologger.Error().Msgf("Exceeded maximum attempts to create unique file for '%s'", responsePath)
+		break
+	}
 	targetPath := finalPath
 	if idx > 0 {
 		basePath := strings.TrimSuffix(responsePath, ".txt")
 		targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
 	}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- runner/runner.go (6 hunks)
- runner/runner_test.go (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
runner/runner_test.go (1)
runner/options.go (1)
Options (173-351)
🔇 Additional comments (5)
runner/runner.go (4)
448-449: LGTM! Good use of a sentinel error for explicit duplicate detection.

The error declaration follows Go's best practices for sentinel errors.
642-645: Good improvement to error handling!

Making duplicate detection explicit with an error return is better than silently returning 0. This allows callers to handle duplicates appropriately based on their needs.
1245-1267: Well-structured refactoring for repeated target processing!

The `runProcess` helper function cleanly encapsulates the logic for processing targets multiple times based on their count. Good handling of edge cases with the default to 1 when parsing fails.
455-465: Consider initializing counts consistently.

The current implementation always sets the initial count to "1" for new targets (line 458), but this might not accurately reflect the actual number of times a target appears in the input. Additionally, the error from `strconv.Atoi` at line 461 is ignored, which could lead to unexpected behavior if the stored value is corrupted.

Consider this more robust implementation:

```diff
-	r.hm.Set(target, []byte("1")) //nolint
+	r.hm.Set(target, []byte("1"))
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
 	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
+		cnt, err := strconv.Atoi(string(v))
+		if err != nil {
+			// Handle corrupted count, default to 1
+			cnt = 1
+		}
 		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
 		numHosts += 1
 	}
```

Likely an incorrect or invalid review comment.
runner/runner_test.go (1)
10-10: Test updates correctly reflect the new duplicate handling behavior!

Good changes:
- Explicitly setting `SkipDedupe: false` makes the test's intent clear
- Using `errors.Is` is the correct way to check for sentinel errors in Go
- The test properly validates that duplicates are detected and return the expected error

Also applies to: 128-130, 145-145
```diff
 expandedTarget, err := r.countTargetFromRawTarget(target)
 if err == nil && expandedTarget > 0 {
 	numTargets += expandedTarget
-	r.hm.Set(target, nil) //nolint
+	r.hm.Set(target, []byte("1")) //nolint
+} else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
+	if v, ok := r.hm.Get(target); ok {
+		cnt, _ := strconv.Atoi(string(v))
+		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
+		numTargets += 1
+	}
+}
```
🛠️ Refactor suggestion
Extract duplicate counting logic to reduce code duplication.
This logic is duplicated from prepareInput. Consider extracting it into a helper method to maintain DRY principles and ensure consistent behavior.
Consider creating a helper method:

```go
func (r *Runner) incrementTargetCount(target string) {
	if v, ok := r.hm.Get(target); ok {
		cnt, err := strconv.Atoi(string(v))
		if err != nil {
			cnt = 1
		}
		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
	}
}
```

Then use it in both locations:
```diff
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
-	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
-		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
-		numTargets += 1
-	}
+	r.incrementTargetCount(target)
+	numTargets += 1
 }
```

🤖 Prompt for AI Agents
In runner/runner.go around lines 622 to 632, the logic for incrementing the
count of duplicate targets is duplicated from the prepareInput method. Extract
this duplicate counting logic into a new helper method on the Runner struct, for
example incrementTargetCount, which safely retrieves the current count, handles
conversion errors by defaulting to 1, increments the count, and updates the map.
Replace the duplicated code in both places with calls to this new helper method
to adhere to DRY principles and maintain consistent behavior.
Actionable comments posted: 0
♻️ Duplicate comments (2)
runner/runner.go (2)
658-663: Fix error handling in duplicate count increment.

Same issue as in `prepareInput` (lines 492-495): the error from `strconv.Atoi` is ignored, which can cause the count to reset to 1 instead of incrementing.

Apply this diff:

```diff
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
 	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
+		cnt, err := strconv.Atoi(string(v))
+		if err != nil {
+			cnt = 1
+		}
 		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
 		numTargets += 1
 	}
 }
```
492-495: Fix error handling in duplicate count increment.

The error from `strconv.Atoi` is silently ignored. If the stored value is corrupted (not a valid integer), `cnt` will be 0, and storing `cnt+1` (which is 1) will reset the count instead of incrementing it. This causes undercounting when duplicates are encountered.

Apply this diff:

```diff
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
 	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
+		cnt, err := strconv.Atoi(string(v))
+		if err != nil {
+			cnt = 1
+		}
 		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
 		numHosts += 1
 	}
 }
```
🧹 Nitpick comments (1)
runner/runner.go (1)
2235-2258: Add upper bound to file suffix retry loop.

The loop has no maximum iteration limit. While unlikely in practice, if many duplicate responses accumulate or if there's a bug, this could iterate excessively. Consider adding a reasonable upper bound (e.g., 1000 or 10000 attempts) and logging a warning if the limit is reached.

Apply this diff:

```diff
 finalPath := responsePath
 idx := 0
+const maxSuffixAttempts = 1000
 for {
+	if idx >= maxSuffixAttempts {
+		gologger.Error().Msgf("Exceeded maximum suffix attempts (%d) for response file '%s'", maxSuffixAttempts, responsePath)
+		break
+	}
 	targetPath := finalPath
 	if idx > 0 {
 		basePath := strings.TrimSuffix(responsePath, ".txt")
 		targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
 	}
 	f, err := os.OpenFile(targetPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0644)
 	if err == nil {
 		_, writeErr := f.Write(data)
 		f.Close()
 		if writeErr != nil {
 			gologger.Error().Msgf("Could not write to '%s': %s", targetPath, writeErr)
 		}
 		break
 	}
 	if !os.IsExist(err) {
 		gologger.Error().Msgf("Failed to create file '%s': %s", targetPath, err)
 		break
 	}
 	idx++
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- runner/runner.go (5 hunks)
🔇 Additional comments (1)
runner/runner.go (1)
1316-1338: Clean implementation of per-target iteration.

The nested `runProcess` function effectively handles repeated processing based on the duplicate count. The error handling when retrieving the count from the map (line 1334) is correct and defensive.
Mzack9999
left a comment
lgtm!
Follow ups:
- Index file generation always points to the same original sha1 filename:

```text
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
/Users/user/go/src/github.com/projectdiscovery/httpx/cmd/httpx/output/response/localhost_8000/59bd7616010ed02cd66f44e94e9368776966fe3b.txt http://localhost:8000 (200 OK)
```
Should the entries in the index file match the files under the response output directory? Currently, it looks like a new index file is created for every request.

Round 1

Round 2
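The follow-up above stems from response filenames being derived from a SHA1 of the URL, so every request to the same URL maps to one path. A quick illustration (the `hex(sha1(url)) + ".txt"` convention is inferred from the index listing, not taken from httpx's source):

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// responseFileName derives a response filename the way the index listing
// suggests: the hex-encoded SHA1 of the URL plus ".txt". Because the hash is
// deterministic, repeated requests to one URL always collide on one path.
func responseFileName(url string) string {
	sum := sha1.Sum([]byte(url))
	return hex.EncodeToString(sum[:]) + ".txt"
}

func main() {
	a := responseFileName("http://localhost:8000")
	b := responseFileName("http://localhost:8000")
	fmt.Println(a == b) // true: same URL, same file, hence the suffix scheme
}
```

This is exactly why the PR adds `_1`, `_2`, ... suffixes at write time, and why the index entries still need to be updated to point at the suffixed files.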
Summary
This PR addresses issue #2152, which causes response files to be overwritten even when the `-sd` (SkipDedupe) flag is used.

What was the issue?
When the `-sd` flag is enabled, input targets are processed without deduplication. However, the responses are still written to the same file path based on the SHA1 hash of the URL, leading to overwritten output files.

What's changed?
- Response files that would collide now receive incremental suffixes (`_1`, `_2`, ...).
- Duplicate occurrences of a target are counted in the `HybridMap`, and the number of processing iterations is determined based on this count.
- Updated `countTargetFromRawTarget` to return a known `duplicateTargetErr` when deduplication is disabled.
- Refactored `analyze` and `process` to ensure unique file writes under concurrency.
This change ensures:

- Duplicate targets are processed the expected number of times when using the `-sd` flag.
- Each response is written to its own file instead of overwriting earlier ones.
Related issue
Closes #2152
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests