Commit cad6790

feat: add a file discovery package
1 parent 6dfcfee commit cad6790

3 files changed: +846 -0 lines changed

pkg/ecosystems/discovery/README.md

Lines changed: 171 additions & 0 deletions
@@ -0,0 +1,171 @@
# File Discovery Package

Efficient file discovery utilities for finding manifest and configuration files in directory trees.

## Features

- **Multiple Target Files**: Find specific files by path
- **Multiple Glob Patterns**: Find files matching any of multiple patterns (e.g., `*.py`, `*.toml`)
- **Flexible Combination**: Use both target files and glob patterns together
- **Automatic Deduplication**: Returns unique results when multiple criteria match the same file
- **Exclude Patterns**: Skip directories and files using glob patterns
- **Context Support**: Cancellable operations for long-running searches
- **Efficient Traversal**: Uses `filepath.WalkDir`, which avoids an `os.Lstat` call on every visited entry
- **Symlink Handling**: Optional symlink following
- **Structured Logging**: `slog` integration for debugging

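To make the combination and deduplication behaviour concrete, here is a minimal, self-contained sketch. It is not the package's implementation: `selectFiles` is a hypothetical helper, and it assumes that target files are compared against slash-separated paths relative to the root while include globs are matched against the base name.

```go
package main

import (
	"fmt"
	"path"
)

// selectFiles returns the candidates that equal a target path or whose base
// name matches an include glob. Each path appears at most once, even when it
// satisfies several criteria. Illustrative only, not the package's real code.
func selectFiles(candidates, targets, includes []string) ([]string, error) {
	seen := make(map[string]struct{}, len(candidates))
	var out []string

	// add appends p only the first time it is seen.
	add := func(p string) {
		if _, dup := seen[p]; dup {
			return
		}
		seen[p] = struct{}{}
		out = append(out, p)
	}

	for _, p := range candidates {
		// Criterion 1: exact target file (assumed to be a relative path).
		for _, t := range targets {
			if p == t {
				add(p)
			}
		}
		// Criterion 2: any include glob, matched against the base name.
		for _, pat := range includes {
			ok, err := path.Match(pat, path.Base(p))
			if err != nil {
				return nil, fmt.Errorf("invalid pattern %q: %w", pat, err)
			}
			if ok {
				add(p)
			}
		}
	}
	return out, nil
}

func main() {
	candidates := []string{"requirements.txt", "requirements-dev.txt", "src/app.py", "README.md"}
	got, err := selectFiles(candidates,
		[]string{"requirements.txt"},
		[]string{"requirements*.txt", "*.py"})
	if err != nil {
		panic(err)
	}
	// requirements.txt satisfies both the target and the first glob,
	// but it is reported only once.
	fmt.Println(got)
}
```

The `seen` set is what keeps a file that satisfies several criteria from being reported more than once.
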
## Usage

The package uses the functional options pattern for clean and idiomatic configuration.

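For readers unfamiliar with the pattern, the sketch below shows how functional options are commonly wired up in Go. The option names mirror the ones used in this README, but the `config` struct, its fields, and `newConfig` are assumptions made for illustration; the package's real internals are not shown here.

```go
package main

import "fmt"

// config is a hypothetical settings struct; the real package's fields
// are not documented in this README.
type config struct {
	targets        []string
	includes       []string
	excludes       []string
	followSymlinks bool
}

// Option mutates a config. Each With* constructor returns one.
type Option func(*config)

func WithTargetFile(name string) Option {
	return func(c *config) { c.targets = append(c.targets, name) }
}

func WithIncludes(patterns ...string) Option {
	return func(c *config) { c.includes = append(c.includes, patterns...) }
}

func WithExclude(pattern string) Option {
	return func(c *config) { c.excludes = append(c.excludes, pattern) }
}

func WithFollowSymlinks(follow bool) Option {
	return func(c *config) { c.followSymlinks = follow }
}

// newConfig applies the options in order on top of zero-value defaults.
func newConfig(opts ...Option) config {
	var c config
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(
		WithTargetFile("requirements.txt"),
		WithIncludes("*.py", "*.toml"),
		WithExclude("node_modules"),
		WithFollowSymlinks(true),
	)
	fmt.Printf("%+v\n", c)
}
```

Because every option appends to or sets a field, repeated options accumulate, which is consistent with the single and variadic forms being interchangeable in the examples that follow.
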
### Find a Specific File

```go
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithTargetFile("requirements.txt"))
```

### Find Multiple Specific Files

```go
// Multiple individual options
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithTargetFile("requirements.txt"),
	discovery.WithTargetFile("setup.py"),
	discovery.WithTargetFile("pyproject.toml"))

// Or use variadic form
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithTargetFiles("requirements.txt", "setup.py", "pyproject.toml"))
```

### Find Files Matching Pattern

```go
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("requirements*.txt"))
```

### Find Files Matching Multiple Patterns

```go
// Multiple individual patterns
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("*.py"),
	discovery.WithInclude("*.toml"),
	discovery.WithInclude("*.yml"))

// Or use variadic form
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithIncludes("*.py", "*.toml", "*.yml"))
```

### Combine Target Files and Globs

```go
// Find specific files AND all files matching patterns
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithTargetFile("requirements.txt"),
	discovery.WithInclude("*.py"),
	discovery.WithInclude("*.toml"))
// Returns: requirements.txt + all .py files + all .toml files (deduplicated)
```

### Exclude Patterns

```go
// Single exclude pattern
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("requirements.txt"),
	discovery.WithExclude("node_modules")) // Excludes node_modules directory
```

### Multiple Exclude Patterns

```go
// Multiple individual exclude patterns
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("*.py"),
	discovery.WithExclude("node_modules"),
	discovery.WithExclude(".*"), // Exclude hidden files and directories
	discovery.WithExclude("__pycache__"))

// Or use variadic form
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("*.py"),
	discovery.WithExcludes("node_modules", ".*", "__pycache__"))
```

### Follow Symlinks

```go
results, err := discovery.FindFiles(ctx, "/path/to/project",
	discovery.WithInclude("*.txt"),
	discovery.WithFollowSymlinks(true))
```

### Common Exclude Patterns

```go
// Exclude hidden files and directories
WithExclude(".*")

// Exclude specific directory
WithExclude("node_modules")

// Exclude file type
WithExclude("*.tmp")

// Exclude multiple patterns at once
WithExcludes("node_modules", ".*", "__pycache__", "*.pyc")
```

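The README does not state which glob syntax the exclude patterns use. Assuming they follow Go's standard `path.Match` / `filepath.Match` rules and are tested against a single path element (both assumptions), the hypothetical `excluded` helper below shows which names the patterns above would catch.

```go
package main

import (
	"fmt"
	"path"
)

// excluded reports whether name matches any of the exclude globs.
// Malformed patterns are returned as errors rather than ignored.
func excluded(name string, patterns ...string) (bool, error) {
	for _, pat := range patterns {
		ok, err := path.Match(pat, name)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	patterns := []string{"node_modules", ".*", "__pycache__", "*.pyc"}
	names := []string{".git", ".env", "node_modules", "app.pyc", "app.py", "my_node_modules"}
	for _, name := range names {
		skip, err := excluded(name, patterns...)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-16s excluded=%v\n", name, skip)
	}
}
```

Two details follow from these rules: `.*` matches hidden files as well as hidden directories, and `*` never matches a path separator, so `*.tmp` matches `a.tmp` but not `build/a.tmp` unless the pattern is applied to the base name.
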
## Performance

- Uses `filepath.WalkDir` instead of `filepath.Walk`, avoiding an `os.Lstat` call on every visited file or directory
- Skips entire directory trees when a directory is excluded
- Minimal allocations for large directory structures
- Supports context cancellation for early termination

## Return Value

Returns `[]FindResult` where each result contains:
- `Path`: Absolute path to the file
- `RelPath`: Relative path from the root directory

## Error Handling

- Returns an error for invalid patterns or an inaccessible root directory
- Logs warnings for inaccessible files/directories but continues walking
- Returns `context.Canceled` if the operation is cancelled

## Examples

### Find Python Manifest Files

```go
ctx := context.Background()

// Find all Python manifest files
results, err := discovery.FindFiles(ctx, projectDir,
	discovery.WithTargetFile("requirements.txt"),                      // Exact file
	discovery.WithIncludes("requirements*.txt", "*.toml", "setup.py"), // Patterns
	discovery.WithExcludes(".venv", "__pycache__", "*.pyc"))           // Exclude virtual env and build artifacts

if err != nil {
	return err
}

for _, result := range results {
	fmt.Printf("Found: %s (at %s)\n", result.RelPath, result.Path)
}
```

### Find Configuration Files Across Ecosystems

```go
// Find manifest files for multiple package managers
results, err := discovery.FindFiles(ctx, projectDir,
	discovery.WithTargetFiles("package.json", "go.mod", "Gemfile", "pom.xml"),
	discovery.WithIncludes("*.csproj", "*.gradle", "*.toml"))
```
