Merged
44 commits
542ae20
initial gitbucket implementation with PAT auth
dhirenmathur Oct 21, 2025
cc3fb9f
Gitbucket support
dhirenmathur Oct 27, 2025
21aad2f
Merge cc3fb9ff8b38b9719a090b1e44c6674bbac07157 into 5e3aca3795670d565…
dhirenmathur Oct 27, 2025
6c2294a
chore: Auto-fix pre-commit issues
github-actions[bot] Oct 27, 2025
e295fe6
Remove .codex from repo
dhirenmathur Oct 27, 2025
5cf047e
remove gitbucket webhook and unnecessary docs
dhirenmathur Oct 27, 2025
6cf0796
Merge branch 'gitbucket' of https://github.com/potpie-ai/potpie into …
dhirenmathur Oct 27, 2025
b4cec92
Merge 6cf0796c003e96f7fa0fbaa9b25919838a811400 into 5e3aca3795670d565…
dhirenmathur Oct 27, 2025
b7423a9
chore: Auto-fix pre-commit issues
github-actions[bot] Oct 27, 2025
dcc290f
fix reparse issue
dhirenmathur Oct 27, 2025
6f80d75
Fix gitbucket compatibility in tools
dhirenmathur Oct 27, 2025
81433ca
Merge branch 'gitbucket' of https://github.com/potpie-ai/potpie into …
dhirenmathur Oct 27, 2025
07a9469
Merge 81433cafe9f01fb30667a69ba0de119ad23968a1 into 0dea48e065bd15c5a…
dhirenmathur Oct 27, 2025
af65c7e
chore: Auto-fix pre-commit issues
github-actions[bot] Oct 27, 2025
84f9e90
Add local provider support and harden code tooling
dhirenmathur Oct 28, 2025
60b924b
Merge branch 'main' into gitbucket
dhirenmathur Oct 28, 2025
3000fc6
Merge 60b924bdc2aa4db6b7f1dfe5c7a5cdba06f901e7 into 56adc4f76ef49999b…
dhirenmathur Oct 28, 2025
e686369
chore: Auto-fix pre-commit issues
github-actions[bot] Oct 28, 2025
0827525
restore github access
dhirenmathur Oct 31, 2025
6bf07c3
Merge 0827525d43b91b6b6797cef98f376d4017dfa119 into 56adc4f76ef49999b…
dhirenmathur Oct 31, 2025
c750e2f
chore: Auto-fix pre-commit issues
github-actions[bot] Oct 31, 2025
0c898e7
update exteption handling
dhirenmathur Oct 31, 2025
defd6a3
update github auth
dhirenmathur Nov 3, 2025
d24afdf
Merge branch 'gitbucket' of https://github.com/potpie-ai/potpie into …
dhirenmathur Nov 4, 2025
8ba63e0
Merge d24afdf53739d480ab6e0545be5a96ead46673b5 into 56adc4f76ef49999b…
dhirenmathur Nov 4, 2025
1eda9a0
chore: Auto-fix pre-commit issues
github-actions[bot] Nov 4, 2025
3debd92
fix: make GitHub authentication production-ready
dhirenmathur Nov 5, 2025
f89b0a1
Update readme
dhirenmathur Nov 5, 2025
9d1611d
Merge branch 'gitbucket' of https://github.com/potpie-ai/potpie into …
dhirenmathur Nov 5, 2025
6432898
Merge 9d1611d7d7df3c50b1028eb257a9ed0e40b75473 into 56adc4f76ef49999b…
dhirenmathur Nov 5, 2025
8cdd3a5
chore: Auto-fix pre-commit issues
github-actions[bot] Nov 5, 2025
a2ed19e
feat: add grep tool
nndn3 Nov 7, 2025
7bd8053
feat: update jenkins pipeline to support arbitrary branch
nndn3 Nov 11, 2025
22d2833
feat: use gvisor in dockerfile
nndn3 Nov 11, 2025
1091ad1
fix: staging jenkins
nndn3 Nov 11, 2025
356e3e5
fix: repo manager build
nndn3 Nov 12, 2025
0bf6485
Merge branch 'main' into feat/grep
nndn3 Nov 12, 2025
385ce59
Merge 0bf6485a9f3626e28c73e0e9801c6ab9d0005563 into f200b371f0f0cc64d…
nndn Nov 12, 2025
c6c09b3
chore: Auto-fix pre-commit issues
github-actions[bot] Nov 12, 2025
a8ca0a3
feat: fix parsing helper
nndn3 Nov 12, 2025
7c67f5e
fix: review comments
nndn Nov 17, 2025
309f77b
Merge 7c67f5ebe4d41aaa84d18fd233e89a74cb975437 into c9a2acb0cd7f09663…
nndn Nov 17, 2025
fc5d140
chore: Auto-fix pre-commit issues
github-actions[bot] Nov 17, 2025
7781a3c
Merge branch 'main' into feat/grep
nndn Nov 20, 2025
12 changes: 8 additions & 4 deletions .env.template
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,8 @@ FIREBASE_SERVICE_ACCOUNT=
KNOWLEDGE_GRAPH_URL=
GITHUB_APP_ID=
GITHUB_PRIVATE_KEY=
GH_TOKEN_LIST= # Comma-separated GitHub PAT tokens for github.com (e.g., ghp_token1,ghp_token2)
# Comma-separated GitHub PAT tokens for github.com (e.g., ghp_token1,ghp_token2)
GH_TOKEN_LIST=
TRANSACTION_EMAILS_ENABLED=
EMAIL_FROM_ADDRESS=
RESEND_API_KEY=
Expand All @@ -86,9 +87,12 @@ PHOENIX_PROJECT_NAME=potpie-ai

# Optional: Git provider configuration for self-hosted instances
# Supported providers: github, gitbucket, gitlab, bitbucket, local
CODE_PROVIDER=github # Options: github, gitlab, gitbucket, local
CODE_PROVIDER_BASE_URL= # e.g., http://localhost:8080/api/v3 for GitBucket, /path/to/repo for local
CODE_PROVIDER_TOKEN= # PAT for self-hosted Git server (not needed for local)
# Options: github, gitlab, gitbucket, local
CODE_PROVIDER=github
# e.g., http://localhost:8080/api/v3 for GitBucket, /path/to/repo for local
CODE_PROVIDER_BASE_URL=
# PAT for self-hosted Git server (not needed for local)
CODE_PROVIDER_TOKEN=

# For local provider:
# CODE_PROVIDER=local
Expand Down
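The PR does not say why the trailing comments in `.env.template` were moved onto their own lines. One plausible reason is that some env-file loaders do not strip inline comments, so the comment text becomes part of the value. The sketch below illustrates that failure mode with a deliberately naive parser; `parse_env_naively` is a hypothetical helper written for this note, not the loader Potpie actually uses.

```python
from typing import Dict


def parse_env_naively(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping only blank lines and full-line comments.

    Inline comments are NOT stripped, which mimics the stricter env-file loaders.
    """
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


# Old template style: comment on the same line as the assignment.
old_style = "CODE_PROVIDER=github # Options: github, gitlab, gitbucket, local"
# New template style: comment on its own line above the assignment.
new_style = "# Options: github, gitlab, gitbucket, local\nCODE_PROVIDER=github"

old_values = parse_env_naively(old_style)
new_values = parse_env_naively(new_style)

print(repr(old_values["CODE_PROVIDER"]))  # comment text leaks into the value
print(repr(new_values["CODE_PROVIDER"]))  # clean value
```

With the old layout, a loader of this kind would see a provider name that matches none of the supported options. The new layout gives the same result whether or not the loader understands inline comments.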
1 change: 1 addition & 0 deletions .gitignore
Expand Up @@ -74,3 +74,4 @@ package-lock.json
thoughts/
.codex/
worktrees/
.repos/
51 changes: 26 additions & 25 deletions README.md
Expand Up @@ -197,44 +197,45 @@ Potpie provides a set of tools that agents can use to interact with the knowledg
**`INFERENCE_MODEL`** and **`CHAT_MODEL`** correspond to the models that will be used for generating the knowledge graph and for agent reasoning, respectively. These model names should be in the `provider/model_name` format or as expected by Litellm. For more information, refer to the [Litellm documentation](https://docs.litellm.ai/docs/providers).
<br>

#### GitHub Authentication Setup
#### GitHub Authentication Setup

Potpie supports multiple authentication methods for accessing GitHub repositories:
Potpie supports multiple authentication methods for accessing GitHub repositories:

##### For GitHub.com Repositories:
##### For GitHub.com Repositories:

**Option 1: GitHub App (Recommended for Production)**
- Create a GitHub App in your organization
- Set environment variables:
```bash
GITHUB_APP_ID=your-app-id
GITHUB_PRIVATE_KEY=your-private-key
```
**Option 1: GitHub App (Recommended for Production)**
- Create a GitHub App in your organization
- Set environment variables:
```bash
GITHUB_APP_ID=your-app-id
GITHUB_PRIVATE_KEY=your-private-key
```

**Option 2: Personal Access Token (PAT) Pool**
- Create one or more GitHub PATs with `repo` scope
- Set environment variable (comma-separated for multiple tokens):
```bash
GH_TOKEN_LIST=ghp_token1,ghp_token2,ghp_token3
```
- Potpie will randomly select from the pool for load balancing
- **Rate Limit**: 5,000 requests/hour per token (authenticated)
**Option 2: Personal Access Token (PAT) Pool**
- Create one or more GitHub PATs with `repo` scope
- Set environment variable (comma-separated for multiple tokens):
```bash
GH_TOKEN_LIST=ghp_token1,ghp_token2,ghp_token3
```
- Potpie will randomly select from the pool for load balancing
- **Rate Limit**: 5,000 requests/hour per token (authenticated)

**Option 3: Unauthenticated Access (Public Repos Only)**
- No configuration needed
- Automatically used as fallback for public repositories
- **Rate Limit**: 60 requests/hour per IP (very limited)
**Option 3: Unauthenticated Access (Public Repos Only)**
- No configuration needed
- Automatically used as fallback for public repositories
- **Rate Limit**: 60 requests/hour per IP (very limited)

##### For Self-Hosted Git Servers (GitBucket, GitLab, etc.):
##### For Self-Hosted Git Servers (GitBucket, GitLab, etc.):

Set the following environment variables:
```bash
CODE_PROVIDER=github # or gitlab
# Options: github, gitlab, gitbucket
CODE_PROVIDER=github
CODE_PROVIDER_BASE_URL=http://your-git-server.com/api/v3
CODE_PROVIDER_TOKEN=your-token
```

**Important**: `GH_TOKEN_LIST` tokens are always used for GitHub.com, regardless of `CODE_PROVIDER_BASE_URL`.
**Important**: `GH_TOKEN_LIST` tokens are always used for GitHub.com, regardless of `CODE_PROVIDER_BASE_URL`.
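
The PAT pool behaviour described in Option 2 (comma-separated tokens, random selection, unauthenticated fallback when nothing is configured) could look roughly like the sketch below. The function names `parse_token_list` and `pick_token` are hypothetical and chosen for illustration; the README does not show Potpie's actual selection code.

```python
import random
from typing import List, Optional


def parse_token_list(raw: Optional[str]) -> List[str]:
    """Split a GH_TOKEN_LIST-style string into tokens, ignoring blanks."""
    if not raw:
        return []
    return [token.strip() for token in raw.split(",") if token.strip()]


def pick_token(tokens: List[str], rng: Optional[random.Random] = None) -> Optional[str]:
    """Return a randomly chosen token, or None to signal unauthenticated access."""
    if not tokens:
        return None
    # Accept an injected Random instance so the choice can be made deterministic in tests.
    return (rng or random).choice(tokens)


tokens = parse_token_list(" ghp_token1, ghp_token2 ,,ghp_token3 ")
chosen = pick_token(tokens, random.Random(0))
print(tokens)
print(chosen in tokens)
```

Random selection spreads requests across tokens on average but does not track each token's remaining quota, so a single token can still hit its 5,000 requests/hour limit before the others do.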

- Create a Virtual Environment using Python 3.10:
```
Expand Down
95 changes: 47 additions & 48 deletions app/modules/code_provider/gitbucket/gitbucket_provider.py
@@ -1,5 +1,5 @@
import logging
from typing import List, Dict, Any, Optional
from typing import Any, Dict, List, Optional, Set
import chardet
from github import Github
from github.GithubException import GithubException
Expand Down Expand Up @@ -96,13 +96,32 @@ def _ensure_authenticated(self):
if not self.client:
raise RuntimeError("Provider not authenticated. Call authenticate() first.")

def _get_repo(self, repo_name: str):
"""
Get repository object with normalized repo name conversion.

Converts normalized repo name (e.g., 'user/repo') back to GitBucket's
actual identifier format (e.g., 'root/repo') for API calls.

Args:
repo_name: Normalized repository name

Returns:
Repository object from PyGithub
"""
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
)

actual_repo_name = get_actual_repo_name_for_lookup(repo_name, "gitbucket")
return self.client.get_repo(actual_repo_name)

# ============ Repository Operations ============

def get_repository(self, repo_name: str) -> Dict[str, Any]:
"""Get repository details."""
self._ensure_authenticated()

# Convert normalized repo name back to GitBucket format for API calls
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
normalize_repo_name,
Expand All @@ -114,7 +133,7 @@ def get_repository(self, repo_name: str) -> Dict[str, Any]:
f"GitBucket: Attempting to get repository '{repo_name}' (actual: '{actual_repo_name}')"
)
try:
repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)
logger.info(
f"GitBucket: Successfully retrieved repository '{repo_name}' - ID: {repo.id}, Default branch: {repo.default_branch}"
)
Expand Down Expand Up @@ -183,14 +202,7 @@ def get_file_content(
"""Get file content."""
self._ensure_authenticated()

# Convert normalized repo name back to GitBucket format for API calls
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
)

actual_repo_name = get_actual_repo_name_for_lookup(repo_name, "gitbucket")

repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)
file_contents = repo.get_contents(file_path, ref=ref)

# Decode content
Expand Down Expand Up @@ -223,23 +235,14 @@ def get_repository_structure(
"""Get repository structure recursively."""
self._ensure_authenticated()

# Convert normalized repo name back to GitBucket format for API calls
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
)

actual_repo_name = get_actual_repo_name_for_lookup(repo_name, "gitbucket")

try:
repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)
except GithubException as e:
logger.error(
f"GitBucket: Failed to get repository '{actual_repo_name}': {e}"
)
logger.error(f"GitBucket: Failed to get repository '{repo_name}': {e}")
raise
except Exception as e:
logger.error(
f"GitBucket: Unexpected error getting repository '{actual_repo_name}': {e}"
f"GitBucket: Unexpected error getting repository '{repo_name}': {e}"
)
raise

Expand Down Expand Up @@ -449,14 +452,7 @@ def list_branches(self, repo_name: str) -> List[str]:
"""List branches."""
self._ensure_authenticated()

# Convert normalized repo name back to GitBucket format for API calls
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
)

actual_repo_name = get_actual_repo_name_for_lookup(repo_name, "gitbucket")

repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)
branches = [branch.name for branch in repo.get_branches()]

# Put default branch first
Expand All @@ -471,7 +467,6 @@ def get_branch(self, repo_name: str, branch_name: str) -> Dict[str, Any]:
"""Get branch details."""
self._ensure_authenticated()

# Convert normalized repo name back to GitBucket format for API calls
from app.modules.parsing.utils.repo_name_normalizer import (
get_actual_repo_name_for_lookup,
)
Expand All @@ -482,7 +477,7 @@ def get_branch(self, repo_name: str, branch_name: str) -> Dict[str, Any]:
f"GitBucket: Getting branch '{branch_name}' for repository '{repo_name}' (actual: '{actual_repo_name}')"
)
try:
repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)
branch = repo.get_branch(branch_name)

branch_data = {
Expand Down Expand Up @@ -521,7 +516,7 @@ def create_branch(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)

# Get base branch ref
base_ref = repo.get_git_ref(f"heads/{base_branch}")
Expand Down Expand Up @@ -575,20 +570,24 @@ def compare_branches(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)

# Get commits on the head branch
logging.info(f"[GITBUCKET] Getting commits for branch: {head_branch}")
head_commits = repo.get_commits(sha=head_branch)

max_commits = 50 # Safety limit

# Get commits on the base branch for comparison
base_commits = list(repo.get_commits(sha=base_branch))
base_commit_shas = {c.sha for c in base_commits}
base_commit_shas: Set[str] = set()
for idx, base_commit in enumerate(repo.get_commits(sha=base_branch)):
base_commit_shas.add(base_commit.sha)
if idx + 1 >= max_commits:
break

# Track files and their patches
files_dict = {}
commit_count = 0
max_commits = 50 # Safety limit

# Iterate through head branch commits until we find common ancestor
for commit in head_commits:
Expand Down Expand Up @@ -650,7 +649,7 @@ def list_pull_requests(
"""List pull requests."""
self._ensure_authenticated()

repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
pulls = repo.get_pulls(state=state)[:limit]

return [
Expand All @@ -674,7 +673,7 @@ def get_pull_request(
"""Get pull request details."""
self._ensure_authenticated()

repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
pr = repo.get_pull(pr_number)

result = {
Expand Down Expand Up @@ -719,7 +718,7 @@ def create_pull_request(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)

# Validate branches exist
try:
Expand Down Expand Up @@ -783,7 +782,7 @@ def add_pull_request_comment(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
pr = repo.get_pull(pr_number)

if path and line:
Expand Down Expand Up @@ -819,7 +818,7 @@ def create_pull_request_review(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
pr = repo.get_pull(pr_number)

commits = list(pr.get_commits())
Expand Down Expand Up @@ -856,7 +855,7 @@ def list_issues(
"""List issues."""
self._ensure_authenticated()

repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
issues = repo.get_issues(state=state)[:limit]

return [
Expand All @@ -876,7 +875,7 @@ def get_issue(self, repo_name: str, issue_number: int) -> Dict[str, Any]:
"""Get issue details."""
self._ensure_authenticated()

repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
issue = repo.get_issue(issue_number)

return {
Expand All @@ -897,7 +896,7 @@ def create_issue(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)
issue = repo.create_issue(title=title, body=body, labels=labels or [])

return {
Expand Down Expand Up @@ -928,7 +927,7 @@ def create_or_update_file(
self._ensure_authenticated()

try:
repo = self.client.get_repo(repo_name)
repo = self._get_repo(repo_name)

# Check if file exists
try:
Expand Down Expand Up @@ -1057,7 +1056,7 @@ def get_archive_link(self, repo_name: str, format_type: str, ref: str) -> str:
)

try:
repo = self.client.get_repo(actual_repo_name)
repo = self._get_repo(repo_name)

# GitBucket uses a different URL format than GitHub API
# The correct format is: http://hostname/owner/repo/archive/ref.format
Expand Down
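The `compare_branches` hunk above replaces `list(repo.get_commits(sha=base_branch))` with a loop that stops after `max_commits` entries, so a long base-branch history is no longer fetched in full. The same bounded collection can be expressed with `itertools.islice`. The sketch below uses a fake commit stream in place of PyGithub's lazily paginated result; `collect_commit_shas` and `fake_commit_stream` are hypothetical names used only for this illustration.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Iterable, Iterator, List, Set


def collect_commit_shas(commits: Iterable, max_commits: int = 50) -> Set[str]:
    """Collect at most max_commits SHAs without consuming the rest of the iterable."""
    return {commit.sha for commit in islice(commits, max_commits)}


fetched: List[int] = []  # records how many fake commits were actually produced


def fake_commit_stream(total: int) -> Iterator[SimpleNamespace]:
    """Stand-in for a lazily fetched commit listing."""
    for i in range(total):
        fetched.append(i)
        yield SimpleNamespace(sha=f"sha{i}")


shas = collect_commit_shas(fake_commit_stream(1000), max_commits=50)
print(len(shas), len(fetched))
```

One consequence of the cap is worth checking in review: if the common ancestor lies more than `max_commits` commits back on the base branch, its SHA will not be in the set, so the head-branch loop has to rely on its own safety limit to terminate.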
4 changes: 2 additions & 2 deletions app/modules/code_provider/local_repo/local_provider.py
Expand Up @@ -299,8 +299,8 @@ def _traverse_directory(
return []

for entry in sorted(entries):
# Skip hidden files
if entry.startswith("."):
# Skip .git directory only
if entry == ".git":
continue

entry_path = os.path.join(dir_path, entry)
Expand Down
4 changes: 2 additions & 2 deletions app/modules/code_provider/local_repo/local_repo_service.py
Expand Up @@ -225,8 +225,8 @@ async def _fetch_repo_structure_async(
# Filter out files with excluded extensions, hidden files/folders, and gitignore matches
filtered_contents = []
for item in contents:
# Skip hidden files and directories (starting with .)
if item["name"].startswith(".") and item["name"] != ".gitignore":
# Skip .git directory only
if item["name"] == ".git":
Comment on lines +228 to +229
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Critical security risk: sensitive dotfiles may be exposed.

The change from excluding all hidden files to only excluding .git could expose sensitive dotfiles containing secrets, credentials, and API keys. Files like .env, .env.local, .env.production, .npmrc, .pypirc, .aws/credentials, and similar configuration files are commonly present but not always gitignored.

Relying solely on gitignore (lines 239-247) is insufficient because:

  • Not all projects maintain comprehensive gitignore files
  • Developers may forget to gitignore new secret files
  • Some local configuration files are intentionally not gitignored for development purposes

Apply this diff to add explicit exclusions for common secret-containing files:

-                # Skip .git directory only
-                if item["name"] == ".git":
+                # Skip .git and common secret-containing dotfiles/directories
+                sensitive_patterns = {
+                    ".git", ".env", ".env.local", ".env.production", ".env.development",
+                    ".npmrc", ".pypirc", ".aws", ".ssh", ".gnupg", ".docker",
+                    ".kube", ".netrc", ".credentials"
+                }
+                if item["name"] in sensitive_patterns or (
+                    item["name"].startswith(".env.") or item["name"].startswith(".secret")
+                ):
                     continue

Run the following script to verify if there are other security safeguards in the codebase that prevent exposing sensitive files:


🏁 Script executed:

#!/bin/bash
# Description: Check for security measures around file access and common secret file patterns

echo "=== Searching for access control or security checks in code provider modules ==="
rg -n --type=py -C3 'class.*Provider|def.*get.*structure|def.*get.*content' app/modules/code_provider/

echo -e "\n=== Checking if sensitive file patterns are handled elsewhere ==="
rg -n --type=py '\.env|\.aws|\.ssh|secret|credential|api[_-]?key' app/modules/code_provider/

echo -e "\n=== Looking for authentication/authorization checks ==="
rg -n --type=py -C2 'auth|permission|access.*control' app/modules/code_provider/

Length of output: 115142



---

🏁 Script executed:

#!/bin/bash
# Verify the actual code at lines 228-229 and surrounding context
echo "=== Fetching local_repo_service.py lines 220-250 ==="
sed -n '220,250p' app/modules/code_provider/local_repo/local_repo_service.py

echo -e "\n=== Checking gitignore filtering logic (lines 235-250) ==="
sed -n '235,260p' app/modules/code_provider/local_repo/local_repo_service.py

echo -e "\n=== Checking if get_file_content has access controls ==="
rg -n --type=py -A 10 'def get_file_content' app/modules/code_provider/local_repo/local_repo_service.py | head -20

echo -e "\n=== Searching for path validation or security checks in local_repo ==="
rg -n --type=py -C 3 'path.*validat|security|allowed|forbidden|restrict' app/modules/code_provider/local_repo/

Length of output: 623


🏁 Script executed:

#!/bin/bash
# Verify the actual code at lines 228-229 and surrounding context in local_repo_service.py
echo "=== Fetching local_repo_service.py lines 220-250 ==="
sed -n '220,250p' app/modules/code_provider/local_repo/local_repo_service.py

echo -e "\n=== Checking gitignore filtering logic (lines 235-270) ==="
sed -n '235,270p' app/modules/code_provider/local_repo/local_repo_service.py

echo -e "\n=== Checking if get_file_content has path validation or security checks ==="
rg -n --type=py -A 15 'def get_file_content' app/modules/code_provider/local_repo/local_repo_service.py

echo -e "\n=== Searching for any path validation or allowed/forbidden checks ==="
rg -n --type=py 'path.*validat|allowed|forbidden|restrict' app/modules/code_provider/local_repo/

Length of output: 3452


Critical security risk: sensitive dotfiles may be exposed through repository structure API.

The change at lines 228-229 now only excludes .git, but previously excluded all hidden files. This exposes sensitive dotfiles like .env, .env.local, .npmrc, .pypirc, .aws/credentials, .ssh, and .gnupg to the structure listing.

Relying solely on gitignore filtering (lines 239-247) is insufficient because:

  • Not all projects maintain comprehensive gitignore files
  • Developers may forget to gitignore new secret-containing files
  • Some local configuration files are intentionally not gitignored for development purposes
  • No downstream path validation exists in get_file_content() to prevent access

Since this service remains in active use (despite being deprecated for backward compatibility), exposed sensitive file paths could be leveraged to extract secrets from local repositories.

Add explicit exclusions for common secret-containing files:

-                # Skip .git directory only
-                if item["name"] == ".git":
-                    continue
+                # Skip .git and common secret-containing dotfiles/directories
+                sensitive_patterns = {
+                    ".git", ".env", ".env.local", ".env.production", ".env.development",
+                    ".npmrc", ".pypirc", ".aws", ".ssh", ".gnupg", ".docker",
+                    ".kube", ".netrc", ".credentials"
+                }
+                if item["name"] in sensitive_patterns or (
+                    item["name"].startswith(".env.") or item["name"].startswith(".secret")
+                ):
+                    continue
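
The exclusion rule proposed in this comment can also be written as a standalone predicate, which makes it easier to unit test and to share between `local_provider.py` and `local_repo_service.py`. This is a minimal sketch under that assumption; `is_sensitive_entry` and `filter_entries` are hypothetical names, and the name set is copied from the suggestion above rather than from the merged code.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Exact names to hide, taken from the review suggestion.
SENSITIVE_NAMES: FrozenSet[str] = frozenset({
    ".git", ".env", ".env.local", ".env.production", ".env.development",
    ".npmrc", ".pypirc", ".aws", ".ssh", ".gnupg", ".docker",
    ".kube", ".netrc", ".credentials",
})

# Prefixes to hide, e.g. ".env.staging" or ".secrets.yaml".
SENSITIVE_PREFIXES: Tuple[str, ...] = (".env.", ".secret")


def is_sensitive_entry(name: str) -> bool:
    """Return True if a file or directory name should be hidden from listings."""
    return name in SENSITIVE_NAMES or name.startswith(SENSITIVE_PREFIXES)


def filter_entries(names: Iterable[str]) -> List[str]:
    """Keep only entries that are not considered sensitive, preserving order."""
    return [name for name in names if not is_sensitive_entry(name)]


visible = filter_entries([".git", ".gitignore", ".env", "README.md", ".github", ".env.staging"])
print(visible)
```

Like the suggestion, this matches on the entry name only, so it hides a `.aws` directory wherever it appears but does nothing about secrets stored under ordinary file names.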
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-                # Skip .git directory only
-                if item["name"] == ".git":
+                # Skip .git and common secret-containing dotfiles/directories
+                sensitive_patterns = {
+                    ".git", ".env", ".env.local", ".env.production", ".env.development",
+                    ".npmrc", ".pypirc", ".aws", ".ssh", ".gnupg", ".docker",
+                    ".kube", ".netrc", ".credentials"
+                }
+                if item["name"] in sensitive_patterns or (
+                    item["name"].startswith(".env.") or item["name"].startswith(".secret")
+                ):
+                    continue

continue

# Skip files with excluded extensions
Expand Down