Commit 515ba26

Merge pull request #99 from mongodb/development
v1.46.0
2 parents bfee344 + 96f1784 commit 515ba26

18 files changed, with 421 additions and 143 deletions

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -2,6 +2,11 @@
22

33
All notable changes to this project will be documented in this file.
44

5+
## [1.46.0]
6+
- Improved rules: AWS, pem
7+
- Added rules for Ollama, Weights and Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, together.ai, Zhipu
8+
- Added `self-update` command to update the binary without running a scan. It also supports updating a Homebrew-managed binary
9+
510
## [1.45.0]
611
- Added `--repo-artifacts` flag to scan repository issues, gists/snippets, and wikis when cloning via `--git-url`
712
- Added rules for sendbird, mattermost, langchain, notion

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ publish = false
1010

1111
[package]
1212
name = "kingfisher"
13-
version = "1.45.0"
13+
version = "1.46.0"
1414
description = "MongoDB's blazingly fast secret scanning and validation tool"
1515
edition.workspace = true
1616
rust-version.workspace = true

README.md

Lines changed: 28 additions & 21 deletions
@@ -8,21 +8,12 @@
88
Kingfisher is a blazingly fast secret‑scanning and live validation tool built in Rust. It combines Intel’s hardware‑accelerated Hyperscan regex engine with language‑aware parsing via Tree‑Sitter, and **ships with hundreds of built‑in rules** to detect, validate, and triage secrets before they ever reach production
99
</p>
1010

11-
Kingfisher originated as a fork of Praetorian's Nosey Parker, and is built atop their incredible work and the work contributed by the Nosey Parker community.
12-
13-
## What Kingfisher Adds
14-
- **Live validation** via cloud-provider APIs
15-
- **Extra targets**: GitLab repos, S3 buckets, Docker images, Jira issues, Confluence pages, and Slack messages
16-
- **Compressed Files**: Supports extracting and scanning compressed files for secrets
17-
- **Baseline mode**: ignore known secrets, flag only new ones
18-
- **Allowlist support**: suppress false positives with custom regexes or words
19-
- **Language-aware detection** (source-code parsing) for ~20 languages
20-
- **Native Windows** binary
21-
11+
Originally forked from Praetorian’s Nosey Parker, Kingfisher adds live cloud-API validation; many more targets (GitLab, S3, Docker, Jira, Confluence, Slack); compressed-file extraction and scanning; baseline and allowlist controls; language-aware detection (~20 languages); and a native Windows binary. See [Origins and Divergence](#origins-and-divergence) for details.
2212

2313
## Key Features
2414
- **Performance**: multithreaded, Hyperscan‑powered scanning built for huge codebases
2515
- **Extensible rules**: hundreds of built-in detectors plus YAML-defined custom rules ([docs/RULES.md](/docs/RULES.md))
16+
- **Broad AI SaaS coverage**: finds and validates tokens for OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, Together.ai, Zhipu, and many more
2617
- **Multiple targets**:
2718
- **Git history**: local repos or GitHub/GitLab orgs/users
2819
- **Repository artifacts**: with `--repo-artifacts`, scan GitHub/GitLab repository artifacts such as issues, pull/merge requests, wikis, snippets, and owner gists in addition to code
@@ -154,18 +145,18 @@ docker run --rm \
154145

155146
# 🔐 Detection Rules at a Glance
156147

157-
Kingfisher ships with hundreds of rules that cover everything from classic cloud keys to the latest LLM-API secrets. Below is an overview:
148+
Kingfisher ships with [hundreds of rules](/data/rules/) that cover everything from classic cloud keys to the latest AI SaaS tokens. Below is an overview:
158149

159150
| Category | What we catch |
160151
|----------|---------------|
161-
| **AI / LLM APIs** | OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), and more
162-
| **Cloud Providers** | AWS, Azure, GCP, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, and more
163-
| **Dev & CI/CD** | GitHub/GitLab tokens, CircleCI, TravisCI, TeamCity, Docker Hub, npm, PyPI, and more
164-
| **Messaging & Comms** | Slack, Discord, Microsoft Teams, Twilio, Mailgun, SendGrid, Mailchimp, and more
165-
| **Databases & Data Ops** | MongoDB Atlas, PlanetScale, Postgres DSNs, Grafana Cloud, Datadog, Dynatrace, and more
166-
| **Payments & Billing** | Stripe, PayPal, Square, GoCardless, and more
167-
| **Security & DevSecOps** | Snyk, Dependency-Track, CodeClimate, Codacy, OpsGenie, PagerDuty, and more
168-
| **Misc. SaaS & Tools** | 1Password, Adobe, Atlassian/Jira, Asana, Netlify, Baremetrics, and more
152+
| **AI SaaS APIs** | OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Stability AI, Replicate, xAI (Grok), Ollama, Langchain, Perplexity, Weights & Biases, Cerebras, Friendli, Fireworks.ai, NVIDIA NIM, together.ai, Zhipu, and more |
153+
| **Cloud Providers** | AWS, Azure, GCP, Alibaba Cloud, DigitalOcean, IBM Cloud, Cloudflare, and more |
154+
| **Dev & CI/CD** | GitHub/GitLab tokens, CircleCI, TravisCI, TeamCity, Docker Hub, npm, PyPI, and more |
155+
| **Messaging & Comms** | Slack, Discord, Microsoft Teams, Twilio, Mailgun, SendGrid, Mailchimp, and more |
156+
| **Databases & Data Ops** | MongoDB Atlas, PlanetScale, Postgres DSNs, Grafana Cloud, Datadog, Dynatrace, and more |
157+
| **Payments & Billing** | Stripe, PayPal, Square, GoCardless, and more |
158+
| **Security & DevSecOps** | Snyk, Dependency-Track, CodeClimate, Codacy, OpsGenie, PagerDuty, and more |
159+
| **Misc. SaaS & Tools** | 1Password, Adobe, Atlassian/Jira, Asana, Netlify, Baremetrics, and more |
169160

170161
## Write Custom Rules!
171162

@@ -543,9 +534,11 @@ Kingfisher automatically queries GitHub for a newer release when it starts and t
543534

544535
- **Hands-free updates** – Add `--self-update` to any Kingfisher command
545536

546-
* If a newer version exists, Kingfisher will download it, replace the running binary, and re-launch itself with the **exact same arguments**.
537+
* If a newer version exists, Kingfisher will download it, replace the running binary, and re-launch itself with the **exact same arguments**.
547538
* If the update fails or no newer release is found, the current run proceeds as normal
548539

540+
- **Manual update** – Run `kingfisher self-update` to update the binary without scanning
541+
549542
- **Disable version checks** – Pass `--no-update-check` to skip both the startup and shutdown checks entirely
550543

551544
# Advanced Options
@@ -661,6 +654,20 @@ Use `--rule-stats` to collect timing information for every rule. After scanning,
661654
kingfisher scan --help
662655
```
663656

657+
658+
## Origins and Divergence
659+
660+
Kingfisher began as a fork of Praetorian’s Nosey Parker, an experiment in adding live validation support and embedding that validation directly inside each rule.
661+
662+
Since that initial fork, it has diverged heavily from Nosey Parker:
663+
- Replaced the SQLite datastore with an in-memory store + Bloom filter
664+
- Collapsed the workflow into a single scan-and-report phase with direct JSON/BSON/SARIF outputs
665+
- Added Tree-Sitter parsing on top of Hyperscan for deeper language-aware detection
666+
- Removed datastore-driven reporting/annotations in favor of live validation, baselines, allowlists, and compressed-file extraction
667+
- Expanded support for new targets (GitLab, Jira, Confluence, Slack, S3, Docker, etc.)
668+
- Delivered cross-platform builds, including native Windows
669+
670+
664671
# Roadmap
665672

666673
- More rules

data/rules/aws.yml

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@ rules:
55
(?xi)
66
\b
77
(
8-
(?:AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
8+
(?:A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
99
[2-7A-Z]{16}
1010
)
1111
\b
@@ -21,15 +21,15 @@ rules:
2121
(?xi)
2222
(?:
2323
\b
24-
(?:AWS|AMAZON|AMZN|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
24+
(?:AWS|AMAZON|AMZN|A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
2525
(?:.|[\n\r]){0,32}?
2626
\b
2727
(
2828
[A-Z0-9/+=]{40}
2929
)
3030
\b
3131
|
32-
\b(?:AWS|AMAZON|AMZN|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
32+
\b(?:AWS|AMAZON|AMZN|A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)
3333
(?:.|[\n\r]){0,96}?
3434
(?:SECRET|PRIVATE|ACCESS)
3535
(?:.|[\n\r]){0,16}?
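
The effect of the new `A3T[A-Z0-9]` alternative on the access key ID pattern can be checked with a short sketch. It uses Python's `re` module as a stand-in for Kingfisher's Hyperscan-based engine, so it only approximates matching behavior. The `A3TX...` sample ID is made up for illustration, `AKIAIOSFODNN7EXAMPLE` is the placeholder key used in AWS documentation, and `build_pattern` is a hypothetical helper.

```python
import re

# Equivalent to the inline (?xi) flags in the rule; passed as arguments so the
# pattern string may start with whitespace.
FLAGS = re.VERBOSE | re.IGNORECASE

PREFIXES_OLD = "AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA"
PREFIXES_NEW = "A3T[A-Z0-9]|" + PREFIXES_OLD


def build_pattern(prefixes: str) -> re.Pattern:
    """Build the access key ID pattern for a given set of prefixes."""
    return re.compile(
        rf"""
        \b
        (
          (?:{prefixes})
          [2-7A-Z]{{16}}
        )
        \b
        """,
        FLAGS,
    )


old_pattern = build_pattern(PREFIXES_OLD)
new_pattern = build_pattern(PREFIXES_NEW)

# Print whether each sample matches before and after the change.
for sample in ("AKIAIOSFODNN7EXAMPLE", "A3TXIOSFODNN7EXAMPLE"):
    print(sample, bool(old_pattern.search(sample)), bool(new_pattern.search(sample)))
```

Only the second sample changes behavior: the `A3T`-prefixed ID is missed by the old alternation and captured by the new one.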

data/rules/cerebras.yml

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
rules:
2+
- name: Cerebras AI API Key
3+
id: kingfisher.cerebras.1
4+
pattern: |
5+
(?xi)
6+
\b
7+
(
8+
csk-[a-z0-9]{48}
9+
)
10+
\b
11+
confidence: medium
12+
min_entropy: 3.0
13+
validation:
14+
type: Http
15+
content:
16+
request:
17+
method: GET
18+
url: "https://api.cerebras.ai/v1/models"
19+
headers:
20+
Authorization: "Bearer {{ TOKEN }}"
21+
response_matcher:
22+
- report_response: true
23+
- type: StatusMatch
24+
status: [200]
25+
- type: WordMatch
26+
words:
27+
- '"object"'
28+
- '"data"'
29+
match_all_words: true
30+
references:
31+
- https://docs.cerebras.net/
32+
examples:
33+
- "csk-6nptf4w5cx36fw58t3hkx48jvm52wm693pex5tjm29kn55yt"
34+
- "csk-e2knhj8h3h4erp6crfx6rh52tvecj4xnwmtjf3mtrvtt54et"
35+
- "csk-rhw8npjrp6kpv9phm55n5nv5rkkm4492jepx3yh65dc9cwe9"
36+
- "csk-w6p3nxk3dc5249mrpmv642fffert28rwdkepffrpn8rtfr9h"
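
The `min_entropy: 3.0` threshold is easier to reason about with a worked number. The sketch below computes Shannon entropy in bits per character over one of the rule's example keys. Whether Kingfisher measures entropy exactly this way, and over the captured group rather than the full match, is an assumption here; `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


MIN_ENTROPY = 3.0  # value taken from the rule above

example = "csk-6nptf4w5cx36fw58t3hkx48jvm52wm693pex5tjm29kn55yt"
low_variety = "csk-" + "a" * 48  # right shape for the regex, almost no randomness

for candidate in (example, low_variety):
    h = shannon_entropy(candidate)
    print(f"{h:.2f}", "keep" if h >= MIN_ENTROPY else "drop")
```

Under these assumptions the example key clears the threshold, while a string that merely has the right shape does not, which is the kind of false positive an entropy floor is meant to filter out.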

data/rules/fireworksai.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
rules:
2+
- name: Fireworks.ai API Key
3+
id: kingfisher.fireworks.1
4+
pattern: |
5+
(?xi)
6+
\b
7+
(
8+
fw_[A-Z0-9]{24}
9+
)
10+
\b
11+
confidence: medium
12+
min_entropy: 3.5
13+
validation:
14+
type: Http
15+
content:
16+
request:
17+
method: GET
18+
url: "https://api.fireworks.ai/inference/v1/models"
19+
headers:
20+
Authorization: "Bearer {{ TOKEN }}"
21+
response_matcher:
22+
- report_response: true
23+
- type: StatusMatch
24+
status: [200]
25+
- type: WordMatch
26+
words:
27+
- '"owned_by"'
28+
- '"data"'
29+
match_all_words: true
30+
references:
31+
- https://readme.fireworks.ai/reference/getting-started-with-the-api
32+
examples:
33+
- "fw_3ZL5ji26Tp7baYrW5S2pA5xi"
34+
- "fw_3ZaW5fSpx5GTnHpRGb8CPu2V"
35+
- "fw_3ZSU8ymvmZ38YPv8uwbZHAyW"
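
The `response_matcher` block reads as a conjunction of checks: the status must be in the allowed list, and the body must contain the listed words. A minimal sketch of that logic follows. It assumes that `match_all_words: true` means every word must appear and that omitting it means any one word suffices. Both semantics, and the function name `response_matches`, are assumptions rather than Kingfisher's actual implementation. No HTTP request is made; the response is passed in as plain values.

```python
from typing import Iterable


def response_matches(
    status: int,
    body: str,
    allowed_statuses: Iterable[int],
    words: Iterable[str],
    match_all_words: bool = False,
) -> bool:
    """Return True when a response satisfies both StatusMatch and WordMatch."""
    if status not in set(allowed_statuses):
        return False
    hits = [word in body for word in words]
    return all(hits) if match_all_words else any(hits)


WORDS = ['"owned_by"', '"data"']  # words from the rule above

# Response bodies below are invented for illustration.
valid_body = '{"object": "list", "data": [{"id": "m1", "owned_by": "fireworks"}]}'
print(response_matches(200, valid_body, [200], WORDS, match_all_words=True))
print(response_matches(401, '{"error": "unauthorized"}', [200], WORDS, match_all_words=True))
```

Requiring both words makes a generic `200` page that happens to contain `"data"` less likely to be reported as a live credential.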

data/rules/friendli.yml

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
rules:
2+
- name: Friendli.ai API Key
3+
id: kingfisher.friendli.1
4+
pattern: |
5+
(?xi)
6+
\b
7+
(
8+
flp_[A-Z0-9]{46}
9+
)
10+
\b
11+
confidence: medium
12+
min_entropy: 3.0
13+
validation:
14+
type: Http
15+
content:
16+
request:
17+
method: GET
18+
url: "https://api.friendli.ai/dedicated/beta/endpoint"
19+
headers:
20+
Authorization: "Bearer {{ TOKEN }}"
21+
Content-Type: "application/json"
22+
response_matcher:
23+
- report_response: true
24+
- type: StatusMatch
25+
status: [200]
26+
- type: WordMatch
27+
words:
28+
- '"data"'
29+
- '"status"'
30+
references:
31+
- https://docs.friendli.ai/reference/authentication
32+
examples:
33+
- "flp_eb8CAc1OHdVISFraFZXFYQeH1CYtqM2VdYFvV1duniWw32"
34+
- "flp_fYvncz2Ahh4YEfSKbNoT09DWlwPq5I7svZG2l1bdbpOg1c"
35+
- "flp_kGcjWhZQ4zYQnY7b3O6nukAhflKZJeS7pNDhs79IRrfodc"

data/rules/mailgun.yml

Lines changed: 2 additions & 2 deletions
@@ -2,8 +2,8 @@ rules:
22
- name: MailGun Token
33
id: kingfisher.mailgun.1
44
pattern: |
5-
(?xi)
6-
\b
5+
(?xi)
6+
\b
77
mailgun
88
(?:.|[\n\r]){0,32}?
99
(?:SECRET|PRIVATE|ACCESS|KEY|TOKEN)

data/rules/nvidia.yml

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
rules:
2+
- name: NVIDIA NIM API Key
3+
id: kingfisher.nvidia.nim.1
4+
pattern: |
5+
(?xi)
6+
\b
7+
(
8+
nvapi-[A-Z0-9_-]{60,70}
9+
)
10+
\b
11+
confidence: medium
12+
min_entropy: 3.5
13+
examples:
14+
- "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"
15+
- "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"
16+
- "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"
17+
validation:
18+
type: Http
19+
content:
20+
request:
21+
method: GET
22+
url: "https://api.nvcf.nvidia.com/v2/nvcf/functions"
23+
headers:
24+
Authorization: "Bearer {{ TOKEN }}"
25+
response_matcher:
26+
- report_response: true
27+
- type: StatusMatch
28+
status: [200]
29+
- type: WordMatch
30+
words: ["id", "versionId"]

data/rules/ollama.yml

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
1+
rules:
2+
- name: Ollama API Key
3+
id: kingfisher.ollama.1
4+
pattern: |
5+
(?xi)
6+
\b
7+
ollama
8+
(?:.|[\n\r]){0,32}?
9+
\b
10+
(
11+
[a-f0-9]{32}\.[a-zA-Z0-9_-]{24}
12+
)
13+
confidence: medium
14+
min_entropy: 3.5
15+
validation:
16+
type: Http
17+
content:
18+
request:
19+
method: POST
20+
url: https://ollama.com/api/generate
21+
headers:
22+
Content-Type: application/json
23+
# Turbo keys are sent as the raw value in Authorization (no "Bearer " prefix)
24+
# per working client behavior.
25+
Authorization: "{{ TOKEN }}"
26+
body: |
27+
{
28+
"model": "gpt-oss:20b",
29+
"prompt": "ping",
30+
"stream": false
31+
}
32+
response_matcher:
33+
- report_response: true
34+
- type: StatusMatch
35+
status: [200]
36+
- type: WordMatch
37+
words:
38+
- '"response":'
39+
- '"done":true'
40+
references:
41+
- https://ollama.com/blog/turbo
42+
examples:
43+
- "ollama key = 8bcdd9b4e28e4e1b8bf14a2eb8701220.QH5p5TU2BDwzHu5_RCtvJXsj"
44+
- "ollama key = e56714bd7c1146e4b4801244bc2bc67a.3GAswjZGZ5YY6Qdgt0xg56vM"
45+
- "ollama key = 872658d00c284033a707abf1725d4b6c.-4JpTp0dQHmf0nb89xI-wgP-"
46+
- "ollama key = 0c4e6bf1222c4ffc87025a7a9ffd5cac.z-fgt1JO9-LadzA2cL23qLH3"
47+
- "ollama key = dae874a007d442cdb807910c4c57c6f5.B_aHUSdeAe42UR-X41StUFJq"
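
Unlike the prefix-based rules, this pattern only fires when the word `ollama` appears within 32 characters before the token, and the capture group isolates the token itself. The sketch below reproduces that behavior with Python's `re` module, which approximates the Hyperscan-based matching and is not Kingfisher's engine. It uses one of the examples above; `extract_token` is a hypothetical helper.

```python
import re
from typing import Optional

# Flags are passed as arguments instead of the inline (?xi) used in the rule.
OLLAMA_PATTERN = re.compile(
    r"""
    \b
    ollama
    (?:.|[\n\r]){0,32}?
    \b
    (
      [a-f0-9]{32}\.[a-zA-Z0-9_-]{24}
    )
    """,
    re.VERBOSE | re.IGNORECASE,
)


def extract_token(text: str) -> Optional[str]:
    """Return the captured token, or None when the keyword context is missing."""
    match = OLLAMA_PATTERN.search(text)
    return match.group(1) if match else None


token = "8bcdd9b4e28e4e1b8bf14a2eb8701220.QH5p5TU2BDwzHu5_RCtvJXsj"
print(extract_token(f"ollama key = {token}"))  # keyword present: token is captured
print(extract_token(f"key = {token}"))         # no keyword nearby: no match
```

Anchoring on the keyword trades some recall for precision, since the token format has no distinctive prefix of its own.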
