OSINT Tracker

Confidence-aware OSINT CLI for collecting, correlating, and exporting open-source digital footprint signals.

🚀 Overview

OSINT Tracker is a Python command-line tool that helps you collect and organize publicly available intelligence indicators from multiple sources.

It was built as an educational, practical project to show how OSINT workflows can be made safer and more consistent through:

clear status labels (FOUND, NOT_FOUND, BLOCKED, UNKNOWN)
explicit confidence levels (HIGH, MEDIUM, LOW)
structured reporting and correlation insights
runtime guardrails for request volume and input safety
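
As a rough illustration of the status and confidence labels listed above, the sketch below models them as enums attached to a small result record. The class and field names (Status, Confidence, PlatformResult) are assumptions made for this example and are not taken from the project source.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    """Status labels named in this README."""
    FOUND = "FOUND"
    NOT_FOUND = "NOT_FOUND"
    BLOCKED = "BLOCKED"
    UNKNOWN = "UNKNOWN"


class Confidence(str, Enum):
    """Confidence levels named in this README."""
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass(frozen=True)
class PlatformResult:
    """Hypothetical result record; the real modules may return plain dicts."""
    platform: str
    status: Status
    confidence: Confidence

    def summary(self) -> str:
        # Render the pair the way the terminal output pairs status and confidence.
        return f"{self.platform} -> {self.status.value} | confidence={self.confidence.value}"


result = PlatformResult("GitHub", Status.FOUND, Confidence.HIGH)
print(result.summary())
```

Carrying both values on every result lets downstream code treat a BLOCKED or UNKNOWN answer differently from a confident NOT_FOUND instead of collapsing everything into true/false.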

Real-world use cases include:

beginner-friendly OSINT learning and cybersecurity training
quick digital footprint checks during triage
generating repeatable JSON reports for small investigations

🔥 Features

Only implemented features are listed below.

Username search across GitHub, Reddit, Twitter, and Instagram. Uses a conservative HEAD -> GET strategy with platform-specific content markers and canonical URL path checks.
Confidence-aware account status model. Each username/social result returns status + confidence instead of only true/false.
Offline email analysis. Validates and normalizes email input, identifies provider, detects local-part pattern, classifies personal vs generic usage, and labels local-part length.
Live IP lookup via ipwho.is. Returns country/city/ISP/coordinates when available, with retries and structured failure reasons (for example rate_limited, request_failed).
Social scan built from username results. Reuses username signals and converts them to social availability states (available, not available, blocked, unknown).
Local metadata extraction. Reads file name, path, size, MIME type, created/modified timestamps, and SHA-256 hash.
Cross-source correlation engine. Produces weighted, human-readable insights and an Overall confidence label, while treating social data as DERIVED (supporting, not independent proof).
JSON report export. Saves a stable schema to both output/report.json and output/reports/result.json.
Runtime safety controls. Includes input length limits, max network operations per run, shared HTTP request budget, and optional pacing delays.
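
The HEAD -> GET strategy described for username search could look roughly like the sketch below. It is not the project's implementation: the decision rules, the check_profile name, and the marker lists are assumptions, and the canonical URL path check is left out for brevity. The HTTP calls are passed in as plain callables so the logic can be exercised without touching the network.

```python
from typing import Callable, Sequence, Tuple

HeadFn = Callable[[str], int]             # returns an HTTP status code
GetFn = Callable[[str], Tuple[int, str]]  # returns (status code, body text)


def check_profile(
    url: str,
    head: HeadFn,
    get: GetFn,
    found_markers: Sequence[str],
    blocked_markers: Sequence[str],
) -> Tuple[str, str, str]:
    """Return (status, confidence, method) for one profile URL."""
    # Cheap probe first: a 404 on HEAD is treated as a confident miss.
    if head(url) == 404:
        return "NOT_FOUND", "HIGH", "HEAD"

    # Anything else needs the page body, so fall through to GET.
    code, body = get(url)
    text = body.lower()
    if code == 404:
        return "NOT_FOUND", "HIGH", "HEAD->GET"
    if code in (403, 429) or any(m.lower() in text for m in blocked_markers):
        return "BLOCKED", "MEDIUM", "HEAD->GET"
    if code == 200 and any(m.lower() in text for m in found_markers):
        return "FOUND", "HIGH", "HEAD->GET"
    # A 200 without a recognizable marker is not proof of anything.
    return "UNKNOWN", "LOW", "HEAD->GET"


# Stub fetchers stand in for real HTTP calls in this example.
outcome = check_profile(
    "https://github.com/octocat",
    head=lambda url: 200,
    get=lambda url: (200, "<title>octocat - Overview</title>"),
    found_markers=["octocat"],
    blocked_markers=["captcha"],
)
print(outcome)
```

In a real run the two callables would wrap requests.head and requests.get with timeouts; keeping them injectable also makes the classifier easy to unit test with mocked responses.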

🛠 Tech Stack

Language: Python
Runtime library: requests
Testing: pytest
Dev tooling: black, flake8
Python standard library used heavily: argparse, concurrent.futures, ipaddress, pathlib, hashlib, json, datetime, threading, mimetypes

📂 Project Structure

osint-tracker/
├── main.py
├── requirements.txt
├── requirements-dev.txt
├── project.md
├── src/
│   ├── __init__.py
│   ├── core/
│   ├── modules/
│   │   ├── __init__.py
│   │   ├── username_search.py
│   │   ├── email_lookup.py
│   │   ├── ip_lookup.py
│   │   ├── social_scan.py
│   │   └── metadata_extractor.py
│   └── utils/
│       ├── __init__.py
│       ├── request_context.py
│       ├── correlation.py
│       └── formatter.py
├── tests/
│   ├── test_cli.py
│   ├── test_username_search.py
│   ├── test_email_lookup.py
│   ├── test_ip_lookup.py
│   ├── test_correlation.py
│   └── test_report_schema.py
└── output/
    ├── report.json
    └── reports/
        └── result.json

Folder guide for beginners:

main.py -> CLI entry point. Parses flags, runs modules, handles errors, prints results, and exports reports.
src/modules/ -> Core feature modules (username, email, IP, social, metadata).
src/utils/request_context.py -> Shared request budget and delay logic used by network modules.
src/utils/correlation.py -> Combines multi-source outputs into readable investigative insights.
src/utils/formatter.py -> Formats terminal output and builds/exports JSON report payloads.
src/core/ -> Present in the project structure but currently empty.
tests/ -> Automated tests for CLI behavior, module logic, correlation rules, and report schema consistency.
output/ -> Generated report artifacts.
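
To give a feel for what the shared request budget in src/utils/request_context.py might involve, here is a minimal thread-safe sketch. The RequestBudget class, its method names, and its behavior when the budget runs out are assumptions for illustration only; the actual module may be organized differently.

```python
import threading
import time


class RequestBudget:
    """Hypothetical per-run HTTP budget shared by all network modules."""

    def __init__(self, max_requests: int = 30, delay_seconds: float = 0.0) -> None:
        self._max = max_requests
        self._delay = delay_seconds
        self._used = 0
        self._lock = threading.Lock()

    def acquire(self) -> bool:
        """Reserve one request; return False once the budget is spent."""
        with self._lock:
            if self._used >= self._max:
                return False
            self._used += 1
        # Optional pacing happens outside the lock so other threads are not stalled.
        if self._delay > 0:
            time.sleep(self._delay)
        return True

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._max - self._used


budget = RequestBudget(max_requests=2)
print(budget.acquire(), budget.acquire(), budget.acquire(), budget.remaining)
```

A caller would check acquire() before each HTTP request and report a structured failure instead of sending the request when it returns False.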

⚙️ Setup & Installation

git clone https://github.com/urvalkheni/osint-tracker.git
cd osint-tracker
python -m pip install -r requirements.txt
python main.py

What each step does:

git clone ... downloads the project.
cd osint-tracker moves into the project folder.
python -m pip install -r requirements.txt installs the runtime dependency (requests).
python main.py runs the CLI (with no flags, it shows help).

Optional (development tools):

python -m pip install -r requirements-dev.txt

Run tests:

pytest -q

🧪 Usage Examples

Check where a username appears:

python main.py --username octocat

Analyze an email offline (no network call):

python main.py --email demo.user@gmail.com
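
For orientation, offline email analysis of this kind can be approximated with the standard library alone. The sketch below is not the project's code: the provider table, the generic-address list, the pattern names, and the length thresholds are all assumptions. Only the email, provider, type, and confidence keys mirror the exported JSON shown later in this README.

```python
import re

# Hypothetical lookup tables; the real module may use different entries.
PROVIDERS = {
    "gmail.com": "Google",
    "outlook.com": "Microsoft",
    "yahoo.com": "Yahoo",
}
GENERIC_LOCAL_PARTS = {"admin", "contact", "info", "sales", "support"}


def analyze_email(raw: str) -> dict:
    """Validate, normalize, and describe an email address without any network call."""
    # Lowercasing the whole address is a simplification made for this sketch.
    email = raw.strip().lower()
    local, sep, domain = email.partition("@")
    if not sep or not local or "@" in domain or "." not in domain:
        raise ValueError(f"not a valid email address: {raw!r}")

    # Local-part pattern detection (pattern names are illustrative).
    if re.fullmatch(r"[a-z]+\.[a-z]+", local):
        pattern = "first.last"
    elif re.fullmatch(r"[a-z]+\d+", local):
        pattern = "name+digits"
    else:
        pattern = "other"

    # Length label with assumed thresholds.
    if len(local) <= 5:
        length_label = "short"
    elif len(local) <= 12:
        length_label = "medium"
    else:
        length_label = "long"

    return {
        "email": email,
        "provider": PROVIDERS.get(domain, "Unknown"),
        "type": "generic" if local in GENERIC_LOCAL_PARTS else "personal",
        "pattern": pattern,
        "local_part_length": length_label,
        "confidence": "LOW",  # heuristics only, so never claim more than LOW
    }


report = analyze_email("  Demo.User@Gmail.com ")
print(report)
```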

Look up IP geolocation and network details:

python main.py --ip 8.8.8.8
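
Before any live lookup, the input can be checked locally with the standard ipaddress module, which the Tech Stack section lists. The sketch below is one possible pre-check, not the project's actual validation: the validate_ip name and the invalid_ip and non_public_ip reason strings are hypothetical (the README itself only names rate_limited and request_failed).

```python
import ipaddress
from typing import Optional, Tuple


def validate_ip(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (normalized_ip, None) on success or (None, reason) on failure."""
    try:
        address = ipaddress.ip_address(raw.strip())
    except ValueError:
        return None, "invalid_ip"
    # Private, loopback, and reserved ranges have no public geolocation data.
    if not address.is_global:
        return None, "non_public_ip"
    return str(address), None


print(validate_ip("8.8.8.8"))
print(validate_ip("192.168.1.10"))
print(validate_ip("not-an-ip"))
```

Rejecting such inputs up front avoids spending part of the request budget on lookups that cannot succeed.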

Run social scan from username-based signals:

python main.py --social octocat

Extract local file metadata:

python main.py --metadata project.md
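
Local metadata extraction of this sort needs only pathlib, hashlib, and mimetypes. The following is a minimal sketch under assumed names: extract_metadata and every dictionary key except sha256 are illustrative. The created timestamp is omitted because st_ctime means creation time on Windows but metadata-change time on most Unix systems.

```python
import hashlib
import mimetypes
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def extract_metadata(file_path: str) -> dict:
    """Collect basic file facts and a SHA-256 digest for a local file."""
    path = Path(file_path).resolve()
    stat = path.stat()

    # Hash in chunks so large files are not loaded into memory at once.
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)

    # guess_type looks only at the file extension, not the file contents.
    mime_type, _ = mimetypes.guess_type(path.name)
    return {
        "name": path.name,
        "path": str(path),
        "size_bytes": stat.st_size,
        "mime_type": mime_type or "unknown",
        "modified_utc": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "sha256": digest.hexdigest(),
    }


# Demonstrate on a throwaway file so the example does not depend on the repo layout.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"abc")
    info = extract_metadata(str(sample))
print(info["name"], info["size_bytes"], info["mime_type"])
```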

Run multiple sources and export a report:

python main.py --username octocat --email demo.user@gmail.com --ip 8.8.8.8 --output

Useful CLI flags:

--username USERNAME   Search username across configured platforms
--email EMAIL         Run offline email analysis
--ip IP_ADDRESS       Run IP intelligence lookup
--social USERNAME     Build social scan from username results
--metadata FILE_PATH  Extract local file metadata
--output              Export JSON report files
-v, --verbose         Show detailed errors (stack trace)
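
The flag table above maps naturally onto argparse. The sketch below reconstructs a parser from that table only; the build_parser name and the description text are assumptions, and the real main.py may add validation or extra options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser rebuilt from the documented flags."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Confidence-aware OSINT CLI",
    )
    parser.add_argument("--username", metavar="USERNAME",
                        help="Search username across configured platforms")
    parser.add_argument("--email", metavar="EMAIL",
                        help="Run offline email analysis")
    parser.add_argument("--ip", metavar="IP_ADDRESS",
                        help="Run IP intelligence lookup")
    parser.add_argument("--social", metavar="USERNAME",
                        help="Build social scan from username results")
    parser.add_argument("--metadata", metavar="FILE_PATH",
                        help="Extract local file metadata")
    parser.add_argument("--output", action="store_true",
                        help="Export JSON report files")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show detailed errors (stack trace)")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--username", "octocat", "--output"])
print(args.username, args.output, args.email)
```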

📊 Example Output

Example terminal output (real format, sample values):

[+] Username Analysis

Target: octocat

GitHub     -> FOUND | confidence=HIGH | HTTP 200 | HEAD->GET (https://github.com/octocat)
Reddit     -> NOT_FOUND | confidence=HIGH | HTTP 404 | HEAD (https://www.reddit.com/user/octocat)
Twitter    -> BLOCKED | confidence=MEDIUM | HTTP 200 | HEAD->GET | Blocked by platform (https://twitter.com/octocat)
Instagram  -> UNKNOWN | confidence=LOW | HTTP 200 | HEAD->GET | Unable to determine account status (https://www.instagram.com/octocat)

Note: Result may be affected by platform restrictions or anti-bot protections

[+] Correlation Summary

- Username found on GitHub
- Social source quality: DERIVED
- Overall confidence: Medium

Example exported JSON shape (trimmed):

{
    "username": {
        "query": "octocat",
        "original_query": "octocat",
        "normalized": false,
        "normalization_reason": null,
        "results": []
    },
    "email": {
        "email": "demo.user@gmail.com",
        "provider": "Google",
        "type": "personal",
        "confidence": "LOW"
    },
    "ip": {
        "ip": "8.8.8.8",
        "status": "SUCCESS"
    },
    "social": {
        "query": "octocat",
        "original_query": "octocat",
        "normalized": false,
        "normalization_reason": null,
        "results": []
    },
    "metadata": {
        "query": "project.md",
        "result": {
            "sha256": "..."
        }
    },
    "correlation": [
        "Overall confidence: Medium"
    ],
    "execution": {
        "timestamp_utc": "2026-...",
        "inputs_used": {
            "username": "octocat",
            "output": true
        }
    }
}
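
Writing the same payload to both report locations can be done in a few lines. This is a sketch under assumptions: the export_report name and the base_dir parameter are hypothetical, and only the two relative paths come from this README.

```python
import json
import tempfile
from pathlib import Path

# Relative report locations documented in this README.
REPORT_PATHS = ("output/report.json", "output/reports/result.json")


def export_report(payload: dict, base_dir: str = ".") -> list:
    """Serialize the payload once and write it to every report path."""
    text = json.dumps(payload, indent=4, ensure_ascii=False)
    written = []
    for relative in REPORT_PATHS:
        target = Path(base_dir) / relative
        # Create output/ and output/reports/ on first use.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written


# Write into a temporary directory and read both copies back.
with tempfile.TemporaryDirectory() as tmp:
    paths = export_report({"correlation": ["Overall confidence: Medium"]}, base_dir=tmp)
    reloaded = [json.loads(p.read_text(encoding="utf-8")) for p in paths]
print(reloaded[0] == reloaded[1])
```

Serializing once and writing the same text twice keeps the two files byte-identical, which is convenient for schema tests.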

🧠 How It Works

High-level flow:

Input: You provide one or more CLI flags (--username, --email, --ip, --social, --metadata).
Validation and safety checks: The app validates input types/length and enforces run-level safety limits.
Module execution: Selected modules run and return structured dictionaries/lists.
Formatting: Results are printed in consistent CLI sections.
Correlation (only when at least 2 sources are provided): Cross-source insights are generated with source-quality and confidence weighting.
Output export (optional): With --output, report JSON is written to output/report.json and output/reports/result.json.
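
The correlation step in the flow above can be pictured as follows. This is only a sketch of the gating and weighting idea: the correlate function, the numeric weights, and the thresholds behind the Overall confidence label are invented for illustration and should not be read as the project's actual scoring.

```python
# Hypothetical weights: social is DERIVED from username results,
# so it counts for less than an independent source.
SOURCE_WEIGHTS = {
    "username": 1.0,
    "email": 1.0,
    "ip": 1.0,
    "metadata": 1.0,
    "social": 0.5,
}


def correlate(results: dict) -> list:
    """Return insight strings, or an empty list when fewer than 2 sources ran."""
    provided = [name for name, value in results.items() if value]
    if len(provided) < 2:
        return []

    insights = []
    if "social" in provided:
        insights.append("Social source quality: DERIVED")

    # Assumed thresholds for turning the weighted score into a label.
    score = sum(SOURCE_WEIGHTS.get(name, 0.0) for name in provided)
    if score >= 2.5:
        label = "High"
    elif score >= 1.5:
        label = "Medium"
    else:
        label = "Low"
    insights.append(f"Overall confidence: {label}")
    return insights


print(correlate({"username": {"query": "octocat"}}))
print(correlate({"username": {"query": "octocat"}, "social": {"query": "octocat"}}))
```

Down-weighting the derived source keeps a username result and its social echo from being counted as two independent confirmations.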

🔐 Limitations / Notes

Username and IP modules depend on external services and live HTTP behavior.
Platform HTML/behavior changes can reduce detection quality.
Anti-bot pages, login walls, and CAPTCHA can produce BLOCKED or UNKNOWN.
Email analysis is heuristic and intentionally labeled low confidence.
Social scan is derived from username results and is not an independent confirmation source.
Correlation output is guidance for investigation, not legal proof of identity.

Runtime safety environment variables:

OSINT_MAX_TARGET_INPUT_LENGTH (default 256)
OSINT_MAX_NETWORK_OPERATIONS_PER_RUN (default 3)
OSINT_MAX_HTTP_REQUESTS_PER_RUN (default 30)
OSINT_REQUEST_DELAY_SEC (default 0)
OSINT_MODULE_DELAY_SECONDS (default 0)
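
Reading these variables with their documented defaults might look like the sketch below. The variable names and default values come from the list above; the load_limits helper and the choice to fall back to the default on missing, malformed, or negative values are assumptions.

```python
import os

# (default value, type used to parse it) for each documented variable.
SETTINGS = {
    "OSINT_MAX_TARGET_INPUT_LENGTH": (256, int),
    "OSINT_MAX_NETWORK_OPERATIONS_PER_RUN": (3, int),
    "OSINT_MAX_HTTP_REQUESTS_PER_RUN": (30, int),
    "OSINT_REQUEST_DELAY_SEC": (0.0, float),
    "OSINT_MODULE_DELAY_SECONDS": (0.0, float),
}


def load_limits(environ=None) -> dict:
    """Return the effective limits, using defaults for bad or missing values."""
    env = os.environ if environ is None else environ
    limits = {}
    for name, (default, cast) in SETTINGS.items():
        raw = env.get(name)
        value = default
        if raw is not None:
            try:
                parsed = cast(raw)
            except ValueError:
                parsed = default  # assumed behavior: ignore malformed input
            # Negative limits make no sense, so they also fall back.
            value = parsed if parsed >= 0 else default
        limits[name] = value
    return limits


# Pass a plain dict to show overriding without touching the real environment.
limits = load_limits({"OSINT_MAX_HTTP_REQUESTS_PER_RUN": "10", "OSINT_REQUEST_DELAY_SEC": "0.5"})
print(limits)
```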

🎯 Learning Outcomes

This project demonstrates practical skills in:

CLI application design with argparse
HTTP data collection with retries, timeouts, and error contracts
confidence-aware heuristic analysis
multi-source data correlation and signal weighting
structured report schema design and JSON export
defensive coding and safety controls for network tooling
test-driven validation with mocked external dependencies

🚀 Future Improvements

Add optional asynchronous request mode for faster large checks.
Expand platform adapters with versioned marker profiles.
Add CSV export and configurable output locations.
Add richer metadata extraction for common document formats.

⚠️ Disclaimer

This tool is for educational and authorized security research only. Use it only on targets and data sources you are legally allowed to investigate.
