I build AI systems that actually work in production — not just Jupyter notebooks.
For the past couple of years I've been obsessed with one question: how do you take a powerful language model and turn it into something a real business can rely on? That's led me down the rabbit hole of agentic workflows, RAG pipelines, fine-tuning, and full-stack deployment — and I haven't come up for air since.
Right now I'm finishing my CS degree at Islamia College University Peshawar (CGPA 3.75, graduating June 2026) while working as an AI Engineer apprentice at Orel Vision, where I'm shipping things like OCR pipelines, AutoReply agents, and client-facing chatbots. Before that, I built Photomonix at NeatNode — a production GenAI image enhancement platform that went from prototype to live users.
I don't list every library I've ever imported. Here's what I can own end-to-end:
Agentic Systems & LLMs
Multi-agent orchestration, tool-use pipelines, LangChain, LlamaIndex, OpenRouter integrations, and prompt engineering that holds up under real user behavior — not just demo conditions.
RAG & Vector Search
Retrieval-Augmented Generation from scratch: chunking strategies, embedding selection, ChromaDB, semantic search, hybrid retrieval, and making sure the system doesn't hallucinate on edge cases.
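The core of the retrieval step can be shown in a few dependency-free lines. This is a toy sketch of semantic search over pre-computed embeddings, not the actual ChromaDB pipeline: the three-dimensional vectors and chunk texts here are made up for illustration (a real system would get vectors from an embedding model and store them in a vector DB).

```python
import math

def cosine(a, b):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, chunks, k=2):
    # rank stored chunks by similarity to the query embedding, keep top-k
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return ranked[:k]

# toy "index": in a real pipeline these vectors come from an embedding model
chunks = [
    {"text": "Invoices are due in 30 days.", "vec": [0.9, 0.1, 0.0]},
    {"text": "The office cat is named Pixel.", "vec": [0.0, 0.2, 0.9]},
    {"text": "Late invoices incur a 2% fee.", "vec": [0.8, 0.3, 0.1]},
]

top = retrieve([1.0, 0.2, 0.0], chunks, k=2)
print([c["text"] for c in top])  # the two invoice-related chunks
```

The retrieved chunks are then stuffed into the LLM prompt as context; the edge-case work mentioned above (chunking strategy, hybrid retrieval, hallucination checks) lives around this core loop.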
Fine-Tuning
LoRA and QLoRA on LLaMA models using Unsloth. I've worked with medical chain-of-thought datasets — the kind of fine-tuning where correctness actually matters.
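Why LoRA is cheap is easy to show with the underlying math: instead of updating a full d×k weight matrix, you learn two small matrices B (d×r) and A (r×k) with rank r much smaller than d or k, and the effective weight becomes W + (α/r)·B·A. The numbers below are a toy worked example in plain Python, not the Unsloth training setup.

```python
def matmul(X, Y):
    # naive matrix multiply, fine for tiny illustration matrices
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha, r):
    # W' = W + (alpha / r) * (B @ A) -- the low-rank LoRA update
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# frozen 4x4 base weight, adapted with rank-1 matrices (r=1):
# full fine-tuning would train 16 parameters; LoRA trains only 4 + 4 = 8
W = [[1.0] * 4 for _ in range(4)]
B = [[1.0], [0.0], [0.0], [0.0]]   # d x r
A = [[0.5, 0.5, 0.5, 0.5]]         # r x k
W_prime = lora_effective_weight(W, A, B, alpha=2, r=1)
print(W_prime[0])  # first row shifted by the update
```

At realistic sizes (e.g. d = k = 4096, r = 16) the saving is dramatic, which is what makes fine-tuning LLaMA-class models feasible on a single GPU; QLoRA adds 4-bit quantization of the frozen base weights on top.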
Full-Stack AI Deployment
FastAPI backends, Streamlit and React frontends, Docker containers, and multi-model fallback chains (I once built a 7-model fallback using OpenRouter — because production systems can't afford single points of failure).
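The fallback-chain idea fits in a few lines. This is a minimal sketch, not the actual OpenRouter integration: the model names, the `ModelUnavailable` error, and the fake backend are stand-ins to show the control flow of trying models in priority order.

```python
class ModelUnavailable(Exception):
    """Raised when a model is down, rate-limited, or times out."""

def call_with_fallback(prompt, models, call_fn):
    # try each model in priority order; return the first successful reply
    errors = {}
    for name in models:
        try:
            return name, call_fn(name, prompt)
        except ModelUnavailable as exc:
            errors[name] = str(exc)  # record the failure, fall through to next
    raise RuntimeError(f"all models failed: {errors}")

# fake backend for illustration: first two models are "down", third answers
def fake_call(name, prompt):
    if name in ("model-a", "model-b"):
        raise ModelUnavailable(f"{name} rate-limited")
    return f"{name}: answer to {prompt!r}"

used, reply = call_with_fallback("hello", ["model-a", "model-b", "model-c"], fake_call)
print(used, reply)
```

In production the chain also wants per-model timeouts and logging of which model actually served each request, so you notice when your primary model is silently failing over.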
🤖 AutoReply Agent — A production social media chatbot for WhatsApp, Instagram, and TikTok. React + FastAPI + SQLite + OpenRouter with a 7-model fallback chain. Built for a real client, handles real traffic.
📸 Photomonix — GenAI image enhancement platform built at NeatNode. My first taste of what it takes to go from a working model to something you can hand to users.
📄 OCR Pipeline — Advanced document processing with auto-deskew, CLAHE preprocessing, and Lanczos4 upscaling. Built at Orel Vision for production document workflows.
🧬 LLM Fine-Tuning — Fine-tuned LLaMA models with LoRA/QLoRA on medical CoT data using Unsloth. Learned quickly that dataset quality matters more than almost anything else.
🔍 RAG Document QA — LangChain + ChromaDB + OpenRouter with a Streamlit interface. The project that made me understand retrieval failure modes at a deep level.
🎤 Voice Cloning System — Open-source TTS using Hugging Face models, optimized for local inference on Windows and Google Colab.
📊 SkillForge (in progress) — An AI-powered skill mastery platform with a JD-matched interview engine. Three-agent architecture, MySQL backend, FastAPI + React. Building this because I couldn't find a tool that prepares you for the specific job you're applying for.
Languages: Python
LLM/AI: LangChain · LlamaIndex · Hugging Face · OpenRouter · Gemini API · LoRA/QLoRA fine-tuning · Prompt Engineering · RAG Systems
ML: PyTorch · TensorFlow · Scikit-learn · Pandas · NumPy
Databases: MySQL · ChromaDB · SQLite · Vector DBs · SQLAlchemy ORM
Backend: FastAPI · REST APIs
Frontend: React · Streamlit · Gradio
DevOps: Docker · Git
Certifications: Microsoft Azure AI Fundamentals · IBM Machine Learning (Coursera)
Still on my learning list: MLOps at scale, LLM quantization and optimization, and advanced system design for high-throughput AI workloads. I write this not as a disclaimer, but because engineers who know exactly where their edge is learn faster than those who don't.
I'm looking for a full-time AI/ML Engineer role where I can contribute immediately to real problems — ideally somewhere that ships fast and cares about the quality of what it builds. I'm open to remote and on-site, and I graduate in June 2026.
If you're building something interesting with LLMs, agentic systems, or production AI infrastructure, I'd genuinely love to talk.
📧 Shahabkhan2799@gmail.com
💼 www.linkedin.com/in/shahab-khan-8361012b1