Most machine learning hands you an answer and asks you to trust it. I'm more interested in systems that can show their work — where a model's output is checked against formal rules, not just assumed to be right because it sounds right.
I'm an AI engineer at Bluecascade and a BS Artificial Intelligence student at Emerson University, Multan (CGPA 3.77/4.0). My research pairs neural networks with SMT solvers (Z3) and reinforcement learning so that the networks' behavior is verifiable. I have two papers so far, HALO and AXPEN, and on the engineering side I ship GenAI and computer-vision systems used by real users.
HALO — Hallucination-Aware Logic Oracle for Neuro-Symbolic LLM Verification · under review, JMLR 2026 · paper
There's a class of LLM hallucination that fact-checkers can't see: every claim looks individually plausible, but together they violate a domain law (for example, a drug claimed to be 89.7% effective with near-zero toxicity, where each claim is fine alone but the pair is jointly impossible). HALO extracts the claims, hands them to a Z3 SMT solver, and asks whether they're satisfiable against an axiom library. The claim extractor runs at 95.2% F1, and HALO flags cases that RAG, SelfCheckGPT, and semantic-entropy methods miss entirely.
Z3 / SMT · LLM verification · neuro-symbolic
AXPEN — Formally Constrained Autonomous Penetration Testing via Neuro-Symbolic RL · preprint, 2026
RL pentesting agents memorise the network they trained on and fall apart when it changes. AXPEN constrains the policy with logic instead: an LLM reads CVE text and writes formal preconditions, Z3 compiles them into an action mask, and that mask wraps a GraphSAGE PPO policy whose weights transfer to any host count. Trained on 3 hosts, it reaches a 0.92 zero-shot attack-success rate on an unseen 8-host network (+0.907 over the unmasked baseline, p < 1e-3). The evaluation is simulation-only, and the paper reports its negative results as carefully as its wins.
GPT-4o → Z3 mask · GraphSAGE PPO · zero-shot transfer
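The action-masking idea can be shown in a few lines: actions whose preconditions fail get probability zero before the policy samples. This is a rough sketch under stated assumptions, not AXPEN's implementation. Plain Python predicates stand in for the Z3-compiled preconditions, and the action names and host fields are hypothetical.

```python
import math

# Hypothetical actions and preconditions. In the description above these come
# from LLM-written formal preconditions compiled by Z3; here they are lambdas.
PRECONDITIONS = {
    "scan": lambda host: True,
    "exploit_smb": lambda host: host["port_445_open"] and host["scanned"],
    "escalate": lambda host: host["has_shell"],
}
ACTIONS = list(PRECONDITIONS)


def action_mask(host):
    """One boolean per action: True if its precondition holds for this host."""
    return [bool(PRECONDITIONS[name](host)) for name in ACTIONS]


def masked_softmax(logits, mask):
    """Softmax over allowed actions only; disallowed actions get probability 0."""
    if not any(mask):
        raise ValueError("no action satisfies its preconditions")
    peak = max(x for x, ok in zip(logits, mask) if ok)
    exps = [math.exp(x - peak) if ok else 0.0 for x, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


host = {"port_445_open": True, "scanned": False, "has_shell": False}
probs = masked_softmax([0.1, 2.0, 3.0], action_mask(host))
```

Because the mask depends only on the state of a host and not on how many hosts exist, the same rule set applies unchanged to a larger network.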
An 11-node LangGraph pipeline that runs a 4-phase Chain-of-Verification: it atomises a claim, gathers evidence across the web, PDFs, arXiv and PubMed, then pits a Proposer against a Skeptic to resolve contradictions. Local PII redaction, token budgeting, circuit breakers, and Groq/Ollama routing. Runs in Docker.
LangGraph · FastAPI · RAG · Groq · Ollama
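One way the circuit breaker and provider routing could fit together is sketched below. It is an assumption-laden illustration rather than the pipeline's code: `primary` and `fallback` are hypothetical callables standing in for a hosted (Groq-style) and a local (Ollama-style) client, and no real provider API is called.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors, closes after `cooldown` seconds."""

    def __init__(self, max_failures=3, cooldown=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and allow a trial call.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def record(self, success):
        if success:
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def route(prompt, primary, fallback, breaker):
    """Call `primary` unless its breaker is open; use `fallback` on error or open breaker."""
    if not breaker.is_open():
        try:
            result = primary(prompt)
            breaker.record(True)
            return result
        except Exception:
            breaker.record(False)
    return fallback(prompt)
```

While the breaker is open, requests skip the failing provider entirely, so a provider outage costs one fast local call instead of a timeout per request.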
A Transformer that reads a frontal + lateral X-ray and writes a structured radiology report (indication / findings / impression). The interesting parts:
- Bilinear cross-view fusion — outer-product interaction between the two views instead of naive concatenation, so the model weights view importance itself
- Gated cross-attention — a learnable per-layer gate that balances self- vs cross-attention
- Three section-aware decoders sharing one fused image encoder
- CheXNet (DenseNet-121) backbone, Bio_ClinicalBERT tokenizer, Grad-CAM for visual grounding, scored with RadGraph-F1 for clinical correctness
TensorFlow · Bio_ClinicalBERT · CheXNet · Grad-CAM · RadGraph
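The bilinear fusion and gating bullets above reduce to two small operations, sketched here in NumPy rather than TensorFlow to keep the example dependency-light. The dimensions, the random `proj` matrix (a learned weight in a real model), and the function names are assumptions for illustration, not the project's architecture code.

```python
import numpy as np


def bilinear_fuse(frontal, lateral, proj):
    """Outer-product interaction of two view vectors, projected to proj.shape[1] dims."""
    interaction = np.outer(frontal, lateral)   # (d_f, d_l): every pairwise product
    return interaction.reshape(-1) @ proj      # (d_f * d_l,) @ (d_f * d_l, d_out)


def gated_mix(self_out, cross_out, gate_logit):
    """Blend self- and cross-attention outputs with a sigmoid gate in (0, 1)."""
    gate = 1.0 / (1.0 + np.exp(-gate_logit))
    return (1.0 - gate) * self_out + gate * cross_out


rng = np.random.default_rng(0)
frontal, lateral = rng.normal(size=4), rng.normal(size=3)
proj = rng.normal(size=(4 * 3, 8))             # stand-in for a learned projection
fused = bilinear_fuse(frontal, lateral, proj)  # shape (8,)
```

Concatenation would give the next layer `d_f + d_l` independent features, while the outer product exposes all `d_f * d_l` cross-view products, at the cost of a larger projection matrix.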
An 8M-parameter model built from the tokenizer up: BPE, masked-language pretraining, a span head for QA — trained on CPU. Then I swapped in three attention kernels (softmax, linear, RWKV) to measure the subquadratic accuracy/speed trade-off head-to-head.
PyTorch · RWKV · Linear attention
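The quadratic versus subquadratic difference between two of those kernels can be seen in a short NumPy sketch. It assumes the common `elu(x) + 1` feature map for linear attention and a non-causal, single-head setting. RWKV is omitted, and none of this is the project's PyTorch code.

```python
import numpy as np


def softmax_attention(q, k, v):
    """Standard attention: builds an (n, n) score matrix, O(n^2 * d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def feature_map(x):
    """elu(x) + 1: a positive feature map often used for linear attention."""
    return np.where(x > 0, x + 1.0, np.exp(x))


def linear_attention(q, k, v):
    """Kernelised attention: never forms the (n, n) matrix, O(n * d^2)."""
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                   # (d, d_v) summary of all keys and values
    norm = q_f @ k_f.sum(axis=0)     # (n,) per-query normaliser
    return (q_f @ kv) / norm[:, None]
```

Both functions return a weighted average of the rows of `v`. The linear variant reorders the matrix products so the sequence length `n` only ever appears linearly, which is where the speed gain and the accuracy trade-off come from.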
A VS Code extension that reviews your code with Hugging Face models as you write — linting and optimization hints inline.
JavaScript · VS Code API · Hugging Face
AI Engineer — Bluecascade · Sep 2025 – present
Built Neonizer (live): upload one logo → a manufacturing mock-up and a full break-even price quote, automatically. I wrote the CV measurement stack behind it (PNG/SVG/CorelDraw input, content-tight cropping, distance-transform tube-width estimation, RANSAC line fitting) and internal BI pipelines that cut ~70% off the team's decision time.
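For readers unfamiliar with RANSAC line fitting, the generic algorithm looks roughly like this. It is a textbook-style sketch, not the Neonizer code: the iteration count, the inlier threshold (assumed to be in pixels), and the function name are all illustrative assumptions.

```python
import numpy as np


def ransac_line(points, n_iters=200, threshold=1.0, seed=0):
    """Robustly fit a line a*x + b*y + c = 0 (a^2 + b^2 = 1) to an (n, 2) array."""
    rng = np.random.default_rng(seed)
    best_line = None
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # Sample two distinct points and build the line through them.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        a, b = y2 - y1, x1 - x2          # normal vector of that line
        length = np.hypot(a, b)
        if length == 0:
            continue                      # coincident points define no line
        a, b = a / length, b / length
        c = -(a * x1 + b * y1)
        # Perpendicular distance of every point to the candidate line.
        inliers = np.abs(points @ np.array([a, b]) + c) <= threshold
        if inliers.sum() > best_inliers.sum():
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers
```

A least-squares refit on the returned inliers is a common follow-up step when the two-point estimate is too coarse.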
Undergraduate Research Assistant — multi-modal learning · 2024 – 2025
Built the dual-view report generator above with a PhD researcher; studied CLIP, ALIGN, BLIP-2, ViLBERT.
Earlier internships — DeveloperHub · Cognifyz · CodeAlpha · 2024 – 2025
Anomaly detection, medical-image classification, NLP, recommendation systems, YOLO object detection — most shipped behind Flask.
Also: scikit-learn · GraphSAGE/GNNs · PPO/RL · LangChain · GPT-4o · Groq · Ollama · FAISS · spaCy · Whisper · YOLO · Redis · n8n · Airtable
Build Your Own Small Language Model From Scratch — Google · AI Agents & Transformers — Hugging Face · AI, Deep Learning & Communication — NAVTTC
