I bridge the gap between complex engineering systems and the developers who build on them - through content, community, speaking, and hands-on technical storytelling.
"The best reliability engineering is the kind developers never have to think about - because someone taught them how to do it right."
- Tech Speaker - AgentCamp 2026, AWS SCD Pune & community events
- Community Volunteer - AWS User Group Pune | Atlassian Community
- Technical Blogger - SRE, DevOps, GenAI @ hashnode.com/@Vershobhit
- Content Creator - 50K+ followers across @TheJugaadSRE & @TheJugaadGuy
- Filmmaker - Short film writer & director, international festival recognition
- Traveller - 12 countries, cross-cultural perspectives in everything I build
I believe developer experience is a reliability problem in disguise.
When developers have great tooling, great documentation, and great communities around them - systems become more reliable, incidents become rarer, and engineering teams actually enjoy their work.
My approach: Build in public. Teach through demos. Make the complex feel obvious.
- Observability & SRE - ELK Stack, Dynatrace, SLI/SLO, incident intelligence
- Cloud & Platform - AWS, Azure, Terraform, Kubernetes, ArgoCD
- Security & Compliance - Cloud security, vulnerability governance, PSD2 (EU payment services directive, as applied in UK banking)
- AI & Agentic Workflows - Claude, GPT, Kiro - building real automation, not demos
- DevSecOps - CI/CD pipelines, Jenkins, GitHub Actions, platform reliability
| Project | What it does |
|---|---|
| SRE GPT Dashboard | AI-powered Dynatrace incident analysis using LLMs |
| SRE Playbooks | Real-world incident response workflows, open sourced |
| Career Referee for Engineers | Decision engine - SRE vs DevOps vs Platform vs Dev |
| Vajra AI | Agentic AI reliability tooling for production systems |
| My Website | Personal website, built and hosted on GitHub Pages |
| Date | Platform | Contribution | Status |
|---|---|---|---|
| Jun 2026 | Grafana | K8s SRE Incident Response Dashboard - 8 panels for on-call triage | Published |
| Jun 2026 | SigNoz | K8s SRE Incident Response Dashboard - PR #376 | Under Review |
Built for the on-call engineer who gets paged at 2am - not the capacity planner browsing metrics.
Covers pod restarts, node pressure, p99 SLI, error budget burn, and PVC storage failures.
Two platforms. One day. 14 years of real SRE experience distilled into 8 panels.
- AgentCamp 2026 - Agentic AI in SRE workflows
- Calyon AI - Agentic AI
- AWS SCD Pune - Cloud observability & reliability
- Internal Bootcamps - ELK, AWS, Terraform (600+ engineers)
- College & Cloud Clubs - Observability & engineering practices
- Toastmasters - District-level International Speech Contest 2023-24
- 15+ years in platform & reliability engineering
- 50K+ community followers across socials
- 10+ tech talks delivered
- 600+ engineers onboarded/mentored
- 100+ projects led across cloud, SRE, and AI
- 12 countries explored (and counting)
---
## Find Me
- [LinkedIn](https://linkedin.com/in/vershobhit)
- [Instagram](https://instagram.com/thejugaadsre)
- [Hashnode](https://hashnode.com/@Vershobhit)
- [Website](https://infrawithshobhit.github.io/mywebsite/)
- [YouTube](https://youtube.com/@TheJugaadsre)
---
## Currently Excited About
- Agentic AI workflows in production reliability
- Developer-first application security (DevSecOps)
- Building communities where engineers learn by doing
- Technical content that makes complex systems feel approachable
---
*Open to DevRel, Developer Advocacy, and Technical Evangelism opportunities. Let's build something the community will love.*