
Professional Summary
AI Engineer specializing in production-grade LLM systems and multi-agent orchestration, with deep expertise in building scalable Retrieval-Augmented Generation (RAG) pipelines on Microsoft Azure. Delivered 70% latency reduction (8.5s → 2.5s) and 55% cost savings ($12K → $5.4K) through vector search optimization, caching strategies, and efficient model orchestration. Designed high-precision intent-based routing systems achieving 95% classification accuracy for 1,000+ enterprise users. Strong experience across retrieval quality, LLM evaluation, observability, and guardrails, with proven ability to deploy end-to-end AI solutions from experimentation to full-stack production deployment, including React-based UIs, API-driven backends, and cloud-native services.
Technical Expertise
AI & LLM
System Design
Languages
Backend & Databases
Frontend
Cloud & DevOps
AI Engineer
- Architected "Sidekick" Multi-Agent AI Platform: Designed and deployed an enterprise LLM orchestration system using LangGraph with 4 specialized agents (HR, IT, Legal, General Query) serving 1,000+ employees; implemented intent classification using GPT-4 with few-shot prompting achieving 95% routing accuracy and 2s P95 latency through Redis caching & async execution.
- Built Production RAG System (HRGPT): Engineered end-to-end retrieval pipeline processing 50K+ ServiceNow documents; implemented hybrid search combining dense vectors (text-embedding-3-large, 3072-dim) and BM25 with Azure AI Search, achieving 0.89 MRR@10 and reducing irrelevant responses from 35% to 4% through metadata filtering and semantic reranking.
- Optimized RAG Performance & Quality: Reduced query latency from 8.5s to 2.5s (70% improvement) via multi-level caching (Redis L1, vector cache L2) and parallel retrieval; implemented query decomposition for multi-hop questions, improving answer completeness by 60% measured via GPT-4-as-judge evaluation framework.
- Engineered Real-time Data Ingestion Pipeline: Built incremental sync system processing 5K+ ServiceNow updates daily using Azure Functions and Service Bus; implemented change data capture with delta checksums, semantic chunking (500 tokens, 50 overlap), and batch upserts reducing ingestion time by 75% while maintaining 99.9% data freshness.
- Implemented LLM Observability & Guardrails: Deployed end-to-end tracing using LangSmith and Azure Application Insights tracking token usage, latency (P50/P95/P99), and error rates; built PII detection layer using regex + NER models preventing 200+ potential data leaks, and implemented RBAC-based context filtering ensuring users only access authorized documents.
- Deployed Cost Optimization Strategies: Reduced monthly OpenAI API costs by 55% ($12K to $5.4K) through prompt compression, streaming responses, and intelligent model routing (GPT-3.5-turbo for simple queries, GPT-4 for complex); implemented response caching achieving 40% cache hit rate for repeated queries.
- Automated ERP Workflows: Developed Node.js microservices integrating SAP FieldGlass, Coupa, and MS Teams enabling inline approvals via Adaptive Cards; automated 15K+ monthly supplier onboarding tasks reducing manual effort by 75% and approval cycle time from 5 days to 6 hours through webhook-driven workflows.
Software Developer Intern
- Developed a DAG Visualizer using React Flow to preview DAGs before deployment, generating interactive graphs for visualizing upstream and downstream task dependencies, reducing deployment errors by 30%.
- Created a robust Oracle Integration Cloud (OIC) integration to eliminate discrepancies in ledger postings, fully automating the process to save 24 hours of manual effort per month and minimizing risk of manual errors.
Software Developer Intern
- Implemented customizable Kendo grids and multi-step forms in React for school supply list management.
- Built APIs to retrieve and update quote request data in the database.
Backend Developer Intern
- Designed an algorithm with O(n) time complexity to reduce manual effort by 50%.
- Extracted and processed disability benefits data efficiently.

Real-Time Trading Demo

System Architecture
The Stock Exchange project was built using a microservices architecture to handle real-time trading of stocks. When a user places an order, the frontend sends a request to the API server, which publishes the order to a RabbitMQ broker. This ensures that the order is stored until the order book engine picks it up for processing. RabbitMQ provided high performance and fault tolerance by decoupling the services and ensuring reliable message delivery through features like message acknowledgments and durable queues.
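The hand-off described above (publish, hold until picked up, acknowledge on success) can be sketched with a small in-memory stand-in. This is a minimal illustration under stated assumptions, not the project's code and not RabbitMQ's client API: `InMemoryBroker`, `Order`, and `consume_one` are hypothetical names, and the toy broker only mimics the acknowledge-or-requeue behavior.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    side: str  # "buy" or "sell"
    symbol: str
    price: float
    quantity: int


class InMemoryBroker:
    """Toy stand-in for a broker queue with acknowledgments (hypothetical)."""

    def __init__(self):
        self._ready = deque()  # messages waiting for delivery
        self._unacked = {}     # delivery tag -> message currently in flight
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def pending(self):
        return len(self._ready)

    def get(self):
        """Deliver the next message as (tag, message), or None if empty."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        message = self._ready.popleft()
        self._unacked[tag] = message
        return tag, message

    def ack(self, tag):
        # The consumer finished the work, so the message can be forgotten.
        del self._unacked[tag]

    def nack(self, tag):
        # The consumer failed; requeue at the front to preserve ordering.
        self._ready.appendleft(self._unacked.pop(tag))


def consume_one(broker, handler):
    """Process one message; ack on success, requeue on failure."""
    delivery = broker.get()
    if delivery is None:
        return False
    tag, order = delivery
    try:
        handler(order)
    except Exception:
        broker.nack(tag)
        return False
    broker.ack(tag)
    return True
```

Because a message is only dropped after `ack`, an order whose handler raises stays queued for the next attempt, which is the property the paragraph attributes to acknowledgments.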
For real-time updates, we used a WebSocket-based pub-sub model so the system could scale and serve many clients efficiently. This model reduced WebSocket connection overhead and kept update latency low.
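As a rough illustration of the pub-sub idea, the sketch below fans one update out to only the clients subscribed to a channel. It is an assumption-laden toy, not the project's implementation: `PubSubHub` is a hypothetical name, channels are assumed to be stock symbols, and the `send` callback stands in for writing to a real WebSocket connection.

```python
from collections import defaultdict


class PubSubHub:
    """In-memory channel registry (hypothetical); one channel per symbol."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # channel -> set of client ids

    def subscribe(self, channel, client_id):
        self._subscribers[channel].add(client_id)

    def unsubscribe(self, channel, client_id):
        self._subscribers[channel].discard(client_id)
        if not self._subscribers[channel]:
            # Drop empty channels so the registry does not grow unbounded.
            del self._subscribers[channel]

    def publish(self, channel, message, send):
        """Call send(client_id, message) per subscriber; return the count."""
        # Sorted only to make delivery order deterministic in this sketch.
        recipients = sorted(self._subscribers.get(channel, ()))
        for client_id in recipients:
            send(client_id, message)
        return len(recipients)
```

A publisher calls `publish` once per update, and clients that never subscribed to that channel receive nothing.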
We used Docker for deployment, Prisma ORM for database access, and Nginx for load balancing across WebSocket servers.
Technology Stack
Open Source & Projects
A collection of my contributions to major open-source projects and personal technical experiments.
LightDash
Open Source
Implemented frontend functionality for auto-triggering threshold configuration modals using the threshold UUID. Reduced manual configuration steps by 30%, improving setup time and the overall user experience for configuring alerts.
Cal.com
Open Source
Enhanced user experience by adding functionality to preserve the order of questions asked during the booking process, ensuring that confirmation emails are more intuitive and accurate.
Rocket.Chat
Open Source
Notify on Reactions feature: Enabled users to receive notifications for reactions. Improved user engagement by ~20% in team interactions. Collaborated closely with project maintainers.
Stock Exchange
Real-time stock trading simulation. Dockerized microservices architecture built for high performance and fault tolerance. Nginx load balancing across WebSocket servers.
Social Media App
Cross-platform mobile app. Login via email, real-time messaging (Socket.io), live location sharing. Media sharing, scheduled messaging, groups.
Parents Portal
Student tracking portal: Marks, attendance, holidays. Forgot password (OTP). Parent complaints system. Staff login for data management.
Interior Design VR
WebVR app for room design. Reduced effort by 90%. Drag-and-drop objects, color changing, and scaling using Three.js controls.
A showcase of my competitive coding journey, open source contributions, and professional certifications.
InterviewBit
Global Ranking
Ranked 4836 globally with over 300 solved questions.
Deep Learning
Specialization
Mastered NN, CNN, RNNs & Transformers by DeepLearning.AI.
Python For Everybody
Specialization
Comprehensive Python specialization by Univ. of Michigan.
Open Source
Contributor
Contributed to LightDash, Rocket.Chat and Cal.com.