I look for workflows where agents can remove repeated manual effort, coordinate steps, call tools safely, or speed up decisions. If retrieval or conventional software solves the problem more cheaply, I do not add agent complexity.
Senior AI Engineer | Agentic AI | RAG | LangGraph | Production AI Systems
Senior AI engineer for production GenAI, RAG, agent workflows, and large-scale software systems.
I build AI platforms that know when to stay simple, when to retrieve, and when to route into bounded agents for safe tool actions, evaluations, observability, and rollback paths. Experience across Meta, Loudegg, Caesars Digital, Wells Fargo, and Tata Consultancy Services.
RAG
MCP
LangGraph
Kafka
AI Agents
- Designs routing, RAG, agents, tools, evals, and guardrails
- Ships frontend, backend, queues, caches, and APIs
- Explains tradeoffs clearly to engineers and leaders
> scan public proof: work, systems, impact
> unlock deep dives: architecture on request
> inspect dry runs: data flow + failure paths
Useful agents need permissions, private-data boundaries, typed tool schemas, approval gates, idempotency, audit logs, evals, and observability before they touch real work.
Many requests should stay RAG-only: retrieve, cite, answer, or refuse. I escalate to agent workflows only when the task needs state, tools, decisions, or long-running work.
APIs, queues, caches, databases, auth, cost budgets, rollout plans, and debugging paths are what turn AI ideas into reliable automation that helps a company move faster.
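As an illustration only, the escalation rule above (stay on RAG unless the task needs state, tools, decisions, or long-running work) could be sketched as a small routing function. The `RequestTraits` fields and the `Route` names are hypothetical; in practice these signals would come from whatever intent classifier sits in front of the router.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    RAG_ONLY = "rag_only"              # retrieve, cite, answer, or refuse
    BOUNDED_AGENT = "bounded_agent"    # stateful, tool-using workflow with limits
    ASYNC_WORKFLOW = "async_workflow"  # long-running work handed off the request path


@dataclass(frozen=True)
class RequestTraits:
    """Hypothetical signals an upstream intent classifier might emit."""
    needs_tools: bool = False
    needs_state: bool = False
    needs_decisions: bool = False
    long_running: bool = False


def choose_route(traits: RequestTraits) -> Route:
    # Long-running work leaves the synchronous path first.
    if traits.long_running:
        return Route.ASYNC_WORKFLOW
    # Escalate to an agent only when state, tools, or decisions are required.
    if traits.needs_tools or traits.needs_state or traits.needs_decisions:
        return Route.BOUNDED_AGENT
    # Default: the cheapest path that can still answer with citations.
    return Route.RAG_ONLY
```

The default branch is deliberately the simplest route, so agent complexity has to be justified by an explicit signal.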
Intent routing, planner-executor flows, MCP tool contracts, human review, idempotency, and audit logs.
Ingestion, chunking, embeddings, hybrid search, reranking, citations, evals, and rollbackable indexes.
Typed services, auth, queues, caches, databases, observability, deployment, and user-facing product flows.
Kubernetes pods, Kafka backpressure, Redis hot paths, read replicas, partitions, rate limits, and incident playbooks.
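To make the hybrid search item above concrete, here is a minimal sketch of one common way to merge a sparse (keyword) ranking with a vector ranking: reciprocal rank fusion. This is an illustrative choice, not a statement of how any system described here fuses results; the function name and the constant `k=60` are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in,
    and the per-list scores are summed.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


sparse_hits = ["a", "b", "c"]   # e.g. keyword search order
vector_hits = ["b", "d", "a"]   # e.g. embedding search order
fused = reciprocal_rank_fusion([sparse_hits, vector_hits])
```

Documents that appear in both lists rise to the top, which is the behavior a reranker would then refine.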
Production AI Readiness Matrix
The questions I ask before an AI system is allowed near real users.
Senior AI engineering is not just model integration. Production systems need grounded retrieval, bounded agent routes when tools are justified, measurable quality, rollback paths, and clear explanations when production gets loud.
Study the AI Systems Atlas
Citations, chunk IDs, source metadata, reranking, and refusal behavior when evidence is weak.
MCP schemas, RBAC, idempotency keys, dry-run paths, human review, and immutable audit trails.
Golden sets, grounding checks, latency/cost tracking, hallucination tests, and rollout gates.
Request IDs, prompt traces, retrieved chunks, tool calls, model output, latency, cost, and feedback.
Redis caching, async execution, batching, model tiering, token budgets, and retrieval pruning.
Kafka queues, dead-letter paths, backpressure, replay-safe consumers, rate limits, and fallbacks.
Policy-aware retrieval, ACL filters, PII handling, tenant boundaries, redaction, and access logs.
Feature flags, canaries, prompt/index versioning, dashboards, alerting, and rollback playbooks.
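The tool-safety row of the matrix (RBAC, idempotency keys, dry-run paths, audit trails) can be pictured with a small wrapper around a tool handler. This is a sketch under stated assumptions: `ToolGateway` and its fields are hypothetical names, not part of any MCP SDK, and the in-memory list stands in for a real append-only audit store.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ToolGateway:
    """Hypothetical wrapper that gates one tool handler."""
    handler: Callable[[dict[str, Any]], Any]
    allowed_roles: frozenset[str]
    audit_log: list[dict[str, str]] = field(default_factory=list)
    _results: dict[str, Any] = field(default_factory=dict)

    def _audit(self, role: str, key: str, outcome: str) -> None:
        # Append-only by convention; a real system would use immutable storage.
        self.audit_log.append({"role": role, "idempotency_key": key, "outcome": outcome})

    def call(self, *, role: str, idempotency_key: str,
             args: dict[str, Any], dry_run: bool = False) -> Any:
        if role not in self.allowed_roles:
            self._audit(role, idempotency_key, "denied")
            raise PermissionError(f"role {role!r} may not call this tool")
        if dry_run:
            # Report what would happen without touching the real handler.
            self._audit(role, idempotency_key, "dry_run")
            return {"would_call_with": args}
        if idempotency_key in self._results:
            # A retry with the same key returns the stored result, with no second side effect.
            self._audit(role, idempotency_key, "replayed")
            return self._results[idempotency_key]
        result = self.handler(args)
        self._results[idempotency_key] = result
        self._audit(role, idempotency_key, "executed")
        return result
```

Every path, including denials and dry runs, leaves an audit entry, so an operator can reconstruct what was attempted as well as what ran.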
Selected Work
AI case studies built like systems, not like resume bullets.
Some pages are passcode-protected because they explain architecture patterns in more depth. They are available to employers and interviewers on request.
All diagrams and examples are generalized, recreated, and sanitized for public demonstration. They do not include proprietary code, internal documents, private data, or confidential implementation details.
Built an internal AI assistant platform with route-aware retrieval, safe tool access, bounded agent paths, async services, Redis caching, and cost-aware model execution.
- FAISS retrieval over permission-scoped knowledge sources.
- MCP tools for policy-gated API and service access.
- Async execution, Redis caching, token tracking, and automated evals.
Architected a production route map that keeps simple requests on RAG, uses LangGraph for bounded tool workflows, and moves long-running coordination into async Kafka agents.
- 50K+ documents with deterministic chunking and versioned indexes.
- Pinecone/OpenSearch retrieval with policy-driven routing.
- LangGraph agent workflows, MCP tools, Langfuse/Datadog-style observability.
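As a rough picture of what deterministic chunking with versioned indexes can mean, the sketch below splits text into fixed-size overlapping windows and derives each chunk ID from a content hash, so re-ingesting the same document yields the same IDs. The window sizes, the `Chunk` fields, and the ID scheme are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str        # stable hash of document ID, offset, and text
    doc_id: str
    index_version: str   # metadata tag so an index build can be rolled back
    text: str


def chunk_document(doc_id: str, text: str, *, index_version: str,
                   size: int = 200, overlap: int = 40) -> list[Chunk]:
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        digest = hashlib.sha256(f"{doc_id}:{start}:{piece}".encode("utf-8")).hexdigest()[:16]
        chunks.append(Chunk(digest, doc_id, index_version, piece))
        if start + size >= len(text):
            break  # the final window already reaches the end of the text
    return chunks
```

Because the ID depends only on the document ID, offset, and text, two ingestion runs over unchanged content can be diffed chunk by chunk before an index version is promoted.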
Built a trader-facing AI workflow and real-time sports data platform for odds, rules, investigations, and operational decision support.
- Hybrid RAG over historical odds, rules, and operational context.
- LangGraph agents for complex trader investigations and workflow support.
- Guardrails, audit logs, RBAC, cost controls, and 5M+ events/day pipelines.
Built enterprise market-data infrastructure for pricing, risk, reconciliation, and compliance reporting.
- Java Spring and Python services for market-data processing.
- 2M+ pricing records processed daily for FRTB/IMA risk computation.
- Automated Hadoop pipelines that reduced reconciliation time.
- Enabled faster compliance and risk reporting workflows.
Modernized legacy systems into scalable backend services and high-throughput validation pipelines.
- Spring Boot and Django microservices from legacy monoliths.
- Kafka-based trade validation processing high-volume transaction flows.
- Systems scaled for large concurrent user workloads.
- Production reliability focus across enterprise clients.
Built full-stack client products before the later AI platform work, covering web apps, mobile experiences, APIs, GraphQL, cloud deployment, and event-driven services.
- React, JavaScript, TypeScript, Python, Node.js, and GraphQL product builds.
- Microservices and Kafka-backed application workflows.
- AWS, GCP, and Azure deployments for high-traffic client workloads.
- End-to-end product execution from frontend experience to backend systems.
For Recruiters: Start Here
If you only have two to five minutes, scan this path first.
Start with the role fit, then scan the top three AI systems. Protected pages are available when an interviewer needs architecture depth.
Start with the Internal AI Assistant, Loudegg, and Caesars Digital.
Protected pages show routing choices, data movement, policy gates, and failure handling.
Use the Atlas if you want a visual read on RAG, agents, queues, caching, and scale.
Use email or the site chat to discuss protected details, interviews, or production AI roles.
Interviewer Decision Brief
Where I fit, what to review, and why the evidence is safe to share.
Use this as the quick read before a recruiter screen, hiring-manager review, or senior AI engineering interview loop.
Strongest fit when the role needs architecture ownership plus hands-on delivery: agent routing, retrieval, tools, backend systems, evals, observability, cost controls, and rollout discipline.
The Internal AI Assistant, Loudegg, and Caesars Digital case studies show the highest-signal GenAI, RAG, agent/tool-use, and distributed-system work.
The protected pages show route decisions, dry runs, guardrails, failure paths, and clickable system flows.
No proprietary code, internal documents, private data, credentials, or confidential implementation details.
Best topics: agent routing, grounding, tool safety, eval design, latency, cost, rollback paths, and operator visibility.
Protected AI Deep Dives
Public proof up front. Architecture rooms available when employers need the details.
The homepage gives a fast scan of scope, outcomes, and production judgment. The locked pages go deeper with route decisions, agent/tool dry runs, guardrails, failure paths, and clickable system flows.
Scan the card, open the case gate, then request access for the implementation-level walkthrough.
Agentic AI + Hybrid RAG Platform (Protected)
Internal AI Assistant (Protected)
Caesars Digital (Protected)
High-level outcomes, companies, technologies, and business impact stay visible to everyone.
Architecture pages, workflows, diagrams, and system details require a passcode.
The site behaves like a product: visual systems, dry runs, protected details, and proof.
Capabilities
The steps I take to make AI systems production-ready.
Strategy, data, design, development, launch, and growth are not separate worlds in real AI products. They are one loop, and every route has to be observable, testable, and safe.
Clarify the business workflow, risk level, users, latency budget, and where AI should not be used.
Design ingestion, chunking, embeddings, sparse search, vector search, versioning, and replay.
Make complex flows understandable with visual states, dry runs, citations, and clear failure paths.
Build typed services, agent state where needed, async jobs, tool layers, APIs, queues, caches, and deployment pipelines.
Use planner-executor and parent-child patterns only when routing, state, tools, and guardrails justify them.
Add evals, observability, access control, rate limits, cost controls, rollbacks, and incident visibility.
Measure answer quality, adoption, task success, model spend, latency, and workflow impact over time.
How I Think
My strongest work is designing the system around the LLM, not just calling it.
The portfolio is organized around production patterns: deterministic ingestion, hybrid retrieval, agent routers, planner-executor graphs, parent-child async agents, tool guardrails, cost-aware execution, and observability.
Versioned document ingestion, chunking, embeddings, sparse/vector indexes, replay and rollback.
Simple questions stay on RAG. Tool work enters bounded graphs. Long jobs become async workflows.
Planner-executor and parent-child patterns with max steps, typed state, idempotency, and safety gates.
OpenTelemetry, Langfuse-style traces, Datadog metrics, evals, feature flags, CI/CD, and rollout control.
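The planner-executor pattern with max steps, typed state, and safety gates can be sketched in plain Python without any agent framework. Everything here is a hypothetical illustration: `AgentState`, `run_bounded`, and the allow-list check are assumed names, and the planner and executor are passed in as ordinary callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentState:
    """Typed state carried through the loop."""
    goal: str
    steps_taken: int = 0
    observations: list[str] = field(default_factory=list)
    done: bool = False
    stop_reason: Optional[str] = None


def run_bounded(state: AgentState,
                plan: Callable[[AgentState], Optional[str]],
                execute: Callable[[str], str],
                *, max_steps: int = 5,
                allowed_actions: frozenset[str] = frozenset()) -> AgentState:
    while not state.done:
        if state.steps_taken >= max_steps:
            # Hard stop: the loop cannot run past its step budget.
            state.done, state.stop_reason = True, "max_steps"
            break
        action = plan(state)
        if action is None:
            # The planner signals that the goal is satisfied.
            state.done, state.stop_reason = True, "completed"
            break
        if action not in allowed_actions:
            # Safety gate: unknown or disallowed actions end the run.
            state.done, state.stop_reason = True, f"blocked:{action}"
            break
        state.observations.append(execute(action))
        state.steps_taken += 1
    return state
```

The `stop_reason` field makes every exit path explicit, which is what lets traces and evals distinguish a finished run from one that hit a limit or a gate.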
Scroll System Replay
Follow one AI request through the production path.
A cinematic version of the work behind the case studies: input arrives, policy checks it, retrieval grounds it, agents/tools handle bounded work, and observability keeps the answer traceable.
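The request path described above can be traced in a few lines: a policy filter removes chunks the user may not see, retrieval scores what remains, and the answer is either grounded with citations or refused when evidence is weak. This is a toy sketch; the word-overlap score, the `min_score` threshold, and all type names are assumptions standing in for a real retriever and policy engine.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Doc:
    chunk_id: str
    text: str
    allowed_groups: frozenset[str]


@dataclass
class Trace:
    request_id: str
    events: list[tuple[str, object]] = field(default_factory=list)

    def record(self, stage: str, detail: object) -> None:
        self.events.append((stage, detail))


@dataclass(frozen=True)
class Answer:
    text: Optional[str]
    citations: list[str]
    refused: bool


def overlap_score(query: str, text: str) -> float:
    """Toy lexical score: fraction of query words found in the text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def handle_request(query: str, user_groups: set[str], corpus: list[Doc],
                   *, min_score: float = 0.5) -> tuple[Answer, Trace]:
    trace = Trace(request_id=str(uuid.uuid4()))
    # Policy first: only chunks the caller is allowed to see reach retrieval.
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    trace.record("policy", {"visible_chunks": len(visible)})
    scored = [(overlap_score(query, d.text), d) for d in visible]
    hits = sorted((p for p in scored if p[0] >= min_score),
                  key=lambda p: p[0], reverse=True)
    trace.record("retrieval", [d.chunk_id for _, d in hits])
    if not hits:
        trace.record("answer", "refused: weak evidence")
        return Answer(text=None, citations=[], refused=True), trace
    trace.record("answer", "grounded")
    return Answer(text=hits[0][1].text,
                  citations=[d.chunk_id for _, d in hits], refused=False), trace
```

Each stage writes to the same trace under one request ID, so a refused answer can be explained by looking at what the policy filter and the retriever actually returned.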
Full-Stack Foundation
Earlier product work that made the AI systems stronger.
This section sits lower on the page because the portfolio leads with production AI. The archive shows the full-stack foundation behind the AI work: web and mobile UX, auth, payments, maps, CRUD workflows, cloud deployment, and database-backed product experiences.
Designed and built sites for business owners who needed a more professional digital front door: clear service messaging, responsive layouts, calls to action, and deployment.
Built appointment, reservation, listing, contact, and management-style apps with authentication, database-backed CRUD, maps, payments, and admin-facing workflows.
Delivered mobile-first and hybrid app experiences using Android, Ionic, Angular, Firebase, maps, camera/profile flows, and real-time data patterns.
Past Software Work
Client software and earlier product builds from the full-stack foundation.
This bottom archive stays below the AI case studies so the top of the portfolio remains focused on Agentic AI and GenAI. These projects show past client platforms and supporting software work: marketplaces, course sales, digital products, mobile apps, cloud apps, and tooling.
Dental appointment marketplace connecting patients, verified dentists, and platform admins, with onboarding, booking conflict prevention, Stripe payments, reviews, notifications, search, and admin operations.
Online course-selling platform for AACCT, a table tennis agency, focused on packaging training content into a clear public course workflow.
Digital product-selling platform for packaging, presenting, and selling downloadable or online products through a clear customer-facing purchase workflow.
A manager-style workflow orchestration demo for document extraction, validation, formatting, and human-review patterns.
GitHub repo
Table-tennis booking marketplace where players search and reserve tables from facility owners, with provider approval, real-time availability, Stripe payments, chat, reviews, notifications, search, admin governance, and analytics.
Native Android notes app using Java, XML, Material Design, SQL persistence, content providers, loaders, and durable CRUD behavior.
GitHub repo
Coding-problem website built with HTML, CSS, and JavaScript, focused on making algorithmic problems more understandable through layout and animation.
GitHub repo
Node.js command-line tool that automates git initialization, remote repo setup, `.gitignore` creation, and first push through an interactive wizard.
GitHub repo
Hotel room reservation platform with Angular, Spring Boot, reactive MongoDB, REST APIs, and user-facing booking workflows.
GitHub repo
Technical Range
AI engineering plus backend, distributed systems, and cloud execution.
Beyond Engineering
The same mindset shows up outside code: discipline, systems thinking, and long-game execution.
Competitive table tennis has shaped how I practice: fast feedback loops, pattern recognition, pressure control, and thousands of tiny improvements that compound.
I’m interested in durable systems, incentives, ownership, and long-horizon decision-making. That lens shapes how I think about engineering tradeoffs and resilient product design.
Contact
Want to review the protected architecture pages or talk production AI?
Email is the best way to reach me for senior AI engineering roles, GenAI and agentic platform work, architecture interviews, or access to protected case-study details.
Send a note directly or copy the email address for your recruiting workflow.