Umesh Kedimi


Available for production AI work

I build production Agentic AI systems.

Senior Software Engineer · Agentic AI Engineer · AI Platform Engineer

Senior engineer with 9+ years building reliable backend systems and platform foundations. I take AI from demos and prototypes into scalable, observable, enterprise-ready platforms.

9+ years engineering · Primary focus: Agentic AI · Production, not prototypes
umesh@portfolio — zsh

umesh@portfolio ~ % whoami

Senior Software Engineer · Agentic AI Engineer

umesh@portfolio ~ % cat focus.txt

Production Agentic AI · Multi-Agent Systems · AI Platforms

umesh@portfolio ~ % ls ~/stack

python fastapi llms mcp rag postgres redis k8s

umesh@portfolio ~ % echo $MISSION

Take AI from demos to reliable, enterprise-ready systems.

umesh@portfolio ~ %
01 About

I take AI from prototypes to production.

Most AI never survives contact with reality. My work lives in the gap between an impressive demo and a system a company can actually trust.

I'm a senior software engineer with 9+ years of experience building reliable backend systems, distributed architectures, and platform foundations. Today my focus is production Agentic AI — the engineering that turns language models into dependable products.

I enjoy the hard problems at the intersection of AI, backend systems, cloud infrastructure, and software architecture: making agents observable, evaluable, secure, and cost-aware enough to run at enterprise scale.

My goal is to become a recognized subject-matter expert in Agentic AI and AI platform engineering — and to build systems that solve meaningful business problems, not just impress in a notebook.

Currently focused on

Production Agentic AI · Enterprise AI Architecture · Multi-Agent Systems · AI Platform Engineering · AI Infrastructure · LLM Orchestration · AI Evaluation · MCP · RAG · AI Security

02 What I Build

Systems, not scripts.

The work that decides whether AI ships: the platform, the orchestration, and the infrastructure underneath the model.

01

Agentic AI Platforms

Enterprise platforms where autonomous agents reason, plan, and act — with memory, guardrails, and human approval built in from the start.

02

Multi-Agent Systems

Orchestrated agent topologies that decompose complex tasks, coordinate tools, and recover from failure instead of falling over.

03

AI Infrastructure

The unglamorous layer that makes AI trustworthy: routing, observability, evaluation harnesses, cost controls, and safe rollouts.

04

RAG & Retrieval

Retrieval pipelines that stay accurate at scale — grounded answers, freshness, and evaluation rather than vibes.

05

MCP Tool Gateways

Model Context Protocol servers that expose tools and data to agents securely, with scoping, auditing, and rate control.

06

LLM Orchestration APIs

FastAPI services that turn model calls into reliable, versioned, observable products with clean contracts.

03 Skills

The toolkit.

The languages, frameworks, and infrastructure I reach for when building reliable AI systems.


Languages & APIs

Python · FastAPI · REST APIs · Pydantic · SQLModel

AI & Agents

Agentic AI · LLMs · AI Agents · Multi-Agent Systems · MCP · RAG · Prompt Engineering · AI System Design

Data & Infrastructure

PostgreSQL · Redis · Docker · Kubernetes

Systems & Architecture

Backend Architecture · Distributed Systems

04 AI Expertise

AI engineering is not calling an LLM API.

Real AI engineering is everything around the model call — the parts that decide whether a system can be trusted in production.

01

Reliable systems

Retries, idempotency, and graceful degradation so agents fail safe, not silently.

02

Observability

Tracing every agent step, tool call, and token — you cannot operate what you cannot see.

03

Evaluation

Offline and online evals that gate releases on quality, not gut feel.

04

Memory

Short- and long-term memory that is scoped, expirable, and auditable.

05

Guardrails

Input/output validation, policy enforcement, and containment of unsafe actions.

06

Human approval

Human-in-the-loop checkpoints for high-stakes or irreversible operations.

07

Tool orchestration

Safe, typed tool calling with scoping, timeouts, and clear contracts.

08

Secure AI platforms

Authentication and authorization, secrets management, tenant isolation, and prompt-injection defense.

09

Cost optimization

Model routing, caching, and budgeting that keep unit economics sane.

10

Production deployment

Versioning, canaries, and rollbacks for models and prompts alike.

11

Enterprise architecture

Systems that fit existing platforms, meet compliance requirements, and hold up at scale.
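
To make the reliability points above concrete, here is a minimal sketch of retries with exponential backoff, an idempotency key, and graceful degradation around a single tool call. It is illustrative only and not code from any system described on this page; `ToolRunner`, `ToolResult`, and every parameter name are hypothetical, and the in-memory dictionary stands in for whatever durable store a real platform would use.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class ToolResult:
    ok: bool
    value: Optional[str] = None
    degraded: bool = False  # True when the fallback was returned instead
    attempts: int = 0


@dataclass
class ToolRunner:
    """Hypothetical wrapper: retries, idempotency, graceful degradation."""
    max_attempts: int = 3
    base_delay: float = 0.1
    sleep: Callable[[float], None] = time.sleep  # injectable for tests
    seen: Dict[str, ToolResult] = field(default_factory=dict)

    def run(self, key: str, tool: Callable[[], str], fallback: str) -> ToolResult:
        # Idempotency: a key that already succeeded is never executed again.
        if key in self.seen:
            return self.seen[key]

        result = ToolResult(ok=False, value=fallback, degraded=True)
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = ToolResult(ok=True, value=tool(), attempts=attempt)
                break
            except Exception:
                # Fail safe, not silently: record the degraded outcome explicitly.
                result = ToolResult(
                    ok=False, value=fallback, degraded=True, attempts=attempt
                )
                if attempt < self.max_attempts:
                    # Exponential backoff: base, 2x base, 4x base, ...
                    self.sleep(self.base_delay * 2 ** (attempt - 1))

        # Only successes are stored, so a degraded call can be retried later.
        if result.ok:
            self.seen[key] = result
        return result
```

A caller would pass a stable key per logical action (for example an incident ID plus the action name), so a replayed workflow step returns the stored result instead of repeating a side effect.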

05 Featured Projects

Selected work.

Production-minded systems where AI, backend, and infrastructure meet.

AIC — AI Incident Commander

Building

Autonomous incident investigation, remediation & recovery

An Agentic AI platform that autonomously investigates production incidents by pulling context from Kubernetes, Prometheus, Grafana, Datadog, and CloudWatch, running root-cause analysis, and recommending remediation. It executes approved fixes behind human-in-the-loop guardrails, verifies recovery, and writes the post-incident record. Built on durable workflows with full traceability and LLMOps.

Python · FastAPI · LangGraph · Temporal · Kubernetes · pgvector
01

Multi-Agent Orchestration Engine

In design

Platform for coordinating fleets of autonomous agents

An AI platform — not an app — for registering, deploying, and orchestrating custom agents at scale. Handles workflow orchestration, agent lifecycle, shared state and memory, checkpointing, retries, human approvals, and event-driven tool execution across distributed workers, exposed through a typed API and SDK.

Python · LangGraph · Temporal · FastAPI · Redis Streams · PostgreSQL
02

MCP Tool Gateway

In design

Secure enterprise tool access for AI agents over MCP

An infrastructure gateway that exposes internal enterprise tools to AI agents via the Model Context Protocol — with tool discovery, OAuth2/JWT auth, policy-based authorization, per-tenant rate limiting, secrets management, versioning, and full audit logging. Designed as shared infrastructure across many AI applications and MCP servers.

Python · FastAPI · MCP · OAuth2 · PostgreSQL · Redis
03
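
As one illustration of the per-tenant rate limiting mentioned for the MCP Tool Gateway, here is a minimal token-bucket sketch. It is an assumption about how such a limiter could look, not the gateway's actual design: `TenantRateLimiter`, `Bucket`, and the default values are hypothetical, and the state lives in process memory, whereas a shared gateway would more plausibly keep it in a store such as Redis.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Bucket:
    tokens: float
    updated: float  # clock reading at the last refill


@dataclass
class TenantRateLimiter:
    """Hypothetical per-tenant token bucket; names and defaults are illustrative."""
    capacity: float = 5.0        # burst size allowed per tenant
    refill_per_sec: float = 1.0  # sustained request rate per tenant
    clock: Callable[[], float] = time.monotonic  # injectable for tests
    buckets: Dict[str, Bucket] = field(default_factory=dict)

    def allow(self, tenant_id: str, cost: float = 1.0) -> bool:
        now = self.clock()
        # Each tenant gets its own bucket, created full on first use.
        bucket = self.buckets.setdefault(tenant_id, Bucket(self.capacity, now))

        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - bucket.updated
        bucket.tokens = min(
            self.capacity, bucket.tokens + elapsed * self.refill_per_sec
        )
        bucket.updated = now

        if bucket.tokens >= cost:
            bucket.tokens -= cost
            return True
        return False
```

Because each tenant has a separate bucket, one noisy tenant exhausting its burst does not affect calls made on behalf of another.
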
06 Engineering Philosophy

Simplicity is the result of deep understanding.

Elegant software comes from simplicity, deep understanding, and thoughtful system design. I learn by building real products, not toy examples.

What I value

  • Clean architecture
  • Scalability
  • Reliability
  • Security
  • Performance
  • Maintainability
  • Developer experience
07 Experience

9+ years of building.

From backend systems and distributed architecture to production Agentic AI.

Senior Software Engineer / Agentic AI Engineer

Omnicom Media Group · Aug 2022 — Present

Building production Agentic AI and AI platform foundations — multi-agent systems, LLM orchestration, evaluation, observability, and enterprise AI infrastructure on top of Python, FastAPI, and Kubernetes.

Agentic AI · FastAPI · Kubernetes · LLMOps

Technical Lead

HCL · Aug 2021 — Aug 2022

Led backend engineering and system design — owning architecture decisions, mentoring engineers, and shipping reliable, scalable services.

Backend Architecture · System Design · Leadership

Software Engineer

Tata Consultancy Services (TCS) · Mar 2017 — Aug 2021

Built backend systems and APIs for enterprise clients — the foundation in distributed systems, data, and production software craftsmanship that the rest of my career is built on.

Python · Backend · Distributed Systems · APIs

08 Recommendations

What people I've built with say.

Real LinkedIn recommendations from managers, product leaders, and engineers I've shipped production systems with — grouped by the strengths they speak to.

Production backend reliability
A technically strong backend engineer — he's great at figuring out solutions to tough challenges and able to implement them.
Ghulam Habib, Lead Software Engineer, Annalect
An exceptional backend engineer with deep expertise in Python, Golang, APIs, and databases — he simplifies complex problems with a structured approach and delivers high-quality solutions that truly impact business outcomes.
Dinesh Kumar Chowdary Vunnam, Senior Test Lead, HCL
One of the most technically sharp and reliable professionals I've come across. His expertise in FastAPI, Python, and PostgreSQL is outstanding.
Sai Vasudev Gandam, Certified Scrum Product Owner
AI & ML depth
I've leaned on him countless times when exploring the feasibility of incorporating AI into our products. He'd do the research, find the solution, and come back with something actionable.
Hassan Sarker, Senior Product Manager, American Express
His expertise in Python, FastAPI, and backend architecture is complemented by his growing work in AI engineering — AI agents, RAG, and MCP-based systems — with a forward-looking approach to AI-powered platforms.
Ramakishore Nooji, AI-Driven Backend Engineer, Omnicom
Mentorship & collaboration
An exceptional teammate — I could always count on him to be a reliable backup, especially during urgent debugging sessions or unexpected emergencies.
Aman Upadhyay, Lead Software Engineer · GenAI, LLMs, RAG
He's always learning to keep up with what the project needs, and he's happy to help train others — great at working across teams to keep everything running smoothly.
Chaitra Thimmaiah, Lead, AI & Data Automation
A fantastic partner to the product team — he takes the time to understand the “why” behind our requests and proposes brilliant, scalable solutions that anticipate future needs.
Ronak Parikh, AdTech / MarTech Leader

09 Writing

Notes on building production AI.

Essays on the engineering behind reliable agentic systems. First posts landing soon.

Why AI engineering is not calling an LLM API

Reliability, observability, evaluation, and guardrails — the parts that decide whether an agent survives production.

Coming soon

Designing memory for multi-agent systems

Scoping, expiry, and auditability — making agent memory an asset instead of a liability.

Coming soon

An MCP gateway pattern for enterprise tools

Exposing internal tools to agents safely, with scoping, auditing, and rate control.

Coming soon

10 Contact

Let's build production AI.

Open to forward-deployed AI engineering, AI platform, and enterprise AI infrastructure work. If you're putting agents into production, let's talk.