Pavan Kumar Kandapagari — Foundation Models & Robotics

pavan@robotics:~$ whoami

// Team Lead — Foundation Models for Robotics @ Agile Robots SE, Munich

> I'm an AI engineer working on multimodal machine learning and robotics. My work focuses on vision-language and vision-action models that connect perception with decision making. I enjoy building complete ML systems — from large-scale model training to real-time inference infrastructure and robotics integration.

── EXPERIENCE ──

$ cat experience.log

● Team Lead, Foundation Models for Robotics @ Agile Robots SE
  Mar 2026 → Present

● Tech Lead, Foundation Models for Intelligent Agents @ Agile Robots SE
  Aug 2025 → Mar 2026

● Senior Deep Learning Engineer @ Agile Robots SE
  Dec 2023 → Sep 2025

── PROJECTS ──

$ ls -la projects/ --featured

~/projects/action-models-book

$ Action Models for Robot Learning

Open-access textbook on action models, vision-language-action systems, and modern robot learning — written and shipped as a static site with math, code highlighting, search, and AI-narrated audio.

#Book · #Robotics · #VLA · #Action Models · #Open Access · #Vision-Language-Action

~/projects/vlm-robotics [private]

$ Vision-Language Model for Robotics

Training a VLM for robot understanding — multi-GPU pipeline with Hugging Face models, designed for real-world robotic perception and instruction following.

#VLMs · #PyTorch · #Hugging Face · #Distributed Training · #Robotics

~/projects/vision-action-model [private]

$ Vision-Action Model for Robot Control

From perception to manipulation — a Vision-Language-Action model that converts visual understanding into precise robot actions for real-world manipulation tasks.

#VLA Models · #Imitation Learning · #PyTorch · #Robotics · #Simulation

~/projects/async-ml-inference [private]

$ Async ML Inference System

Real-time ML inference using asynchronous WebSocket architecture — streaming multimodal data (images, tensors, text) for low-latency robot control.

#WebSocket · #Async Python · #ML Inference · #Robotics · #Systems

~/projects/rust-ml-inference-engine [private]

$ Python-to-Rust ML Inference Engine

Rewrote the async ML inference server from Python to Rust for performance-critical robotics applications — with Python bindings for interoperability.

#Rust · #Async · #ML Inference · #PyO3 · #Systems Programming

~/projects/robotic-transformer-1

$ Robotic Transformer 1

Implementation of RT-1 architecture for robotic manipulation — transformer-based policy for real-world task execution from demonstrations.

#PyTorch · #Transformers · #Robotics · #Imitation Learning

~/projects/finetune-llms

$ Finetune LLMs

Scripts and pipelines for fine-tuning large language models with LoRA and QLoRA on custom datasets.

#LLMs · #LoRA · #Fine-tuning · #Transformers

── SKILLS ──

$ man skills

# Machine Learning & AI

Vision-Language Models (VLMs), Multimodal Learning, Foundation Models, Transformers, Model Training & Fine-Tuning, Model Evaluation & Benchmarking, GANs, Transfer Learning, Knowledge Distillation, Semi-Supervised Learning, RAG, Dataset Preparation & Preprocessing

# Robotics AI

Vision-Action Models, Imitation Learning, Robot Action Prediction, Vision-Based Manipulation, Simulation-Based Evaluation, Policy Inference Pipelines, Object Detection & Tracking, Semantic Segmentation

# Systems & Infrastructure

Async ML Inference, WebSocket Architectures, GPU Training Pipelines, Multi-GPU / Distributed Training, AWS SageMaker / HyperPod, Docker, Kubernetes, TensorRT, ONNX, CUDA, GitLab CI/CD, MLflow, Linux

# Programming

Python, Rust, C++, PyTorch, Hugging Face, LangChain, OpenCV, SQL, Bash

── PUBLICATIONS ──

$ cat refs.bib

@book 2026

Action Models for Robot Learning

P. Kandapagari

Open-access online textbook — action-models-book.vercel.app

@paper 2021

Tissue Segmentation in Histologic Images of Intracranial Aneurysm Wall

A. Niemann, A. Talagini, P. Kandapagari, B. Preim, S. Saalfeld

Interdisciplinary Neurosurgery, Vol. 26, 101307

// 4 citations

@patent 2024

Animal Physical Parameter Estimation by Image Processing

P. Kandapagari, et al.

European Patent Application No. 23207432.8

$ ./connect.sh

> Interested in collaborating on ML, robotics, or foundation models? Reach out.