$ cat ~/projects/rust-ml-inference-engine.md

Python-to-Rust ML Inference Engine

internal

#Rust #Async #ML Inference #PyO3 #Systems Programming

Rewrote the async ML inference server from Python to Rust for performance-critical robotics applications, with Python bindings for interoperability.

Converted the production async ML inference server from Python to Rust to meet the latency and throughput requirements of real-time robot control. The Rust implementation significantly improves on the Python version's performance and remains callable from Python through PyO3 bindings, so ML researchers can continue to develop models in Python.

// key_highlights

▸ Async Rust server using Tokio for high-concurrency inference serving
▸ PyO3 Python bindings for seamless interoperability with PyTorch models
▸ Significant latency reduction compared to the Python implementation
▸ Zero-copy data transfer for tensor and image data
▸ Production-grade error handling and connection management for robot systems
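
Since the source is not public, the following is only an illustrative sketch of the zero-copy hand-off idea from the highlights, not the actual implementation. To stay dependency-free it uses standard-library threads and channels in place of Tokio and leaves out PyO3 entirely. `InferenceRequest`, `run_model`, `spawn_worker`, and `infer` are hypothetical names, and the "model" is a stand-in that averages its input.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical request type: the tensor payload is shared, not owned.
struct InferenceRequest {
    tensor: Arc<[f32]>,
    reply: mpsc::Sender<f32>,
}

/// Stand-in for a model (assumption): returns the mean of the input.
fn run_model(input: &[f32]) -> f32 {
    if input.is_empty() {
        return 0.0;
    }
    input.iter().sum::<f32>() / input.len() as f32
}

/// Spawns a worker thread that serves requests until every sender is dropped.
fn spawn_worker() -> (mpsc::Sender<InferenceRequest>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel::<InferenceRequest>();
    let handle = thread::spawn(move || {
        for req in rx {
            // The worker reads the caller's buffer through the Arc.
            let out = run_model(&req.tensor);
            // The caller may have given up waiting; ignore a closed reply channel.
            let _ = req.reply.send(out);
        }
    });
    (tx, handle)
}

/// Sends a shared tensor to the worker and waits for the result.
/// Returns None if the worker has shut down.
fn infer(tx: &mpsc::Sender<InferenceRequest>, tensor: Arc<[f32]>) -> Option<f32> {
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(InferenceRequest { tensor, reply: reply_tx }).ok()?;
    reply_rx.recv().ok()
}

fn main() {
    // Building the Arc copies the data once; later clones only bump a refcount.
    let tensor: Arc<[f32]> = Arc::from(vec![1.0_f32, 2.0, 3.0, 4.0]);
    let (tx, handle) = spawn_worker();

    let shared = Arc::clone(&tensor);
    // Both handles point at the same allocation, so no tensor data was copied.
    assert!(Arc::ptr_eq(&tensor, &shared));

    println!("{:?}", infer(&tx, shared)); // prints Some(2.5)

    drop(tx); // closing the channel lets the worker loop end
    handle.join().unwrap();
}
```

Passing an `Arc<[f32]>` through the channel moves only a pointer and a reference count across the thread boundary. The same ownership pattern carries over to async tasks, although a real server would also need to avoid copies at the Python and network boundaries.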

This is proprietary work from my role at Agile Robots SE. Source code is not publicly available, but the write-up above describes the architecture and technical approach.