$ cat ~/projects/rust-ml-inference-engine.md

Python-to-Rust ML Inference Engine

internal
#Rust #Async #ML Inference #PyO3 #Systems Programming

Rewrote an async ML inference server from Python to Rust for latency-critical robotics applications, with PyO3 bindings to keep it usable from Python.

Ported the production async ML inference server from Python to Rust to meet the latency and throughput requirements of real-time robot control. The Rust implementation is substantially faster than the Python original, and its PyO3 bindings keep it callable from Python, so ML researchers can continue to develop models in Python.

// key_highlights

  • Async Rust server using Tokio for high-concurrency inference serving
  • PyO3 bindings for interoperability with PyTorch models developed in Python
  • Significant latency reduction compared to the Python implementation
  • Zero-copy data transfer for tensor and image data
  • Production-grade error handling and connection management for robot systems
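
Since the source is proprietary, here is a minimal standard-library sketch of what zero-copy access to tensor data can look like in Rust. It is an illustration under stated assumptions, not the production code: `TensorView`, `ShapeError`, and `argmax` are hypothetical names, and the layout is assumed to be row-major `f32`. The view borrows an existing buffer and hands out rows as subslices, so reading a row never allocates or copies.

```rust
/// Borrowed, row-major 2-D view over an existing f32 buffer.
/// Hypothetical type for illustration; not the project's actual API.
#[derive(Debug, Clone, Copy)]
pub struct TensorView<'a> {
    data: &'a [f32],
    rows: usize,
    cols: usize,
}

/// Reasons a buffer cannot be viewed with the requested shape.
#[derive(Debug, PartialEq, Eq)]
pub enum ShapeError {
    /// rows * cols does not match the buffer length.
    Mismatch { expected: usize, actual: usize },
    /// rows * cols overflowed usize.
    Overflow,
}

impl<'a> TensorView<'a> {
    /// Wraps `data` without copying, after checking that the shape fits.
    pub fn new(data: &'a [f32], rows: usize, cols: usize) -> Result<Self, ShapeError> {
        let expected = rows.checked_mul(cols).ok_or(ShapeError::Overflow)?;
        if expected != data.len() {
            return Err(ShapeError::Mismatch { expected, actual: data.len() });
        }
        Ok(Self { data, rows, cols })
    }

    /// Returns row `i` as a subslice of the original buffer (no allocation),
    /// or `None` if `i` is out of range.
    pub fn row(&self, i: usize) -> Option<&'a [f32]> {
        if i >= self.rows {
            return None;
        }
        let start = i * self.cols;
        Some(&self.data[start..start + self.cols])
    }
}

/// Index of the largest value in `values`, or `None` if it is empty.
pub fn argmax(values: &[f32]) -> Option<usize> {
    values
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}
```

A caller would build the view with `TensorView::new(&buffer, rows, cols)` and pass `view.row(i)` straight to `argmax`. Because the returned slice carries the lifetime of the original buffer, the borrow checker guarantees the buffer outlives every view into it, which is the property that makes skipping the copy safe.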

This is proprietary work from my role at Agile Robots SE. Source code is not publicly available, but the write-up above describes the architecture and technical approach.