Syntheva Robotics Blog

AI inference

Everything related to AI inference on the edge

    Affordable robotic systems often have access to inexpensive CPU resources. What they often lack is the budget for a large, expensive, power-hungry GPU. That motivated us to build SARA (Sharded Activation Reduction Architecture): a distributed inference path that uses multiple CPUs to reduce single-stream token latency.