Tinker runs on remote GPUs. Every API call involves network latency plus GPU compute time. If you send sampling requests one at a time -- send, wait, send, wait -- you spend most of your time idle ...
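The cost of the send-wait-send-wait pattern can be seen in a small simulation. The sketch below does not call Tinker: `fake_sample` is a hypothetical stand-in that only sleeps for an assumed latency (`LATENCY_S`, a made-up value), so the timings illustrate the pattern rather than real API behavior.

```python
import asyncio
import time

LATENCY_S = 0.05  # assumed per-request latency (network + compute); made up


async def fake_sample(prompt: str) -> str:
    """Hypothetical stand-in for a remote sampling call; not the Tinker API."""
    await asyncio.sleep(LATENCY_S)
    return f"completion for {prompt!r}"


async def run_sequential(prompts):
    # Send, wait, send, wait: each request starts only after the previous one ends.
    return [await fake_sample(p) for p in prompts]


async def run_concurrent(prompts):
    # Submit every request first, then wait for all of them together.
    return await asyncio.gather(*(fake_sample(p) for p in prompts))


def timed(coro_fn, prompts):
    """Run one of the strategies above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(prompts))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    prompts = [f"prompt {i}" for i in range(8)]
    _, t_seq = timed(run_sequential, prompts)
    _, t_con = timed(run_concurrent, prompts)
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

In this simulation the sequential time grows with the number of prompts, while the concurrent time stays close to a single request's latency, because the waits overlap.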
This tutorial demonstrates how to build batch-processing RPC applications with the @rpc.functions.async_execution decorator, which helps to speed up training by reducing the number of blocked RPC ...
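The core idea behind batching requests this way is that a handler returns a future immediately instead of holding a thread until the result is ready, and the request that completes the batch does the work once for everyone. The sketch below illustrates that pattern with the standard library only; it does not use `torch.distributed.rpc`, and `BatchAdder`, `submit`, and the summing "work" are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import Future


class BatchAdder:
    """Hypothetical batcher: answers all requests once the batch is full."""

    def __init__(self, batch_size: int):
        self.batch_size = batch_size
        self._lock = threading.Lock()
        self._pending = []  # list of (value, Future) pairs awaiting a full batch

    def submit(self, value: int) -> Future:
        # Hand back a Future right away so the calling thread is not blocked.
        fut = Future()
        with self._lock:
            self._pending.append((value, fut))
            if len(self._pending) < self.batch_size:
                return fut  # batch not full yet; result arrives later
            # Batch is full: take ownership of it and reset the queue.
            batch, self._pending = self._pending, []
        # The request that filled the batch does the work once for all callers.
        total = sum(v for v, _ in batch)
        for _, f in batch:
            f.set_result(total)
        return fut


if __name__ == "__main__":
    adder = BatchAdder(batch_size=3)
    futures = [adder.submit(v) for v in (1, 2, 3)]
    print([f.result() for f in futures])
```

Early callers receive an unfinished future and are free to do other work; only the final submission in each batch performs the computation, which is what keeps threads from sitting blocked while a batch fills.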