PILA Trains on Screen Input to Play PolyTrack

Developer tryfonaskam has released PILA (PolyTrack Imitation Learning AI), an open-source agent that learns to play the browser racing game PolyTrack from screen captures and recorded human keyboard inputs. Implemented in PyTorch (Python 3.11) and released under the Apache 2.0 license on GitHub, the pipeline records player controls alongside the corresponding game frames, trains a neural network on those state-action pairs with supervised learning, and runs real-time inference to issue keyboard commands from live frames. According to a Hackaday report dated June 28, 2026, the project uses a single-frame architecture as a deliberate starting point for screen-based imitation learning, an approach that avoids simulator state instrumentation, reward engineering, and environment wrappers. The article notes that behavioral cloning from pixels typically needs temporal context to resolve momentary visual ambiguity, and that generalization degrades when held-out tracks differ visually from the training data.
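To make the "state-action pairs" idea concrete, here is a minimal sketch of how recorded key presses might be turned into one label per captured frame. It is not taken from PILA's code: the `KEYS` control set, the `KeyEvent` record, the `label_frames` helper, and the sample timestamps are all hypothetical, chosen only to illustrate the pairing step that a supervised imitation-learning pipeline needs before training.

```python
from dataclasses import dataclass

# Assumed control set for illustration; PILA's actual key bindings may differ.
KEYS = ("w", "a", "s", "d")


@dataclass(frozen=True)
class KeyEvent:
    t: float     # timestamp in seconds
    key: str     # one of KEYS
    down: bool   # True on press, False on release


def label_frames(frame_times, events):
    """Return one multi-hot label per frame: which keys were held at capture time."""
    events = sorted(events, key=lambda e: e.t)
    held = set()
    labels = []
    i = 0
    for ft in sorted(frame_times):
        # Apply every key event that happened at or before this frame.
        while i < len(events) and events[i].t <= ft:
            e = events[i]
            if e.down:
                held.add(e.key)
            else:
                held.discard(e.key)
            i += 1
        labels.append(tuple(int(k in held) for k in KEYS))
    return labels


# Hypothetical recording: four frames at 20 fps and three key events.
frames = [0.00, 0.05, 0.10, 0.15]
events = [
    KeyEvent(0.01, "w", True),
    KeyEvent(0.06, "a", True),
    KeyEvent(0.12, "a", False),
]
labels = label_frames(frames, events)
```

Each label is a multi-hot vector over the assumed keys, so a frame captured while both accelerate and steer-left are held maps to `(1, 1, 0, 0)`. A classifier trained on such pairs sees only the frame and predicts the vector, which is the single-frame setup the paragraph describes.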