Parameter Golf
Parameter Golf is the OpenAI challenge used in this writeup to demonstrate agentic research with elastic compute.
Task framing
The challenge asks participants to fit a language model into an artifact of at most 16 MB that runs on 8×H100 GPUs in under 10 minutes, while minimizing bits-per-byte (lower is better).
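For readers unfamiliar with the metric, the sketch below shows the conventional way bits-per-byte is derived from a model's summed negative log-likelihood. It assumes the usual definition (total loss in bits divided by the number of UTF-8 bytes in the evaluated text); the challenge's exact evaluation procedure is not described in this writeup, and the function name `bits_per_byte` is hypothetical.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte.

    Assumption: the loss is summed over every token of the evaluated text,
    and total_bytes counts the UTF-8 bytes of that same text.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by the byte count
    # normalizes so that models with different tokenizers are comparable.
    return total_nll_nats / (math.log(2) * total_bytes)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the challenge.
    print(bits_per_byte(total_nll_nats=3000.0, total_bytes=4200))
```

Normalizing by bytes rather than tokens is what makes the score independent of the tokenizer: a model with a coarser vocabulary emits fewer tokens but pays a higher loss per token.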
Why it is a good demo
The benchmark stresses multiple phases of work:
- pipeline smoke tests
- broad hyperparameter sweeps
- full-scale validation
- post-training bottleneck debugging
- final optimization rounds
Because these phases differ in how much compute they need, the challenge is a strong example for Modal and for the broader idea of agentic-research-autoscaling.
Reported outcome
The showcased agent ran 113 experiments across 238 GPU-hours, with the core training finishing about 5× faster than it would on a single workstation.
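As a rough sanity check on those figures, the sketch below computes the mean GPU time per experiment. It is only an average: the writeup does not break usage down by phase, so individual experiments may have been much smaller or larger. The function name is hypothetical.

```python
def mean_gpu_hours_per_experiment(gpu_hours: float, experiments: int) -> float:
    """Average GPU-hours consumed per experiment."""
    if experiments <= 0:
        raise ValueError("experiments must be positive")
    return gpu_hours / experiments


if __name__ == "__main__":
    # Figures taken from the reported outcome above.
    avg = mean_gpu_hours_per_experiment(gpu_hours=238, experiments=113)
    print(f"{avg:.2f} GPU-hours per experiment on average")
```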