Agentic Research Autoscaling

Agentic Research Autoscaling is the idea that AI research agents should be able to adjust both the amount and the mode of compute they use as the work itself changes.

Thesis

The source argues that research is neither steady-state nor single-mode. A productive agent may need:

  • cheap single-GPU exploration
  • highly parallel sweeps
  • multi-GPU validation jobs
  • interactive debugging environments
  • rapid scale-to-zero after completion
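The modes above can be sketched as a mapping from work phase to resource shape. This is an illustrative model, not the source's implementation: the mode names, job counts, and `ResourceRequest` type are assumptions, with the 8-GPU-by-5-job validation shape taken from the timeline later in this writeup.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Mode(Enum):
    """Hypothetical compute modes mirroring the list above."""
    EXPLORE = auto()   # cheap single-GPU exploration
    SWEEP = auto()     # highly parallel sweeps
    VALIDATE = auto()  # multi-GPU validation jobs
    DEBUG = auto()     # interactive debugging environment
    IDLE = auto()      # scale-to-zero after completion

@dataclass(frozen=True)
class ResourceRequest:
    gpus_per_job: int
    parallel_jobs: int

# Illustrative shapes; real values depend on the experiment.
REQUESTS = {
    Mode.EXPLORE:  ResourceRequest(gpus_per_job=1, parallel_jobs=1),
    Mode.SWEEP:    ResourceRequest(gpus_per_job=1, parallel_jobs=32),
    Mode.VALIDATE: ResourceRequest(gpus_per_job=8, parallel_jobs=5),
    Mode.DEBUG:    ResourceRequest(gpus_per_job=1, parallel_jobs=1),
    Mode.IDLE:     ResourceRequest(gpus_per_job=0, parallel_jobs=0),
}

def total_gpus(mode: Mode) -> int:
    """Total GPUs the infrastructure must provide for a mode."""
    req = REQUESTS[mode]
    return req.gpus_per_job * req.parallel_jobs
```

The point of the mapping is that the same agent legitimately requests 0, 1, or 40 GPUs at different moments, which no single fixed allocation satisfies.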

Why fixed infrastructure is a poor fit

A workstation is cheap but serial; a reserved cluster is powerful but wasteful during idle or low-parallelism phases. The source argues this trade-off is false once the infrastructure can adapt dynamically to each phase.

Parameter Golf timeline in this source

  1. Smoke-test the pipeline on small runs
  2. Fan out single-GPU exploration
  3. Scale to 5 parallel 8×H100 validation runs
  4. Scale back down to debug GPTQ bottlenecks
  5. Scale back up for final optimization
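The timeline above can be replayed as a sequence of resource shapes, showing the GPU footprint the infrastructure must track over time. The phase names and the validation shape (5 jobs x 8 GPUs) come from the timeline; the smoke-test, exploration, and debug shapes are assumptions for illustration.

```python
# Each tuple: (phase, gpus per job, concurrent jobs).
# Validation shape is from the timeline; other widths are hypothetical.
TIMELINE = [
    ("smoke-test", 1, 1),   # step 1: small pipeline runs
    ("explore",    1, 16),  # step 2: single-GPU fan-out (width assumed)
    ("validate",   8, 5),   # step 3: 5 parallel 8xH100 runs
    ("debug-gptq", 1, 1),   # step 4: scale down to debug
    ("final-opt",  8, 5),   # step 5: scale back up
    ("done",       0, 0),   # scale to zero after completion
]

def footprint(timeline):
    """GPU count each phase demands from the infrastructure."""
    return [(name, gpus * jobs) for name, gpus, jobs in timeline]
```

Reading off the footprint, the demand swings from 1 to 40 GPUs and back to 0 within one project, which is exactly the curve a fixed allocation cannot follow.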

Main takeaway

The infrastructure layer should follow the experiment graph. In this writeup, modal is the mechanism, autoresearch is the workload pattern, and parameter-golf is the benchmark proving ground.

Relation to AutoLab

autolab frames the same general phenomenon from the opposite angle. Instead of focusing on infrastructure that lets research agents scale up and down, it focuses on benchmark design: measuring whether agents can stay productive inside those empirical loops at all.