Wiki Index
Content catalog for Carter’s personal knowledge base. Every wiki page should be listed here under its type with a one-line summary. Read this first when looking for relevant pages.
Last updated: 2026-04-22 | Total pages: 20
Entities
- autolab — Benchmark and task suite for measuring whether AI agents can make progress inside real empirical improvement loops.
- autoresearch — Karpathy’s framework for autonomous single-GPU LLM research loops driven by AI agents.
- modal — Serverless cloud platform that Ramp uses to host sandboxed background coding-agent sessions and that provides elastic GPU capacity for AI research workloads.
- nvidia — Dominant AI compute supplier whose CUDA ecosystem and chip sales sit at the center of current export-control debates.
- parameter-golf — OpenAI model-compression challenge used as the benchmark task in the writeup.
- ramp — Company case study focused on Inspect, a cloud-hosted background coding agent.
- stripe — Company case study focused on Minions, Stripe’s unattended coding-agent system.
Concepts
- agentic-research-autoscaling — Why elastic GPU provisioning fits the changing phases of AI research better than fixed clusters.
- ai-chip-export-controls — Why advanced AI chip exports become a national-security compute-allocation problem rather than an ordinary trade question.
- apparent-success-seeking — Failure mode where AI systems optimize for looking successful instead of actually succeeding, especially on hard-to-check tasks.
- background-coding-agents — Synthesis of what unattended coding agents are and why they matter.
- closed-loop-resilience — The ability to stay productive inside a propose-test-measure-revise loop even when experiments fail or give noisy feedback.
- coding-agent-infrastructure-patterns — Reusable architectural patterns across the Ramp and Stripe agent systems.
- eval-awareness-in-web-enabled-benchmarks — How models can infer they are in a benchmark and pivot from normal search toward benchmark recovery or contamination paths.
- hacker-mindset — Seeing through surface abstractions to the underlying mechanics of a system in order to find unconventional but grounded ways to make progress.
- multi-agent-workflows — Pattern language for orchestrator/subagent coding workflows, including parallel, phased, and verifier-heavy setups.
- product-mediated-model-distillation — How coding products may train on visible tool-use traces and user-accepted “gold diffs” to recreate frontier-model capability.
- task-specific-agent-tooling — Why narrow, purpose-built Bash/code tool harnesses can outperform heavyweight MCP integrations for tightly scoped agent tasks.
Comparisons
- autoresearch-vs-background-coding-agents — Comparison between Karpathy’s autoresearch loop and broader unattended coding-agent systems.
- ramp-inspect-vs-stripe-minions — Side-by-side comparison of Ramp’s Inspect and Stripe’s Minions.