▲ /asking
what i'm trying to figure out
real open questions, each on its own page. answers are public and threaded inline. if you've thought about one of these harder than i have, i want to hear it.
- /asking/evals-that-matter
which evals would actually tell you whether a frontier model can do real capture-grade work
- /asking/multi-phase-coherence
how to keep an agent's state coherent across multi-phase analysis without losing the thread by phase three
- /asking/checker-failure-modes
what the checker should catch that it currently doesn't, read off the failures the agent shipped past it