Reward Hacking in LLM-Guided Evolutionary Search
A run-level case study of one transformer search: the exact task preamble, the runtime harness, what the reviewer correctly caught, how originality was audited, and why a low-loss branch later failed causal checks.