CliffSearch: Structured Agentic Evolution for Scientific Algorithm Discovery

CliffSearch evolves structured research artifacts in either theory+code or code+design-intent mode, with an in-loop runtime harness, benchmark fitness, reviewer gates (correctness/originality), and role-specialized LLM operators for crossover, exploration mutation, and correction mutation.

Population-based · Benchmark-grounded · Runtime-harness-gated · Reviewer-gated · Multi-task

Questions/feedback: mroueh@us.ibm.com

See paper on arXiv | Code on GitHub (coming soon)

How It Works

In theory+code mode, CliffSearch co-evolves theory_content, code_content, and summary_md together. On this website, we refer to the config mode code_only as code+design-intent: formal theory_content is omitted, but summary_md still carries the design and ideation principles behind the artifact.
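The artifact layout described above can be sketched as a small record type. This is only an illustrative sketch, not the project's actual schema: the field names `theory_content`, `code_content`, and `summary_md` and the mode name `code_only` come from the text, while the class name `Artifact`, the helper `check_artifact`, and the mode label `theory_code` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Artifact:
    """One evolving research artifact (hypothetical container, assumed layout)."""
    code_content: str                      # executable implementation
    summary_md: str                        # design intent and ideation notes
    theory_content: Optional[str] = None   # formal theory; omitted in code_only mode


def check_artifact(artifact: Artifact, mode: str) -> List[str]:
    """Return a list of problems for the given mode (empty list means valid).

    "code_only" is the config mode named in the text; "theory_code" is an
    assumed label for theory+code mode.
    """
    problems: List[str] = []
    if mode not in ("code_only", "theory_code"):
        return [f"unknown mode: {mode}"]
    if not artifact.code_content:
        problems.append("code_content is empty")
    if not artifact.summary_md:
        problems.append("summary_md is empty")
    if mode == "theory_code" and not artifact.theory_content:
        problems.append("theory_content is required")
    if mode == "code_only" and artifact.theory_content is not None:
        problems.append("theory_content must be omitted")
    return problems
```

Keeping `summary_md` mandatory in both modes mirrors the point made above: even without formal theory, the artifact still carries its design rationale.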

Crossover: Combine successful artifacts
Exploration Mutation: Import ideas from adjacent domains
Correction Mutation: Evidence-guided repair
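How the three operators might fit into one generation step can be sketched as follows. This is a minimal sketch under stated assumptions, not CliffSearch's actual scheduler: the `Node` class, the `next_generation` function, and the routing rule (gate failures go to repair, the top-scoring survivors feed crossover and exploration) are all hypothetical, and the operators are passed in as plain callables standing in for the LLM agents.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical population member with gate verdicts attached."""
    code: str
    fitness: float = 0.0     # benchmark score
    runs: bool = True        # runtime harness verdict
    correct: bool = True     # reviewer correctness gate
    original: bool = True    # reviewer originality gate


def passes_gates(node: Node) -> bool:
    # A node survives only if the harness and both reviewer gates accept it.
    return node.runs and node.correct and node.original


def next_generation(
    population: List[Node],
    crossover: Callable[[Node, Node], Node],
    explore: Callable[[Node], Node],
    repair: Callable[[Node], Node],
) -> List[Node]:
    """Produce children from one population (assumed routing policy)."""
    survivors = sorted(
        (n for n in population if passes_gates(n)),
        key=lambda n: n.fitness,
        reverse=True,
    )
    failures = [n for n in population if not passes_gates(n)]

    children: List[Node] = []
    if len(survivors) >= 2:
        # Crossover: combine the two best-scoring survivors.
        children.append(crossover(survivors[0], survivors[1]))
    if survivors:
        # Exploration mutation: perturb the best survivor with outside ideas.
        children.append(explore(survivors[0]))
    # Correction mutation: attempt a repair for every gated-out node.
    children.extend(repair(n) for n in failures)
    return children
```

In a real run each callable would wrap a role-specialized LLM call, and the children would be re-evaluated by the harness, benchmark, and reviewers before joining the population.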

Abstract

Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised. Recent LLM-guided search systems accelerate proposal generation, but often optimize code alone with weak scientific gating. We present CliffSearch, an agentic evolutionary framework in which LLM agents perform pairing, crossover, mutation, and review over structured scientific artifacts. CliffSearch is built around three ideas: (i) nodes can carry executable code together with design intent and, when available, explicit mathematical grounding; (ii) a task-specific runtime harness performs an in-loop runtime audit alongside the benchmark score and reviewer judgments of correctness and originality; and (iii) mutation is split into exploration and repair pathways. We evaluate CliffSearch on transformer hyper-connection discovery and optimizer discovery on a fixed nanoGPT stack. Across these settings, the same loop surfaces non-trivial geometric hyper-connection families, optimizer variants, and reviewer-guided repair trajectories under controlled evaluation. Full run artifacts, interactive visualizations, and exported best nodes for the reported studies are available at cliffsearch.ai.

What Is On This Site

Framework

Core loop, runtime harness, agent roles, runtime orchestration, and island migration mechanism.

Agent Prompts

Representative prompt examples from the short_json and workflow_v2 prompt bundles.

Tasks & Best Nodes

Interactive selector by task/provider/prompt bundle and full best-node code/theory artifacts.

Citation

@article{cliffsearch2026preprint,
  title   = {CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery},
  author  = {Youssef Mroueh and Carlos Fonseca and Brian Belgodere and David Cox},
  journal = {arXiv preprint arXiv:2604.01210},
  year    = {2026},
  url     = {https://arxiv.org/abs/2604.01210}
}