Method Comparison

A head-to-head benchmark of MALMAS, CAAFE, LLM-FE, and Malmus on the same dataset, with metrics reported side by side.

Run Locally

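# Provide the LLM API key to the notebook via the environment
export FF_LLM__API_KEY=sk-...
# Render the comparison notebook with Quarto inside the uv-managed environment
uv run quarto render notebooks/10_method_comparison.qmd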
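
If the key is missing, the render may fail partway through the notebook. A minimal sketch of a pre-flight check follows. The helper names `require_api_key` and `render_comparison` are hypothetical and not part of the project; only the environment variable and the render command come from the steps above.

```shell
# Hypothetical helper: fail early when the API key is not exported.
require_api_key() {
  if [ -z "${FF_LLM__API_KEY:-}" ]; then
    echo "FF_LLM__API_KEY is not set; export it before rendering." >&2
    return 1
  fi
}

# Hypothetical wrapper: render the comparison notebook only if the key is present.
render_comparison() {
  require_api_key || return 1
  uv run quarto render notebooks/10_method_comparison.qmd
}
```

After sourcing these definitions, calling `render_comparison` runs the same render command as above, but stops with a message when the key is unset or empty.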