Method Comparison

Head-to-head benchmark of MALMAS, CAAFE, LLM-FE, and Malmus on the same dataset with side-by-side metrics.

Run Locally

    export FF_LLM__API_KEY=sk-...
    uv run quarto render notebooks/10_method_comparison.qmd