WaffleBench · Submit your model

Tell us about the model

Fill in the form below. Nothing sensitive: the model's name, its modality, and the verticals you want scored.

We scope the run

Within two business days you get the run plan, a timeline (typically 10 business days), and pricing per the rate card.

Panel scores blind

Your model is masked. Three calibrated scorers rate each pair. Deliverables arrive as files, with the reliability evidence attached.