The broken bits are half the point.
When a model misses, it often misses in a way that teaches you something. This page collects the funniest failures from each showdown.
Snake Game
DeepSeek V3.2
A one-shot build that missed the brief, broke the experience, or felt less usable than the rest.
Snake Game
Llama 4 Maverick
A one-shot build that missed the brief, broke the experience, or felt less usable than the rest.
Fake Startup Landing Page
Llama 4 Maverick
A one-shot build that missed the brief, broke the experience, or felt less usable than the rest.
Pelican On A Bicycle
Llama 4 Maverick
A compact failure on a prompt where the mistake is easy to inspect at a glance.
Pelican On A Bicycle
DeepSeek V3.2
A compact failure on a prompt where the mistake is easy to inspect at a glance.
Count The R's
DeepSeek V3.2
A compact failure on a prompt where the mistake is easy to inspect at a glance.
Count The R's
Llama 4 Maverick
A compact failure on a prompt where the mistake is easy to inspect at a glance.