The GPT-4 barrier has finally been broken
Four weeks ago, GPT-4 remained the undisputed champion: consistently at the top of every key benchmark, but more importantly the clear winner in terms of “vibes”. Almost everyone investing serious …
Hasnain says:
Excited to try this out. Some of the new fuzzing results people posted today are mind blowing.
“Claude 3 Opus, March 4th. This is just a few days old and wow: the vibes on this one are really strong. People I know who evaluate LLMs closely are rating it as the first clear GPT-4 beater. I’ve switched to it as my default model for a bunch of things, most conclusively for code—I’ve had several experiences recently where a complex GPT-4 prompt that produced broken JavaScript gave me a perfect working answer when run through Opus instead (recent example). I also enjoyed Anthropic research engineer Amanda Askell’s detailed breakdown of their system prompt”
Posted on 2024-03-09T07:39:55+0000