Anthropic announced its latest generative AI model, Claude 3.5 Sonnet, in a blog post today, and claimed it was “best in class.”
On paper, the model outperforms Claude 3 Sonnet and Claude 3 Opus, Anthropic’s previous flagship model, across benchmarks for reading, math, vision, and coding. We can’t always rely on benchmarks to measure the performance of AI models, though, as they often test for edge cases that don’t reflect how the average person uses the model.
The new LLM, which can analyze both images and text and generate text, is apparently twice as fast as 3 Opus. That is good news for devs building customer service chatbots and other apps where speedy responses are key.
It’s also better than 3 Opus at analyzing images. According to Anthropic, it can more accurately interpret data from graphs, charts, and infographics, even transcribing text in images with visual artifacts and distortions.
Anthropic claims that 3.5 Sonnet solved 64% of problems in an internal agentic coding evaluation, compared to Claude 3 Opus’ score of 38%, a gain of 26 percentage points. The company says the model is “raising the industry bar for intelligence.”
AI has never been great at understanding humor, but Anthropic promises that its latest model improves here too and that it better understands nuanced and complex instructions.
Anthropic’s Models Aren’t Trained on Customer Data
These latest improvements come from new training data and architectural tweaks. Anthropic’s product lead, Michael Gerstenhaber, would not say exactly what training data was used for the latest model, but the company has not used any customer or user-submitted data to train its models so far. Its policy is never to train its generative models on user-submitted data without explicit permission.
Alongside 3.5 Sonnet, Anthropic is releasing Artifacts. When users ask Sonnet to create content such as text or snippets of code, a window appears next to the chat showing the created Artifacts. Users can view and edit the content there, which makes working with the AI more collaborative. Artifacts is currently in preview, and the company has promised more features in the future, including ways to work with larger teams.
Claude 3.5 Sonnet is available now for free on Claude.ai and the Claude iOS app. The first release in the Claude 3.5 model family will be followed later this year by 3.5 Haiku and 3.5 Opus.
This is more of an incremental step than a huge leap forward for Anthropic, but it is still an important one that keeps the company competing with OpenAI’s ChatGPT, Google Gemini, and other rivals in the generative AI space.