Anthropic introduces Claude 3.5 Sonnet, matching GPT-4o on benchmarks

Looking ahead, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later in 2024, completing the 3.5 model family. The company is also exploring new features and integrations with enterprise applications for future updates to the Claude AI platform.

The trouble with LLM naming

When we first heard about Claude 3.5 Sonnet, we were a little confused, because “Sonnet” was already released in March—or so we thought. But it turns out it’s the number “3.5” that is the most important part of Anthropic’s new branding here.

Anthropic’s naming scheme is slightly confusing, inverting the expectation that the version number might be at the end of a software brand name, like “Windows 11.” In this case, “Claude” is the brand name, “3.5” is the version number, and “Sonnet” is a custom modifier. Introduced with Claude 3 in March, Anthropic’s “Haiku,” “Sonnet,” and “Opus” appear to be synonyms for “small,” “medium,” and “large,” much in the same way Starbucks uses “Tall,” “Grande,” and “Venti” for its branded coffee cup sizes.

Large language models are still relatively new, and the companies that provide them have been experimenting with naming and branding as they go along. The industry has not yet settled on a format that lets users who know one company’s naming scheme quickly judge relative capabilities across other brands.

With a string of major releases like GPT-3, GPT-3.5, GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, and GPT-4o (although each one has had sub-versions), OpenAI has arguably been the most logically consistent in naming its AI models so far. Google has its own muddled naming issues with Gemini Nano and Gemini Pro, then Gemini Ultra 1.0, and most recently Gemini Pro 1.5. Meta uses names like Llama 3 8B and Llama 3 70B, with a brand name, version number, then a size number in parameters. Mistral uses parameter size names similar to Meta but with an array of model names that include Mistral (the company’s name), Mixtral, and Codestral.

If it all sounds confusing, that’s because it is—and the generative AI industry is so new that no one really knows what they’re doing yet. Presuming that useful mainstream applications of LLMs eventually emerge, we may begin hearing more about those apps and less about the strangely named models under the hood.