Elon Musk’s AI venture, xAI, has released an early preview of the Grok-2 model, and it has surprisingly outperformed Claude, Gemini, and even ChatGPT. The earlier Grok-1.5 model was not well received, but Grok-2 has delivered great performance on the LMSYS leaderboard. xAI has released two new models: Grok-2 and a smaller Grok-2 mini.
xAI says Grok-2 has been significantly improved in key areas, including reasoning, instruction following, and providing accurate, factual information. In traditional AI benchmarks, Grok-2 scored a whopping 87.5% on MMLU and 88.4% on HumanEval. This is particularly interesting because the MMLU score was obtained using 0-shot CoT (chain-of-thought) prompting.
Grok-2 was tested on LMSYS under the name “sus-column-r”. With around 12,000 votes, it stands in third position, just below ChatGPT-4o-latest, Gemini-1.5-Pro-Experimental, and GPT-4o-2024-05-13. However, it performs better than GPT-4o-mini, Claude 3.5 Sonnet, Gemini 1.5 Pro, and Llama 3.1 405B.
In coding and math-related tasks, Grok-2 takes the 2nd spot, and on hard prompts, it takes the 4th position. xAI says that the Grok-2 multimodal model will be released soon. The company has not revealed the parameter count for either model. You can start using the new Grok-2 model on x.com, and developers can get started with the API as well.