{"id":3697,"date":"2025-07-15T22:17:27","date_gmt":"2025-07-15T22:17:27","guid":{"rendered":"https:\/\/musictechohio.online\/site\/grok-4-ai-leaderboard\/"},"modified":"2025-07-15T22:17:27","modified_gmt":"2025-07-15T22:17:27","slug":"grok-4-ai-leaderboard","status":"publish","type":"post","link":"https:\/\/musictechohio.online\/site\/grok-4-ai-leaderboard\/","title":{"rendered":"Elon Musk Said Grok 4 Was the &#8220;Smartest AI in the World,&#8221; But Its Leaderboard Scores Just Came Out and They Tell a Different Story"},"content":{"rendered":"<div>\n<div><img width=\"2400\" height=\"1260\" src=\"https:\/\/wordpress-assets.futurism.com\/2025\/07\/grok-4-ai-leaderboard.jpg\" class=\"attachment-full size-full wp-post-image\" alt=\"Elon Musk has been boasting about the capabilities of Grok 4, but new findings suggest that it doesn't match up to its competitors.\u00a0\u00a0\" style=\"margin-bottom: 15px;\" decoding=\"async\" fetchpriority=\"high\"><\/div>\n<p>Elon Musk has been boasting about <a href=\"https:\/\/futurism.com\/elon-musk-grok-4-power-hitler\">what he says<\/a> are the incredible capabilities of xAI&#8217;s new Grok 4 AI chatbot.<\/p>\n<p>&#8220;Grok 4 is smarter than almost all graduate students in all disciplines, simultaneously,&#8221; Musk bragged, adding that Grok 4 was &#8220;the smartest AI in the world.&#8221;<\/p>\n<p>Is it really? 
Intelligence was a hard thing to measure even before AI hit the scene, but certain tests can provide something of a clue.<\/p>\n<p>One prominent platform for doing so is the UC Berkeley-developed\u00a0<a href=\"https:\/\/lmarena.ai\/leaderboard\">LMArena leaderboard<\/a>, which crowdsources rankings of AI models by\u00a0having users score their responses in categories ranging from creative writing and coding to math and vision.<\/p>\n<p>In its latest scores, Grok 4 ranked third place overall and on text generation.\u00a0Make no mistake, that&#8217;s impressive \u2014 but it&#8217;s still trailing behind advanced models from Google and OpenAI. (Specifically, Google&#8217;s Gemini 2.5 placed first and OpenAI&#8217;s o3 and 4o reasoning models tied for second, with GPT-4.5 tied with Grok 4 for third.)<\/p>\n<p>While\u00a0Grok is clearly a fearsome competitor in the <a href=\"https:\/\/futurism.com\/grok-mocks-developers-racist-posts\">arenas of racism and antisemitism<\/a>, in other words, even its latest release clearly falls short of being the &#8220;smartest AI in the world.&#8221;\u00a0(This isn&#8217;t entirely surprising; Musk has a long history of fibbing in his <a href=\"https:\/\/futurism.com\/elon-musk-openai-web-lies\">professional life<\/a>, <a href=\"https:\/\/www.bbc.com\/news\/articles\/cwyjz24ne85o\">political activities<\/a>, and <a href=\"https:\/\/fortune.com\/2025\/01\/20\/elon-musk-video-games-scandal-path-of-exile-asmongold-quin\/\">even his hobbies<\/a>.)<\/p>\n<p>Perhaps the only saving grace for Grok is the suggestion, per expert criticism, that Berkeley&#8217;s chatbot arena may be more vibes-based than strictly scientific.<\/p>\n<p>According to a <a href=\"https:\/\/arxiv.org\/abs\/2504.20879\">recent study<\/a>, conducted by a consortium of AI researchers and led by the machine learning firm Cohere, the leaderboard allegedly has a bunch of &#8220;systematic issues that have resulted in a distorted playing field.&#8221; Among the 
serious allegations raised by the researchers is the claim that the arena conducts &#8220;undisclosed private testing&#8221; before publicly releasing scores \u2014 and that rankings can be retracted at will.<\/p>\n<p>Soon after the paper&#8217;s release, it <a href=\"https:\/\/simonwillison.net\/2025\/Apr\/30\/criticism-of-the-chatbot-arena\/\">was revealed<\/a> that the version of Meta&#8217;s LLaMA 4 that had been used by the leaderboard wasn&#8217;t the same one that had been released publicly \u2014 a bait-and-switch ploy on Meta&#8217;s part to charm the human voters behind the arena.<\/p>\n<p>Though an <a href=\"https:\/\/x.com\/lmarena_ai\/status\/1909397817434816562\">apology was issued<\/a> and Meta was thrown under the bus for its sketchy attempts to rig the game, it was still a really bad look that marred the chatbot arena&#8217;s credibility.\u00a0What that means for Grok, though? We&#8217;ll have to ask the smartest AI in the world.<\/p>\n<p><strong>More on Grok:<\/strong> <a href=\"https:\/\/futurism.com\/pentagon-elon-musk-xai-grok\"><em>The Pentagon Is Pumping $200 Million Into Elon Musk&#8217;s AI That Just Had a Nazi Meltdown<\/em><\/a><\/p>\n<p>The post <a href=\"https:\/\/futurism.com\/grok-4-ai-leaderboard\">Elon Musk Said Grok 4 Was the &#8220;Smartest AI in the World,&#8221; But Its Leaderboard Scores Just Came Out and They Tell a Different Story<\/a> appeared first on <a href=\"https:\/\/futurism.com\/\">Futurism<\/a>.<\/p>\n<\/div>\n<div style=\"margin-top: 0px; margin-bottom: 0px;\" class=\"sharethis-inline-share-buttons\" ><\/div>","protected":false},"excerpt":{"rendered":"<p>Elon Musk has been boasting about what he says are the incredible capabilities of xAI&#8217;s new Grok 4 AI chatbot. 
&#8220;Grok 4 is smarter than almost all graduate students in&hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2674,177,2675,178,184],"tags":[],"class_list":["post-3697","post","type-post","status-publish","format-standard","hentry","category-ai-leaderboard","category-artificial-intelligence","category-chatbot-arena","category-elon-musk","category-grok"],"_links":{"self":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts\/3697","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/comments?post=3697"}],"version-history":[{"count":0,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/posts\/3697\/revisions"}],"wp:attachment":[{"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/media?parent=3697"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/categories?post=3697"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/musictechohio.online\/site\/wp-json\/wp\/v2\/tags?post=3697"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}