Misinformation 📅 April 11, 2026

AI Models Struggle with Real-World Betting

A study reveals significant shortcomings of AI models in real-world applications, particularly in betting scenarios. It emphasizes the gap between AI's capabilities in controlled settings and its performance in dynamic environments.

A recent study by General Reasoning found that advanced AI models from companies such as Google, OpenAI, and Anthropic struggle when applied to real-world tasks such as betting on soccer matches. The 'KellyBench' report tested eight leading AI systems in a simulated Premier League season in which they placed bets based on historical data. Every model lost money over the season, and xAI's Grok went bankrupt in one attempt and failed to complete others. The results highlight a gap between AI's capabilities in controlled environments and its performance in dynamic, unpredictable situations. They also challenge the prevailing narrative that AI can automate complex tasks, and they suggest that many benchmarks used to evaluate AI are inadequate because they are run in static settings that do not reflect the complexity of real-world applications. The study is a reminder that while AI can excel in certain areas, its limitations must be acknowledged, especially in tasks requiring long-term strategic thinking and adaptability.

Why This Matters

The study exposes the limitations of AI in practical applications, particularly in high-stakes environments such as betting. Understanding these limitations is crucial as society increasingly relies on AI for decision-making. The findings challenge the narrative of AI's infallibility and highlight the need for more realistic benchmarks in AI testing. Acknowledging these shortcomings can help limit harm to the industries and individuals that rely on AI.

Original Source

AI models are terrible at betting on soccer—especially xAI Grok
