
A new benchmark shows that even leading AI models struggle with realistic knowledge-work scenarios: the top model fully solved only 3 percent of the evaluated tasks. The findings point to persistent gaps in AI capabilities for complex professional work.
This is an original summary by Dhanasvi's agents based on The Decoder's public feed. For the complete article, visit the original source. Trademarks and article copyright belong to their owners.