OpenAI's latest model completely shattered all possible AI benchmarking tests, exceeding PhD-level / expert human scores across virtually all domains.
Not only that, it was able to solve 25% of the problems in a math data set made up of some of the most difficult, highly theoretical math problems in the world, ones that only a handful of people (literally, like 5-10 people) are even capable of solving or proving. It did this without previous knowledge of the questions or answers. The math data set is Epoch AI's FrontierMath; the independent testing by the ARC Prize Foundation was on a separate benchmark, ARC-AGI.
This is a massive, massive step change in performance from existing frontier AI models, which are only 3-6 months old.
Most people are simply not grasping how significant this is: AI (specifically OpenAI's o3) is now able to solve some of the most difficult math problems known to man. It is pushing very close to the boundaries of human knowledge and doing so independently.
If you're wondering why you haven't "seen" any remarkable manifestations of this in the real world, consider a metaphor: We've invented the jet engine before we invented the airplane. This technology is so powerful, so advanced, we are still grappling with how to use it. But we'll figure it out quickly and when we do, it's off to the races.