The recent image generation capabilities of ChatGPT have challenged our previous understanding of AI-generated media. The recently announced GPT-4o model displays a remarkable ability to describe images with high accuracy and re-create them to viral effect, as seen in the wave of studio-inspired art styles. It has even mastered rendering text inside AI-generated images, something that was long difficult for AI. Now, OpenAI is launching two new models that can dissect images for cues, gathering information that might escape even the human eye.
OpenAI announced two new models earlier this week that take ChatGPT's reasoning abilities up a notch. Its new o3 model, which OpenAI calls its "most powerful reasoning model", improves on existing interpretation and perception capabilities and is getting better at "coding, mathematics, science, visual perception, and more", the company claims. Meanwhile, o4-mini is a smaller, faster model for cost-efficient reasoning along the same lines. The news follows OpenAI's recent launch of the GPT-4.1 family of models, which brings faster processing and deeper comprehension.
ChatGPT is now "thinking with images"
Along with their improved reasoning abilities, both models can now incorporate images into their reasoning process, enabling them to "think with images", OpenAI announced. With this change, both models can integrate images into their chain of thought. Going beyond basic image analysis, the o3 and o4-mini models can examine images more closely and even manipulate them through operations such as cropping, zooming, flipping, or enhancement, extracting visual cues that improve their chances of arriving at a solution.
In the announcement, OpenAI says the models blend visual and textual reasoning, which can be combined with other ChatGPT features such as web search, data analysis, and code generation, and are expected to form the basis for more advanced AI agents with multimodal analysis.
As for practical applications, you can expect to feed in anything from flowcharts or scribbles in handwritten notes to pictures of real-world objects, and have ChatGPT develop a deeper understanding for better output, even without a descriptive text prompt. With this, OpenAI inches closer to Google's Gemini, which offers the impressive ability to interpret the real world through live video.
Despite the bold claims, OpenAI is limiting access to paying members, possibly to prevent its GPUs from "melting" again as it struggles to keep up with the compute demands of the new reasoning models. For now, the o3, o4-mini, and o4-mini-high models will be exclusively available to ChatGPT Plus, Pro, and Team members, while Enterprise and Edu tier users will receive access in a week's time. Meanwhile, free users will get limited access to o4-mini when they select the "Think" button in the prompt bar.