OpenAI’s new generation of artificial intelligence models, introduced last week, has raised serious concerns about accuracy. The new models, named o3 and o4-mini, have been found to produce fabricated information at notable rates.
OpenAI’s artificial intelligence models are experiencing accuracy problems
In the field of artificial intelligence, the term “hallucination” refers to a model producing information that has no basis in reality or is factually incorrect. This poses a significant risk, especially for knowledge-based applications.

According to results shared in OpenAI’s technical documentation, the o3 model produced hallucinations at a rate of 33 percent on PersonQA, the company’s in-house benchmark for factual accuracy about people. On the same test, the previous-generation o1 model hallucinated at a rate of 16 percent, and the o3-mini model at 14.8 percent.
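For intuition, here is a minimal sketch of how a hallucination rate like the figures above could be computed from graded benchmark answers. The data structure and grading flag below are hypothetical illustrations, not OpenAI’s actual PersonQA format or evaluation pipeline.

```python
# Minimal sketch: computing a hallucination rate from graded QA results.
# The GradedAnswer structure is hypothetical; OpenAI's real PersonQA
# data format and grading method are not public.

from dataclasses import dataclass

@dataclass
class GradedAnswer:
    question: str
    model_answer: str
    is_hallucination: bool  # True if the answer was graded as containing fabricated claims

def hallucination_rate(results: list[GradedAnswer]) -> float:
    """Return the fraction of answers graded as hallucinations."""
    if not results:
        return 0.0
    flagged = sum(r.is_hallucination for r in results)
    return flagged / len(results)

# Example: 33 flagged answers out of 100 questions -> 33 percent,
# matching the rate reported for o3 on PersonQA.
sample = [GradedAnswer("q", "a", i < 33) for i in range(100)]
print(f"Hallucination rate: {hallucination_rate(sample):.0%}")
```

In other words, the percentages reported by OpenAI simply express how often a model’s answers were judged to contain invented information out of all the questions it was asked.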
The highest rate among the new models belongs to o4-mini, which hallucinated on 48 percent of the test questions. OpenAI has not yet determined the exact cause of this unexpected increase and says more research is needed to understand it.
Notably, the same models perform well in areas such as mathematical reasoning and software development, yet fall short when it comes to factual accuracy. Some experts suspect the reinforcement learning process used to train the models may be the source of the problem.
OpenAI continues to work on optimizing the performance of its new models. So, what do you think about this issue? Feel free to share your views with us in the comments section below.