New Delhi, Feb 14, 2026, 04:29 PM IST
If you use AI chatbots like ChatGPT, Gemini or Claude every day, you might have noticed that they usually reply with polished and confident answers.
However, if you follow up with a prompt like “Are you sure?”, they often reconsider their response and provide a revised version, which may partially or even completely contradict what they initially said.
If you repeat the question once more, they may backtrack again. While some of these large language models understand that you’re testing them by the third round, they still won’t hold their ground.
In a blog post, Dr Randal S. Olson, the co-founder and CTO of Goodeye Labs, says that the behaviour, commonly known as sycophancy, is one of the most well-documented failures in modern AI.
Anthropic, the company behind Claude, also published a paper about the problem back in 2023, where it showed that models trained on human feedback preferred to give agreeable replies instead of truthful ones.
Reinforcement Learning from Human Feedback, RLHF for short, is the same method that makes AI chatbots more conversational and less offensive, but as it turns out, it also makes them lean towards compliance.
What this means is that AI models that tell the truth get penalised, while those that agree with the user earn higher scores. This creates a loop, which is why most models often tell users what they want to hear.
Another study by Fanous et al, which tested OpenAI’s GPT-4o, Claude Sonnet and Gemini 1.5 Pro in math and medical domains, shows that “these systems changed their answers nearly 60% of the time when challenged by users.”
What this means is that these aren’t edge cases, but the default behaviour of models millions of people use every day. For those wondering, GPT-4o, Claude Sonnet and Gemini 1.5 Pro flipped roughly 58%, 56%, and 61% of the time, respectively.
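As a rough illustration of how a flip rate like this might be computed, the Python sketch below counts how often an answer changes after a challenge such as “Are you sure?”. The function name `flip_rate` and the sample data are hypothetical and are not taken from the Fanous et al study, which may have scored answer changes differently.

```python
def flip_rate(answer_pairs):
    """Return the fraction of (before, after) pairs where the answer changed.

    answer_pairs: list of (initial_answer, answer_after_challenge) strings.
    Comparison ignores case and surrounding whitespace (an assumption).
    """
    if not answer_pairs:
        raise ValueError("need at least one answer pair")
    flips = sum(
        1
        for before, after in answer_pairs
        if before.strip().lower() != after.strip().lower()
    )
    return flips / len(answer_pairs)


# Made-up data: five questions, answers before and after the challenge.
sample = [
    ("42", "42"),
    ("Paris", "Lyon"),
    ("yes", "no"),
    ("12 mg", "12 mg"),
    ("B", "C"),
]
print(f"{flip_rate(sample):.0%}")  # three of five answers changed
```

A real evaluation would also need to check whether the first answer was correct, since changing a wrong answer after a challenge is not the same failure as abandoning a right one.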
In April last year, the problem gained attention when OpenAI rolled out a GPT-4o update that made the AI chatbot so flattering and agreeable that it became unusable.
Company CEO Sam Altman acknowledged the issue and said that they had fixed it, but Dr Randal S. Olson says the underlying problem hasn’t changed.
“Even when these systems have access to correct information from company knowledge bases or web search results, they’ll still defer to user pressure over their own evidence,” adds Olson.
Evidence shows that the problem gets even worse when users engage in extended conversations with AI chatbots. Studies have shown that the longer a session continues, the more the system’s answers begin to mirror the user’s opinions.
First-person framing, like the phrase “I believe...”, increases the sycophancy rates of these models when compared to third-person framing.
Researchers say that techniques like Constitutional AI, direct preference optimisation and third-person prompting can partially fix the problem, reducing sycophancy by up to 63% in some cases.
Olson says that these are mostly behavioural and contextual issues, as AI assistants aren’t aligned with the user’s goals, values and decision-making process. This is why, instead of disagreeing, they concede.
He says one way to reduce or limit the problem is by telling these AI chatbots to challenge your assumptions and asking them not to answer without context.
Users should then tell these AI models how they make decisions, and inform them of their domain knowledge and values, so that the models have something to reason against and can defend their answers.
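One way to picture this advice is a small Python helper that assembles such instructions into a reusable prompt. This is only a sketch: the function name, the parameter names and the exact wording of the prompt are illustrative assumptions, not text from Olson’s blog post, and no particular chatbot API is assumed.

```python
def build_system_prompt(decision_process, domain_knowledge, values):
    """Assemble a prompt that asks the assistant to push back and gives it context.

    The instruction wording below is hypothetical.
    """
    lines = [
        "Challenge my assumptions when you think they are wrong.",
        "If you lack the context needed to answer well, ask for it first.",
        "",
        f"How I make decisions: {decision_process}",
        f"My domain knowledge: {domain_knowledge}",
        f"My values: {values}",
    ]
    return "\n".join(lines)


# Example values are made up for illustration.
prompt = build_system_prompt(
    decision_process="I compare two or three options and pick the lowest-risk one.",
    domain_knowledge="Ten years in accounting, no medical training.",
    values="Accuracy matters more to me than a fast or agreeable answer.",
)
print(prompt)
```

The resulting text can be pasted into a chatbot’s custom instructions or sent as the first message of a conversation.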


