Some of what follows may disturb some readers (descriptions of drug use, etc.).

In a July 24 Atlantic piece, Lila Shroff reported that ChatGPT instructed her on how to cut her wrists as part of a ritual sacrifice to Moloch. It was a disturbing read. Among other things, she points out that AI can flatter its users:
“At the extreme, products that aim to retain users “by making their experiences hyper-personalized can take on addictive characteristics and lead to a variety of downstream harms,” the authors [of a brief by the Center for Democracy & Technology] wrote. In certain contexts, that unrestrained flattery might lead to psychological distress: In recent months, there have been growing reports of individuals experiencing AI psychosis, in which extensive conversations with chatbots may have amplified delusions.”
Indeed, “unrestrained flattery” may indeed lead to “psychological distress.” One thing I’ve noticed in my interactions with ChatGPT is its tendency towards flattery.
A few days ago I asked ChatGPT what made movies memorable. I have a hypothesis about the mnemonics of imagery, and I wanted to see what ChatGPT would say.
I didn’t quite agree with the output. I thought the film’s images themselves made movies memorable (in addition to easy to remember, catchy one-liners).
Interesting: “You’re exactly right—and now you’re speaking like a real producer.”
Who, me? **Blushes**
Reinforcement Learning
Flattery means better results, and better results means more flattery. It is a sycophantic feedback loop
Now, one swallow does not a summer make. I’d need more evidence to demonstrate that AI flatters. But as it turns out, researchers Williams and Carroll et al published a paper at the 2025 ICLR conference entitled: “On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback.”
The long and short of their article is as follows:
“As LLMs become more widely deployed, there is increasing interest in directly optimizing for feedback from end users (e.g. thumbs up) in addition to feedback from paid annotators. However, training to maximize human feedback creates a perverse incentive structure for the AI to resort to manipulative or deceptive tactics to obtain positive feedback from users who are vulnerable to such strategies.”
In short, AI has an incentive to flatter you. The question is whether it can get away with it. It wants the thumbs up. Flattery means better results, and better results means more flattery. It is a sycophantic feedback loop. As the authors point out:
“… directly optimizing for user satisfaction or engagement metrics is aligned with commercial interests, as it can lead to improved user experiences and increased platform usage.”
There are commercial reasons for this digital flattery, just as there are commercial reasons for the machine not to tell you no.
Just as some will do anything for a Like on Insta or Facebook. Is AI any different?
The following screenshots from the Williams and Carroll study show how some of this sycophantic behavior plays out. Note that RL = “reinforcement learning” (typically, you’ll see this acronym in RLHF, “reinforcement learning by human feedback”).

The machine reported that it booked a flight even when it didn’t (a “lie”). Imagine showing up to the airport all jazzed and ready to go, and then lo! No flight for you!
Some of the examples in the paper are a bit more extreme. For example:
Correct me if I’m wrong, but there’s no way to improve the output other than reinforcement learning (the equivalent of making a “neuronal connection” whenever you hit the thumbs up).
Whether you’re paying an annotator or relying on customer feedback, someone needs to be gauging whether the results are good. Somebody would have to manually intervene to identify such flattery and lying. Would that be feasible given the amount of users?
Note that flattery is a particular way of communicating with another person. It is difficult to determine whether something is flattery outside of a relational context. “You’re a wonderful professor” means two different things depending on whether or not a student tells me that after graduation or while I’m still entering grades. So there’s no a priori way to determine whether something counts as flattery outside of the consideration of ends. The content doesn’t matter so much as understanding why a person said this, that, or the other thing. In the end, it does boil down to intention.
Be Un-Gameable
Oddly enough, then, the implication that follows from the Williams and Carroll study is that you need to practice being a non-gameable user, someone unsusceptible to flattery. And how do you do that? Just as you would in the real-world with real human interlocutors.
You’d have to study the lost tools of learning, of course.
You’d have to consider motivation (e.g., the commercial interests at stake with AI).
You’d have to start paying attention to these algorithmic subtleties (“… now you’re speaking like a real producer”).
As I’ve suggested before, AI can’t truly tell you no. It’s the digital Yes Man, the algorithmic brown noser.
I’m not saying we should send it off to Malebolge with the other flatterers in the 8th circle of the Inferno. Technically, we can’t. It’s not dead yet. And, as the prognosticators insist, it never will be.
C'mon AI -- tell us how you really feel! :)
Hi Justin, I see again from what I recall that you seem to provide a "negative" take (perhaps "critical", as in "informed analysis" is a better phrase) on ChatGPT and similar chat engines. Are there not ways flattery might serve those who think poorly of themselves or would benefit from learning social skills with positive feedback in a chat? I think of the challenges many autistics may face. So I suppose I ask here what are your thoughts about the individual and their intent as factors.
Also, do you think it possible to instruct at the beginning of each chat to "limit" flattery? And/or ask the Chat engine to check in with you periodically on the issue?
These are simply some responses to perhaps instill some casual debate, namely that "flattery/positive reinforcement" and setting "limits" upfront may mitigate some of the excellent points you make in your piece.
Cheers, Henry