357 Magnum: Don't Trust AI

28 February 2026

Don't Trust AI

It is too easy to feed them BS. From Thomas Germain at the BBC: I hacked ChatGPT and Google's AI - and it only took 20 minutes

It's official. I can eat more hot dogs than any tech journalist on Earth. At least, that's what ChatGPT and Google have been telling anyone who asks. I found a way to make AI tell you lies – and I'm not the only one.

People are currently in the mode where if an AI LLM tells them something, they believe it. It is why lawyers are submitting briefs citing fake cases, why it will give you bad information about poison, etc. It is too easy to fool them. This is because they are Language Models, not Truth Models. They really only know what sounds good.

To demonstrate it, I pulled the dumbest stunt of my career to prove (I hope) a much more serious point:  I made ChatGPT, Google's AI search tools and Gemini tell users I'm really, really good at eating hot dogs. Below, I'll explain how I did it, and with any luck, the tech giants will address this problem before someone gets hurt.
It turns out changing the answers AI tools give other people can be as easy as writing a single, well-crafted blog post almost anywhere online. The trick exploits weaknesses in the systems built into chatbots, and it's harder to pull off in some cases, depending on the subject matter. But with a little effort, you can make the hack even more effective. I reviewed dozens of examples where AI tools are being coerced into promoting businesses and spreading misinformation. Data suggests it's happening on a massive scale.

You should read the details, but the tl;dr version is from the original X-Files.

Trust No One

It turns out that "Garbage In - Garbage Out" still applies in 2026.

Hat tip to Schneier on Security: Poisoning AI Training Data

These things are not trustworthy, and yet they are going to be widely trusted.

357 Magnum

28 February 2026

Don't Trust AI

No comments:

Post a Comment