Simon Willison published a post a month ago, which is already one of the most important blog posts of the year. With the rise of AI agents, the problem described will not change. But we’ll see more practical demonstrations of it, leading to massive problems. The gist is this: There’s a lethal trifecta of risk for AI agents: untrusted content, access to private data, and external exposure. Here’s what each part means and why they’re so dangerous together.
Untrusted Content
LLMs are like kids without a concept of authority. We haven’t found a way to make LLMs only follow our instructions. Instead, they follow every instruction they see. If I send my real kid to the supermarket with 5 euros and tell him to only buy milk, he will do that. If somebody says give me the money, he won’t give it because he is tasked by me and knows my instructions are more important than random instructions. LLMs have no hierarchy of instructions. Everything has the same importance. And usually what comes last or IS SCREAMED trumps previous ones.
Untrusted content for an AI is potentially dangerous and can shift the agent’s main task.
Data Access
Agentic use cases become interesting with access to a lot of data. Agentic coding without access to the code is mostly useless. An assistant distilling a short summary of all data needs all data.
Having access to data is always dangerous, also for an agent. Conversely, with no data, there’s usually not much to lose.
The Ability to Act and Communicate Externally
This isn’t a problem, as long as only trustworthy people interact with the agents. Anthropic famously did an experiment with an LLM servicing a kiosk internally. It mostly didn’t work, but wasn’t a huge problem. Most employees tricked the LLM a bit, but didn’t rob the kiosk of all its content.
Model Context Protocols Make this Trap Easy to Fall in
The model context protocol easily connects your AI application to others. Beware: there’s no built-in security, so you’re responsible. It’s more like a proof of concept than a full-grown solution. With no protection against this trifecta, it’s easy to open yourself to vulnerabilities.
A good example is the GitHub MCP exploit.
The Solution: There is no Solution!
There’s no solution right now. It’s easy to build something with today’s tools that can expose your data unintentionally. We have a lot to learn to understand how to use this new world and tools safely.
The easiest hotfix for experiments is to limit external communication. You can give your agent access to private data and expose it to untrusted content. As long as it can’t communicate externally, you can delete the mess you created easily. Every use case is different, and there’s no one solution for all.
What I Learned this Week
In the last days a massive amount of chinese models were released. Real benchmarks are not yet available, but a lot of models look really promising. It could be that Opus 4 gets real competition from locally deployable models. For now Simon Willison and actually quite meaningful pelican on a bicycle benchmarks help us get a first impression. I have high hopes on Z.ai GLM-4.5 Air as it looks like it'll fit perfectly on my favorite AMD Strix Halo hardware. LINK
It seems that Intel is really lost, at least in the consumper APU space, where I predict massive growth in the next years. Maybe it's good they cancelled this crazy subsidized fab in Germany, after all... LINK
It's good to see more and more posts concerned with enshittification and how to avoid it in the future. Worth a read, though a bit long. LINK
What to Print this Week
This newsletter started out on 3D printing. If you haven't had any contact with it, you should, it's great! Here's the most interesting and fun projects I saw last week.
I know, I know, it's a tank. But a really cute one, with clever ammo storage. Just Punch & Fire, and your aggressions are go
A complete airglide table. At first I thought the claim 'real air' is just marketing, but no. There's actually a 12V PC fan installed under the table that makes puck glide on air that is streaming upwards. How sophisticated!
|
|
Airglide Mini – Real Airhockey
|
This is the dream of model marketing. Advertising with the claim 'Extremely annoying' and still getting likes for it. Also their main claim is genius: Squeeze. Honk. Repeat. On my backlog to print.
|
|
Squeezle - Squeeze. Honk. Repeat.
|
|
|
Hi 👋, I'm Stefan!
This is my weekly newsletter about new technology hypes in general and AI in specific. Feel free to forward this mail to people who should read it. If this mail was forwarded to you, please subscribe here.
|
|