
$11M Vibe-Coded: Why General-Purpose AI Is Failing Web3 Builders in 2026

Team Matterhorn · April 30, 2026

Between January and April this year, builders lost more than $11M across seven AI-assisted Web3 exploits.

Each one was a contract that compiled, deployed, and looked fine in code review. Each one shipped because the tooling didn't catch what an experienced reviewer would have caught in ten minutes. None of the seven was an elaborate attack. They were the kind of thing a senior auditor would have flagged with a frown on a first skim.

We have been tracking this pattern since the first incident in January. The dollar amount is not what makes it notable. Web3 has had bigger weeks. The pattern repeats. That is what makes it notable.

The pattern

A team types a prompt into a general-purpose AI tool. Out comes a contract. The compile succeeds. The unit tests pass. Someone reviews the diff and it reads cleanly. The contract deploys, accumulates value for a few weeks, and then a single transaction wipes a fraction of it. The post-mortem reveals the same shape every time.

The AI generated something that was syntactically correct and economically wrong.

The most public incident was an oracle mispricing in a major DeFi lending market in February. A pool was using a primary price feed without a fallback or a deviation check. An attacker pushed the primary feed by ten percent across two blocks and drained $1.78M in a single transaction. The contract was AI-vibe-coded. The team did not have an audit pipeline that understood oracle manipulation. The general-purpose AI tools they were using did not either. Those tools had been trained on every TWAP example in every public repo, and could write a TWAP cleanly. They had not been trained to ask whether the price feed they were calling was reliable on the chain they were deploying to. The question was outside the model's frame.
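The missing check is small. Here is a minimal sketch of it, written as plain Python instead of contract code so the logic is easy to read. The names (`Reading`, `checked_price`, `PriceRejected`) and both thresholds are hypothetical, chosen for illustration, and do not come from the affected protocol.

```python
from dataclasses import dataclass


class PriceRejected(Exception):
    """Raised when no trustworthy price can be produced."""


@dataclass
class Reading:
    price: float      # quoted price of the asset
    age_seconds: int  # time since the feed last updated


# Hypothetical thresholds, for illustration only.
MAX_DEVIATION = 0.05    # primary and fallback may differ by at most 5%
MAX_AGE_SECONDS = 3600  # readings older than an hour count as stale


def checked_price(primary: Reading, fallback: Reading) -> float:
    """Return the primary price only if both feeds are sane and agree."""
    for name, reading in (("primary", primary), ("fallback", fallback)):
        if reading.price <= 0:
            raise PriceRejected(f"{name} feed returned a non-positive price")
        if reading.age_seconds > MAX_AGE_SECONDS:
            raise PriceRejected(f"{name} feed is stale")

    # Deviation check: refuse to act when the two sources disagree.
    deviation = abs(primary.price - fallback.price) / fallback.price
    if deviation > MAX_DEVIATION:
        raise PriceRejected(f"feeds disagree by {deviation:.1%}")
    return primary.price
```

Under these assumed thresholds, a primary feed pushed ten percent away from the fallback is rejected instead of being used to price a withdrawal. Whether to halt or to fall back to the second source at that point is a design decision the sketch leaves open.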

A second example, in March: a yield aggregator with a perfectly good redeem function and a flash-loan-vulnerable share-price calculation. The share price was computed from total assets at the moment of redemption. A flash loan inflated the asset balance for the duration of a single transaction, the redemption executed at the inflated price, and the loan was repaid before the transaction ended. The AI that wrote the contract knew what flash loans were. It did not connect “flash loan exists” with “this share price is read from a spot balance that one transaction can move.”
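The arithmetic is worth seeing once. The sketch below uses made-up numbers, not figures from the incident, and abstracts away how the borrowed funds come to be counted in the balance and how they are later pulled back out. It only shows what a spot-priced redemption pays when the asset figure is temporarily inflated.

```python
def redeem_value(shares: int, total_assets: int, total_shares: int) -> int:
    """Assets paid out for `shares`, priced from the balance seen right now."""
    return shares * total_assets // total_shares


# Hypothetical figures, for illustration only.
TOTAL_SHARES = 1_000_000
REAL_ASSETS = 1_000_000
ATTACKER_SHARES = 100_000
FLASH_LOAN = 1_000_000  # temporarily counted in the asset balance

# Redemption against the true balance.
honest_payout = redeem_value(ATTACKER_SHARES, REAL_ASSETS, TOTAL_SHARES)

# Same redemption while the borrowed funds inflate the balance.
inflated_payout = redeem_value(
    ATTACKER_SHARES, REAL_ASSETS + FLASH_LOAN, TOTAL_SHARES
)

# The difference is paid out of the other depositors' assets.
excess = inflated_payout - honest_payout

print(honest_payout, inflated_payout, excess)
```

With these numbers the same ten percent of shares redeems for twice as much, and the excess comes out of everyone else's deposits. The redeem function itself contains no bug; the input it trusts does.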

Two more incidents in April followed similar shapes. One was a reentrancy bug in a callback, which the AI did not flag because the contract applied ReentrancyGuard to the wrong function. The other was a cross-chain bridge that assumed a particular finality guarantee from its source chain and lost funds when that chain reorged.

The diagnosis

General-purpose AI tools are extraordinary at syntax. They are weak at semantics. The thing they are weakest at is the part of programming where adversaries are present.

Web2 code is collaborative. The compiler is on your side. The tests catch your mistakes. The user wants the program to work. Most of the training data, every public GitHub repo and every Stack Overflow answer, assumes that frame.

Web3 is adversarial. Every line of a smart contract has a reader who is trying to break it. The compiler does not warn that a function is reentrant. The unit tests pass against a benign caller. The user is sometimes a liquidator trying to pry value out of a corner case the developer did not consider. The mental model required to write secure Web3 code is different from the mental model required to write a CRUD API. The training data for general-purpose models is overwhelmingly the latter.

This is why the AI suggests transfer instead of call for sending ETH (discouraged for years, still in the corpus). This is why it forgets to check return values on ERC-20 calls. This is why it adds a require(amount > 0) to a withdrawal but does not reorder the function to follow the checks-effects-interactions (CEI) pattern. This is why it suggests block.timestamp as a source of randomness for an NFT mint. The model has seen all of these patterns flagged in audit reports. It has also seen them ten thousand times in production code that worked. Without a domain frame, it cannot tell which is which.
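The withdrawal case is the easiest to make concrete. What follows is a toy Python model, not Solidity and not any real contract: `Vault`, `attack`, and the balances are hypothetical, and the `on_receive` callback stands in for the recipient's code running when it is sent funds. Both withdraw functions carry the same amount check. Only the ordering differs.

```python
from typing import Callable, Dict, Optional

Callback = Optional[Callable[[], None]]


class Vault:
    """Toy vault; `on_receive` models the recipient's code running on payout."""

    def __init__(self, deposits: Dict[str, int]) -> None:
        self.balances = dict(deposits)
        self.funds = sum(deposits.values())

    def _send(self, amount: int, on_receive: Callback) -> None:
        if amount > self.funds:
            raise RuntimeError("vault is out of funds")
        self.funds -= amount
        if on_receive is not None:
            on_receive()  # control passes to the recipient here

    def withdraw_unsafe(self, user: str, on_receive: Callback = None) -> None:
        amount = self.balances[user]
        if amount <= 0:  # the check is present...
            raise ValueError("nothing to withdraw")
        self._send(amount, on_receive)  # ...but the interaction comes first
        self.balances[user] = 0         # and the effect comes last

    def withdraw_cei(self, user: str, on_receive: Callback = None) -> None:
        amount = self.balances[user]
        if amount <= 0:                 # check
            raise ValueError("nothing to withdraw")
        self.balances[user] = 0         # effect
        self._send(amount, on_receive)  # interaction


def attack(vault: Vault, method_name: str) -> int:
    """Withdraw as 'attacker', re-entering once from the receive callback."""
    withdraw = getattr(vault, method_name)
    funds_before = vault.funds
    reentered = False

    def on_receive() -> None:
        nonlocal reentered
        if not reentered:
            reentered = True
            try:
                withdraw("attacker", on_receive)
            except ValueError:
                pass  # nested call rejected: balance was already zeroed

    withdraw("attacker", on_receive)
    return funds_before - vault.funds


deposits = {"attacker": 100, "victim": 100}
print(attack(Vault(deposits), "withdraw_unsafe"))
print(attack(Vault(deposits), "withdraw_cei"))
```

In this model the unsafe ordering lets the attacker take the victim's deposit as well as their own, while the CEI ordering pays out exactly what the attacker is owed. A unit test that calls either function from a benign caller passes both.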

What's actually needed

Three things, in order.

Domain-specific training. A model fine-tuned on a hundred thousand audited contracts and the post-mortems of every major exploit since 2016 is a different model than one fine-tuned on GitHub at large. It carries a different prior. When it generates a price feed, it asks whether the feed is manipulable. When it generates a redeem, it remembers that share-price math is a flash-loan surface. The frame is built in.

Real-time security analysis. Static analyzers like Slither and Mythril, plus custom detectors trained on audit findings, run on every contract a builder touches. The detector sits in the workspace, the way the linter sits in an IDE.

Formal verification for the load-bearing parts. Tools built on SMT solvers such as Z3 can prove invariants for AMMs, liquidations, and governance: the places where the math is structured enough to admit a proof. Most contracts do not need this. The ones that hold $100M of TVL do.

There is also a fourth thing, less technical: a culture of human-in-the-loop review for the patterns that matter. AI suggests, builders ship, but a human reads the price feed, the access control, and the upgrade path before deploy.

What happens next

The seven exploits this year are not the last seven. There will be a second wave as more teams ship AI-generated contracts and a third wave when the tooling catches up. Some tools will catch the next wave; others will keep generating the same bug patterns because their training corpus does not change.

Builders are right to ship Web3 with AI. The leverage is real. A contract that took three months in 2024 takes three days in 2026. The right move is to keep that leverage and pair it with AI tooling that was built with the adversary in mind.

That is the architectural bet behind Matterhorn. Matterhorn 2.0 — the Cowork for Web3 — opens to everyone on Tuesday, May 12, with Vibe-Audit running on every contract by default.

— Team Matterhorn
🏔️ matterhorn.so