Let the model propose. Let deterministic code dispose.

I've shipped a handful of bots that move real money onchain. A perp duel game called hypeduel, a market-making bot, an x402-gated betting endpoint that takes a payment and places a wager. They look like different products. Under the hood they're the same shape, and I keep rebuilding that shape by hand because it's the only one that survives contact with a model that's wrong sometimes.

The shape is a hard boundary. On one side, probabilistic judgment: the model. On the other, deterministic consequences: code the model has no token-level access to. The model proposes. The code disposes.

What each side is good at

The model is good at exactly one thing in these systems. It turns messy language into a number or an intent. Here's a news headline, a set of prices, a half-typed prompt from a user. What's the probability this resolves YES? What's the user actually trying to bet on? That's real work, and an LLM is genuinely good at it. It's the part you couldn't write a regex for.

It's also the part you can't trust with the trigger. An autoregressive model will happily produce a sized order, a wallet address, a "yes, send it" with total fluency and zero accountability. The failure mode of an overconfident model isn't that it refuses. It's that it argues itself into a bigger position than it should hold, in grammatical, confident English, and the arithmetic underneath is hallucinated.

So you don't let it near the arithmetic. The model emits a belief. Deterministic code takes that belief and decides what, if anything, happens. Spread, depth, Kelly caps, per-market exposure, a daily loss stop. Same input, same output, every time. The model can be as creative as it likes on its side of the wall. It can't reach across.
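A minimal sketch of what sits on the deterministic side, assuming the model's only output is a proposed fraction of bankroll. The names `RiskLimits` and `dispose` and every limit value are hypothetical, picked for illustration rather than taken from any of the bots above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskLimits:
    max_fraction: float         # hard cap on one position, as a fraction of bankroll
    max_market_exposure: float  # max total stake in a single market, in dollars
    daily_loss_stop: float      # stop trading once today's realized loss hits this, in dollars


def dispose(proposed_fraction: float, bankroll: float,
            market_exposure: float, daily_loss: float,
            limits: RiskLimits) -> float:
    """Return the stake in dollars to actually place.

    Pure function of its arguments: same input, same output.
    The model supplies proposed_fraction and nothing else.
    """
    # Daily loss stop: once tripped, nothing the model says matters.
    if daily_loss >= limits.daily_loss_stop:
        return 0.0
    # Reject out-of-range proposals outright (this comparison also fails for NaN).
    if not (0.0 <= proposed_fraction <= 1.0):
        return 0.0
    # Clamp the proposal to the cap, then to the room left in this market.
    fraction = min(proposed_fraction, limits.max_fraction)
    stake = fraction * bankroll
    room = max(0.0, limits.max_market_exposure - market_exposure)
    return min(stake, room)


# Illustrative limits only.
limits = RiskLimits(max_fraction=0.20, max_market_exposure=5_000.0, daily_loss_stop=1_000.0)
stake = dispose(0.35, 10_000.0, market_exposure=0.0, daily_loss=0.0, limits=limits)
```

The model never sees `limits` and has no argument through which it could change them, which is the whole design.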

Where the loss actually came from

I learned this the expensive way, which is the only way it sticks. Every time I let the boundary blur, the blur is where the loss came from.

On an early version of the market-making bot, I let the model adjust its own sizing when it was "confident." Felt reasonable. Confidence should mean something. What it meant in practice was that the bot overbet exactly the trades it was most wrong about, because the same miscalibration that made it confident made it wrong, and now it was confident and large. The fix wasn't a better prompt. It was a Kelly cap the model couldn't see or move. Sizing went back behind the wall and stayed there.
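For concreteness, here is a rough sketch of a capped Kelly sizer. It assumes a binary market where a YES share bought at `price` pays 1 if the market resolves YES, and it uses the textbook Kelly fraction for that bet, (p - price) / (1 - price). The 20% cap and the function names are illustrative assumptions, not the bot's real values.

```python
KELLY_CAP = 0.20  # hypothetical hard cap, defined in code the model never reads


def kelly_fraction(p: float, price: float) -> float:
    """Full-Kelly fraction of bankroll for buying YES at `price`, given belief p."""
    if not (0.0 < price < 1.0) or not (0.0 <= p <= 1.0):
        return 0.0  # malformed belief or price: no bet
    # No edge (p <= price) means no position rather than a negative one.
    return max(0.0, (p - price) / (1.0 - price))


def sized_fraction(p: float, price: float) -> float:
    """Kelly sizing, clamped. The model's stated confidence is not an input."""
    return min(kelly_fraction(p, price), KELLY_CAP)
```

A belief of 0.95 against a price of 0.50 asks full Kelly for 90% of bankroll; the clamp returns the cap instead, however confident the prose around that 0.95 sounded.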

The x402 endpoint had a subtler version. It takes a payment, then places a bet keyed to what the user asked for in plain English. The first cut let the model parse the request and construct the wager in one step. Fine until a request was ambiguous and the model resolved the ambiguity in the direction that happened to place a larger bet. Not malice, just the path of least resistance through a probability distribution. Splitting it fixed it: the model outputs a structured intent, deterministic code validates that intent against the payment received and the market that exists, and anything that doesn't reconcile gets refused before a single token of it touches a contract.
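The split can be sketched like this. The `Intent` fields, the rule that the stake may not exceed the payment, and the refusal strings are all assumptions for illustration; the real endpoint's schema isn't shown in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Intent:
    """Structured output from the model. Hypothetical shape."""
    market_id: str
    side: str         # "YES" or "NO"
    stake_cents: int  # integer cents, to keep float rounding out of money


def validate_intent(intent: Intent, paid_cents: int,
                    open_markets: set[str]) -> Optional[str]:
    """Return None if the intent reconciles, else a refusal reason.

    Runs before anything is signed; the model has no say in the outcome.
    """
    if intent.market_id not in open_markets:
        return "unknown market"
    if intent.side not in ("YES", "NO"):
        return "invalid side"
    if intent.stake_cents <= 0:
        return "non-positive stake"
    # Assumed rule: the wager can never be larger than what was actually paid.
    if intent.stake_cents > paid_cents:
        return "stake exceeds payment"
    return None
```

An ambiguous request that the model resolves toward a larger bet now fails the payment check instead of reaching a contract.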

The pattern underneath both: the model is allowed to be wrong. It's not allowed to be wrong and in control of the consequence.

The boundary, live

[Interactive demo. The model proposes a position size of 35% of bankroll (probabilistic). A deterministic rule caps it at 20% (same in, same out), so 20% is what executes: the clamp caught it, 35% -> 20%. A toggle blurs the boundary and lets the proposal through unclamped. One losing draw: bankroll $10,000, drawdown -19%, left after $8,100.]

The model talks itself into a number. It is fluent and it is sometimes wrong. Nothing here knows or cares which.

executed = min(proposed, cap). A rule the model can't see or move. Turn it off and the proposal executes raw.

The model wanted 35%; deterministic code put on 20%. On this losing draw that's the difference between a bruise and a drained wallet. The model isn't smarter; the wall just held.

Why onchain makes this non-negotiable

Offchain, a bad model call is a bad API request you can claw back, rate-limit, or apologize for. Onchain, it's a signed transaction. It's final, it's public, and it cost gas to be wrong. There's no support ticket for a confirmed transfer.

That finality is exactly why the boundary has to be real and not vibes. "The model usually gets sizing right" is a sentence that ends in a drained wallet. I went deep on a related version of this in my post on the determinism trap in verifiable inference, where the problem is making the model's own output reproducible. This is the easier, more important cousin: keep the part that has to be reproducible out of the model entirely. Don't make inference deterministic. Make the consequence deterministic and let inference stay fuzzy where fuzzy is fine.

This is also the architecture I landed on in my post on LLMs in prediction markets: the LLM proposes a probability, a microstructure layer turns that into a stake, and a risk layer clamps it. Three layers, and the separation is the whole point. I keep arriving at the same diagram from different directions, which is usually a sign it's load-bearing.

The honest caveat

The boundary is real but it isn't always clean, and pretending otherwise is how you get burned by your own architecture.

The first soft spot: the model's output schema is part of the trust boundary. The moment you say "the model emits a structured intent and deterministic code validates it," the shape of that intent is now a thing the model controls. If the schema has a field the model can stuff with junk that your validator doesn't actually check, the wall has a door in it you forgot to lock. The schema is code the model writes into, so it counts as the model's side until your validator proves otherwise.
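One way to lock that door, sketched under the assumption that the model's intent arrives as JSON: check the field set exactly and refuse anything extra, missing, or mistyped. `ALLOWED_FIELDS` and `parse_strict` are hypothetical names, and the three fields are illustrative.

```python
import json

# Hypothetical schema: the exact set of fields the validator knows how to check.
ALLOWED_FIELDS = {"market_id": str, "side": str, "stake_cents": int}


def parse_strict(raw: str) -> dict:
    """Parse model output, refusing anything the validator does not fully understand."""
    data = json.loads(raw)  # invalid JSON raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("intent must be a JSON object")
    extra = set(data) - set(ALLOWED_FIELDS)
    missing = set(ALLOWED_FIELDS) - set(data)
    if extra or missing:
        raise ValueError(f"unexpected fields {sorted(extra)}, missing fields {sorted(missing)}")
    for name, expected in ALLOWED_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"field {name!r} must be {expected.__name__}")
    return data
```

Rejecting unknown fields is stricter than most JSON handling defaults, and that's deliberate: a field nobody validates is a field the model owns.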

The second is worse because it's quiet. A "deterministic" rule that encodes a bad assumption fails silently and forever. A loss stop that assumed a market couldn't gap. An exposure cap measured in the wrong units. The model side fails loudly and visibly: you see the bad number. The deterministic side fails by being confidently, repeatably wrong with no error at all, which is the exact thing you built it to avoid on the other side. So the wall isn't a place you stop thinking. It's a place you move the thinking to.
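The wrong-units case is at least partly catchable in code. A small sketch, assuming exposure is tracked in dollars and sizing in fractions of bankroll; the `Usd` and `BankrollFraction` wrappers are hypothetical, and they only make the mix-up loud rather than prove the rule itself is right.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usd:
    cents: int  # integer cents


@dataclass(frozen=True)
class BankrollFraction:
    value: float  # 0.20 means 20% of bankroll

    def __post_init__(self) -> None:
        # Catches "20" passed where "0.20" was meant; the comparison also rejects NaN.
        if not (0.0 <= self.value <= 1.0):
            raise ValueError("fraction must be between 0 and 1")


def within_exposure_cap(exposure: Usd, cap: Usd) -> bool:
    """Compare like with like, and refuse bare numbers whose unit is unknown."""
    if not isinstance(exposure, Usd) or not isinstance(cap, Usd):
        raise TypeError("exposure and cap must both be Usd")
    return exposure.cents <= cap.cents
```

This does nothing for the loss stop that assumed a market couldn't gap. That kind of assumption still has to be found by someone thinking about it.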

Where this points

This split is the thing I keep rebuilding, so eventually I stopped rebuilding it. The boundary, model proposes and deterministic code disposes, is what we productize at B3OS: the model turns intent into a proposal, and execution is deterministic, same input to same output, with the consequence-handling sitting where the model can't argue with it. It's the version of this post I don't have to wire by hand anymore.

That's the honest summary. Not a framework, not a manifesto. A wall, with judgment on one side and finality on the other, and a standing rule that the day you let them touch is the day you find out where your next loss comes from.

More on this in the two siblings I'm writing alongside this one: why onchain vibe coding breaks and what onchain agents need.