Why Are All the Top Local AI Models Chinese?

Illustration of Chinese and American AI systems facing off across a glowing digital globe, symbolizing competition between open local models and frontier AI platforms

Here’s a question that’s been bugging me. If you look at the leaderboards for locally hosted AI models — the ones you can actually pull down and run on your own hardware — the top of the list is dominated by Chinese labs. DeepSeek, Qwen, Yi. Every few weeks another capable model drops, open-weight, zero cost. But if you look at the frontier models, the ones pushing the absolute boundary of what AI can do, they’re almost entirely American: Anthropic, OpenAI, Google, xAI.

Why?

I don’t see how there can be any level of altruism in China’s strategy here. I’m just trying to figure out the angle. And the more I’ve turned it over, the more I think the answer isn’t about China trying to win the AI race. It’s about making sure nobody else can win it profitably.

The Fickle Market Problem

The problem with the obvious “market share” explanation is that the AI market is fickle in a way that defeats almost every traditional play for market dominance. There is no loyalty. A better model turns the market on its head almost overnight, and then it gets cast off for the next shiny object. DeepSeek V3 was the darling for about six weeks. Then R1 dropped and everyone lost their minds. Then Qwen 2.5 caught up, Claude leapfrogged, GPT responded, and the cycle reset.

The stickiness of branding or market share is so weak as to be almost meaningless. A developer can switch providers by changing a single string in a config file. I know this firsthand — my own AI gateway is literally built to rotate between providers on the fly. The switching cost is zero.
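To make "zero switching cost" concrete, here's a minimal sketch of the kind of provider registry a gateway like mine uses. The provider names, endpoints, and environment variable are illustrative, not any real gateway's config — the point is that the entire migration is one string:

```python
import os

# Hypothetical provider registry. Endpoints shown are the commonly
# documented OpenAI-compatible base URLs, but treat them as examples.
PROVIDERS = {
    "deepseek": {"base_url": "https://api.deepseek.com/v1", "model": "deepseek-chat"},
    "qwen":     {"base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1", "model": "qwen-plus"},
    "openai":   {"base_url": "https://api.openai.com/v1", "model": "gpt-4o"},
}

def resolve_provider() -> dict:
    """Pick the active provider from a single environment variable.

    Changing that one string reroutes every downstream call.
    There is nothing else to migrate — no data, no retraining,
    no contract. That's the whole "moat."
    """
    name = os.environ.get("LLM_PROVIDER", "deepseek")
    return PROVIDERS[name]
```

Because nearly every provider speaks the same OpenAI-style chat API, the rest of the application never notices the swap.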

So if nobody can hold a lead at the model layer for more than a quarter, what’s the point of giving models away? You can’t build a durable moat on something the market treats as fungible.

Unless the goal isn’t to build a moat at all. Unless the goal is to make sure nobody else can build one either.

The Coachmaker Era

I keep coming back to the early days of the automobile. In the 1900s and 1910s you had steam, electric, and gasoline all competing simultaneously. Electric cars were actually ahead for urban use. You had Stanley Steamer diehards, dozens of gasoline engine configurations, and nobody could even agree on which side to put the steering wheel. Standards were nearly nonexistent beyond what coachmakers had developed, and it was absolute chaos.

AI is in that same phase right now. Transformer architectures, ternary quantization, mixture-of-experts, diffusion models — everyone is trying a bit of everything. Nobody has dominance.

Look at Microsoft’s BitNet as a case study. They’re pushing a ternary weight framework that runs 100B parameter models on CPUs instead of GPUs. The headlines make it sound like innovation. It isn’t. It’s desperation. Microsoft doesn’t have enough AI chips. Nobody does, and the price is through the roof. So someone at Microsoft Research looked at the problem and said: crap, what can we do with CPUs? We have a ton of those. And that’s what BitNet is — clever engineering born from a hardware shortage, not from a principled belief that ternary weights are the future of AI. When your breakthrough is “we figured out how to work around not having the thing we actually need,” that’s not a revolution. That’s a coping mechanism.
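To make the ternary idea concrete: BitNet-style models constrain every weight to {-1, 0, +1}, which turns matrix multiplies into signed additions — exactly the kind of work plain CPUs are good at. Here's a rough sketch of the absmean quantization scheme described for BitNet b1.58, heavily simplified (real implementations quantize during training and keep full-precision shadow weights for the optimizer):

```python
import numpy as np

def ternarize(W: np.ndarray, eps: float = 1e-8):
    """Round weights to {-1, 0, +1} after scaling by their mean magnitude.

    A simplified sketch of absmean quantization: scale the tensor by
    its mean absolute value, round, and clip into the ternary set.
    Inference then uses W_q * gamma.
    """
    gamma = np.abs(W).mean() + eps            # per-tensor scale factor
    W_q = np.clip(np.round(W / gamma), -1, 1)
    return W_q, gamma

# Matmuls against W_q need no multiplications: each output element is
# just a signed sum of activations, scaled once by gamma at the end.
```

That's why the pitch is CPUs instead of GPUs — the expensive multiply-accumulate hardware stops being the bottleneck.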

And the tell is right there in the release. They keep headlining “100B models on your laptop.” Nobody has actually trained and released a ternary model at that scale. The biggest real one you can download is 2B parameters. Microsoft has had the compute to train the big one for over two years and hasn’t. Either it doesn’t work at scale, or nobody internally wants to spend millions training a model that competes with their OpenAI investment. Probably both. Either way, the gap between the marketing and the reality tells you everything about where BitNet actually sits: it’s a hedge against a supply chain problem, not a vision for the future. And the 2B model they did release? It isn’t compatible with existing inference infrastructure. You can’t run it through stock llama.cpp, you can’t load it in Hugging Face the normal way, you can’t slot it into any of the tooling the open-source community has spent years building. It requires its own dedicated framework with custom kernels to get any of the advertised benefits. It is not plug and play in any sense of the phrase. That’s not an ecosystem. That’s a science project.

The point is, we’re in the coachmaker era. Everyone is experimenting. Nobody has locked in. And China doesn’t need to win this phase. They just need to keep it going as long as possible.

The Forcing Function I Don’t Want

What ended the automotive chaos wasn’t the best technology winning. It was a series of forcing functions that had nothing to do with engineering merit.

Gas had effectively won the consumer market by the mid-1920s for practical reasons — Ford’s assembly line killed competitors on cost, the electric starter eliminated gasoline’s biggest usability problem, and rural America had no electrical grid to charge anything. But what truly locked it in was World War II. The military had to standardize, and the logistics argument was unanswerable. Gasoline carries roughly 46 megajoules per kilogram; the lead-acid batteries of the era stored well under one. A jerry can is stable, it’s fungible, any vehicle can use it. If you went electric, you needed all those jerry cans AND generators to power your army. The tech just wasn’t there. Gas won fair and square on the battlefield, but it took the military making the call to settle things.

Batteries didn’t disappear entirely — submarines are the best example. Diesel-electric boats ran the entire Pacific war. The technology found the niche where its properties were uniquely superior and survived there. That probably happens with some of the “losing” AI approaches too.

Then Eisenhower saw the German Autobahn and pushed the Interstate Highway Act. It wasn’t built for commuters. It was built to move military equipment across the continent. Suburbanization was the side effect that became the main effect. The infrastructure made gasoline permanent in a way that no amount of market competition could undo.

The pattern is: market forces pick a rough direction, then a geopolitical forcing function standardizes it, industrializes it, and builds infrastructure that makes it irreversible.

I really don’t like that model. Because looking at the way the Middle East is going right now, AI is going to play a part in conflict. The best models will be used, and those companies will be flush with cash. That cash will let them either consolidate the competition or buy more training capacity than their competitors can match. I don’t see it slowing war down. While the guns are firing, it will make conflict more widespread, because AI-driven intelligence and autonomous systems lower the cost of fighting without putting domestic bodies at risk. That’s the trajectory drone warfare has already demonstrated over the past fifteen years.

And the energy consumed to train and run these models starts to look like turning corn fields into fuel. It strips us of our ability to feed ourselves while we fill our tanks. Data centers are already competing with residential power grids. Microsoft, Google, Amazon, and even X are signing deals to restart nuclear plants and build gas-fired generating stations specifically for AI compute. Elon Musk is literally trucking in natural gas generators to power X’s data centers. That energy could power homes, hospitals, desalination plants. But it won’t, because the return on training a model is higher than the return on keeping the lights on in a poor neighborhood. Nobody makes that decision explicitly. It just falls out of the math.

Conflict perpetuates because decision makers make money perpetuating it. AI doesn’t change that dynamic. It accelerates it.

Palm OS Didn’t Die. It Got Eaten.

There’s another pattern worth naming here. Palm’s software was amazing for what it was at the time. Palm OS built a real app ecosystem years before the iPhone, and its successor webOS pioneered card-based multitasking, a modern notification system, and gesture navigation. Today everyone including Apple has stripped it for all the good stuff, and the remaining shell of webOS is ignominiously running LG smart TVs. So years later we have the same stuff with a different name. Stifled innovation, and the same people making boatloads of money.

The innovation survived. The innovator didn’t.

AI is probably going to follow that same path. The research papers from DeepSeek, the architectural innovations from smaller labs, the open-weight models from China — all of it will get absorbed by whoever has the capital to scale it. The innovation persists. The people who built it get a Wikipedia footnote.

Commoditization Is the Weapon

So here’s where I land. China isn’t trying to build the dominant AI model. They’re trying to prevent anyone else from building a dominant AI business. Flood the model layer with free alternatives that are good enough to make enterprise CFOs ask hard questions about paying premium prices. If you can’t charge a premium, you can’t fund the next leap. The frontier labs need massive revenue to justify massive training runs. If DeepSeek and Qwen keep releasing 90%-as-good models for free every quarter, that revenue base erodes. Not all at once. Quarter by quarter.

And there’s a transparency asymmetry that doesn’t get talked about enough. It is far easier to test and benchmark a local model than a proprietary data center-driven frontier model. When an enterprise evaluates Qwen, they can pull the weights, run their own evals on their own data, audit the behavior in detail. With a closed model, you’re trusting the vendor’s benchmarks and your limited API testing. For a risk-averse buyer, that transparency genuinely favors the open model regardless of raw capability.
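The asymmetry is mechanical: with open weights, evaluation is just a loop you run yourself, on your own data, with every input and output auditable. Here's a toy harness — the model function is a deliberately dumb stand-in; in practice you'd wrap your local inference stack behind it:

```python
from typing import Callable, Iterable, Tuple

def evaluate(model_fn: Callable[[str], str],
             dataset: Iterable[Tuple[str, str]]) -> float:
    """Exact-match accuracy of a model over (prompt, expected) pairs.

    With a local open-weight model, model_fn wraps your own inference
    stack, so nothing leaves your network. Against a closed API, the
    same loop is rate-limited, metered, and opaque.
    """
    pairs = list(dataset)
    correct = sum(model_fn(p).strip() == want for p, want in pairs)
    return correct / len(pairs)

# Stand-in "model" for illustration: echoes the prompt's last word.
def last_word_model(prompt: str) -> str:
    return prompt.split()[-1]

eval_set = [
    ("The capital of France is Paris", "Paris"),
    ("Two plus two equals 4", "4"),
    ("The opposite of hot is cold", "warm"),  # intentionally failing pair
]
```

The loop itself is trivial; the point is who gets to run it, on what data, and how many times.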

This strategy works even if Chinese models are never the best. They just have to be good enough, often enough, to keep the treadmill spinning and the margins thin. Make local models just good enough that the shine tarnishes on the big frontier models. Keep the board contested. Keep costs low. Wait.

That’s a much more traditionally Chinese strategic posture than trying to win a head-to-head capability race. China has been playing this exact game for forty years. Steel, solar panels, telecom equipment, electric vehicles — the playbook is always the same. Undercut on price, flood the market, collapse the margins for competitors, and wait for the other side to either give up or get absorbed. It works because China can subsidize the losses longer than anyone else can sustain the fight. AI is just the latest industry to get this treatment. You don’t have to win. You just have to make sure the other side can’t consolidate a dominant position.

Which brings me back to where I started. Why are the top local models Chinese and the frontier models American? Because that’s the whole game. China commoditizes the floor. America tries to build a ceiling high enough to justify the price of reaching it. And the rest of us are in the middle, swapping model strings in config files, wondering which forcing function is going to settle things — and hoping it isn’t the one that’s settled these questions historically.

I keep hoping we find a better way to pick winners than war. History isn’t encouraging on that front, but I figure it’s worth saying out loud.
