2001zhaozhao 2 hours ago [-]
I think it's worth noting that there are scenarios in which the trend of open models keeping up breaks down.
AI development speed is increasingly influenced by the quality of the model you are able to use internally. The frontier labs could easily pull ahead again if they increasingly withhold their best models (e.g. Claude Mythos) from the public entirely. They will benefit from increased R&D speed internally that cannot be matched by open labs.
Also, it seems plausible that frontier labs will eventually accumulate architecture-level improvements in their models that make them significantly more efficient than open LLMs. In that case, if the open labs cannot reverse engineer and replicate that, then their models will forever fall behind. On the other hand, any architectural innovations in the open model space can be used freely by closed frontier labs.
There's a counter-trend that favors open models, however: the design of model harnesses and multi-agent systems matters more and more to AI quality today relative to the raw intelligence of the model itself. This MIGHT mean that a bunch of dumb but cheap models in the right harness can compete very well against raw frontier intelligence on most practical tasks. (In other words, better harnesses make models more efficient at improving task completion by spending extra tokens.) This would give cheaper open models an advantage on any task they're smart enough to complete at all, since a good multi-agent harness might let them do those tasks reliably, and typically more cheaply than frontier models, even while pushing a higher number of raw tokens.
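The retry arithmetic behind that claim is easy to sketch. A rough Python model (the 40% single-shot success rate and unit cost are made-up numbers, not any real harness):

```python
def harness_reliability(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent
    tries by a cheap model passes the harness's verifier."""
    return 1 - (1 - p_single) ** attempts

def harness_cost(cost_per_attempt: float, attempts: int,
                 p_single: float) -> float:
    """Expected cost when the harness stops at the first success."""
    expected = 0.0
    p_reach = 1.0  # probability we are still retrying at this step
    for _ in range(attempts):
        expected += p_reach * cost_per_attempt
        p_reach *= (1 - p_single)
    return expected

# A cheap model that solves a task 40% of the time, retried up to 8 times:
print(round(harness_reliability(0.4, 8), 3))  # 0.983
print(round(harness_cost(1.0, 8, 0.4), 2))    # 2.46 attempts on average
```

So under these toy numbers, a verify-and-retry harness turns a 40% model into a ~98% system for under 2.5x the single-call cost, which is the "spend extra tokens for reliability" trade the comment describes.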
danny_codes 2 hours ago [-]
This seems like a wildly unlikely risk. Innovations in this space are just mathematical ideas, easy to write down in a paper and replicate.
It’s much more likely that performance will plateau and open weights will catch up asymptotically
oofbey 2 hours ago [-]
Mathematical ideas are very difficult to protect. But models can also be improved with brute force improvements in size. Imagine Mythos is a 32 trillion parameter model for example. That could be very difficult to replicate even though everybody knows exactly how it works.
echelon 2 hours ago [-]
> It’s much more likely that performance will plateau and open weights will catch up asymptotically
I really don't think so. This almost never structurally happens.
I think it'll be more like Linux on the Desktop.
Or Ubuntu on the smartphone.
Or Firefox.
We'll have open weights, but 99% of everything will go through hyperscalers.
esseph 2 hours ago [-]
> I really don't think so. This almost never structurally happens.
> I think it'll be more like Linux on the Desktop.
I think it will be Linux on the server, or the one that runs your watch, your phone, the radio or infotainment system in your car, maybe your thermostat, a bunch of medical devices and military devices, running in space shuttles and space stations and... You get the point. It's on everything.
hnav 2 hours ago [-]
Superior architectures will leak pretty quickly via engineers. Withholding your best models doesn't work unless you have no competition.
ninjagoo 42 minutes ago [-]
> Superior architectures will leak pretty quickly via engineers.
I agree with the outcome of your premise (i.e., openness), but for different reasons:
First, isn't it the case that these bleeding-edge 'newfangled' LLMs are basically variations on the same core ideas from "Attention Is All You Need" (2017) [1]? Different scale, but still the same basic architecture. Even the "MoE" innovation keeps the Transformer attention stack while replacing or augmenting the dense feed-forward/MLP part with routed expert blocks.
And, I would argue that engineers aren't the ones working on new architectures; that would be researchers.
That research is still open, so the outcome that you propose (openness) is likely to come to pass. Researchers/scientists gotta publish, otherwise it's not science (to quote LeCun [2]).
[1] https://arxiv.org/abs/1706.03762
[2] https://x.com/ylecun/status/1795589846771147018
> Withholding your best models doesn't work unless you have no competition.
It could also work if you DO have competition but your compute capacity is overbooked anyway, so releasing the better model doesn't actually make you that much more money (except for raising prices for the same amount of compute, which would give limited gains).
This is pretty much the situation Anthropic is in today.
hnav 25 minutes ago [-]
That just means that Anthropic is fucked unless they get more capacity.
ninjagoo 1 hours ago [-]
> American AI was financed on a particular bet. The bet was that frontier models would be the next great monopoly business
> The collision between those two facts — that American capital paid for a moat, and that the technology no longer provides one — is the most important force in the AI industry today.
> The open-weight ecosystem did not arrive in stages. It arrived in a wave. In late 2024, a Chinese lab named DeepSeek released a model
Looking at the assertions above, anyone passingly familiar with AI over the past few years will tell you that open weights and open research were the norm until OpenAI's GPT-3 came along, and even then the market eventually pushed them to release GPT-OSS. So what technology moat? There has never been one in AI. Training 100B+ or trillion+ parameter models in expensive runs was potentially a moat, until the Chinese startups showed in short order that it could be done for $6 million a run. Even the CUDA monopoly seems to be ending.
Also, no evidence referenced to back up any of the assertions. How do they know that the bet was that the frontier models would be the next great monopoly business? Especially when there were many from the outset: GPT, Anthropic, Llama, Deepmind, etc. etc.
I'd argue that the wholesale replacement of labor was and is the driver behind the capex, not monopoly dreams.
The starting premises appear to be, well, faulty. Whither the rest of the article?
GenerWork 2 hours ago [-]
There's a fourth option: the frontier labs are used by businesses who require (or think they require) the very best models and want to outsource model compliance such as HIPAA to someone else, and then individuals/smaller companies use the open source models.
chvid 1 hours ago [-]
Grab the weights. Seize the means of production. Tech workers of the world unite.
pkilgore 2 hours ago [-]
I can't tell if this was written by AI or by an author who has absorbed all of its worst tendencies, but past the bullet list at the front this was terrible writing. It's like the author was trying to meet a page requirement.
ramoz 2 hours ago [-]
> DeepSeek, Qwen, Kimi, GLM — running on the LangChain, vLLM, llama.cpp, and Ollama stack
"running on the LangChain" ??
EDIT: look, I think the general discussion is important, so I don't want to denounce the article. I, for one, am excited for better control, ownership, and accessibility of models. The ride labs take us on can be quite frustrating. Maybe there's even signal that the model progression is stalling (ie Opus 4.7). If that's true, then some of the notions made in the article are important to discuss. Ref https://x.com/ClementDelangue/status/2046622235104891138?s=2...
EDIT: this is not a complaint about the grammar. Look at my reply in the comments.
ramoz 2 hours ago [-]
There was a bigger opportunity here to mention OpenCode, Pi, etc.: open harnesses that provide access to the OSS models, a platform others can build around, and something enterprises can adopt in reliable ways, for the most dominant use case of AI today.
carterschonwald 2 hours ago [-]
More than that, it's pretty clear that there is an insane underinvestment in the harness layer. I've been iterating on my own ideas in that area through the lens of increasing reliability, and holy crap is there so much low-hanging fruit. I literally can't figure out a sustainable way to do the work without commercializing at that layer.
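One example of the kind of low-hanging fruit in that reliability layer (my illustration, not necessarily what the commenter is building) is plain self-consistency voting over independent samples:

```python
from collections import Counter

def self_consistency(answers: list[str]) -> tuple[str, float]:
    """Majority vote over independent model samples: a cheap
    harness-level reliability boost that needs no model changes.
    Returns the winning answer and the fraction that agreed."""
    counts = Counter(answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)

# Five independent samples from a (hypothetical) small model:
ans, agreement = self_consistency(["42", "42", "41", "42", "42"])
print(ans, agreement)  # 42 0.8
```

If per-sample errors are roughly independent, the majority answer is far more likely to be correct than any single sample, and the agreement fraction doubles as a confidence signal the harness can use to decide whether to retry or escalate.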
weikju 2 hours ago [-]
Running on the stack consisting of Langchain etc
Yeah not sure about both ollama and llama.cpp though lol
QuesnayJr 2 hours ago [-]
"the" is connected to "stack", not "LangChain". "LangChain" is an adjective that modifies "stack".
Cutting it off at "the LangChain" is like if I took the first sentence of your edit and said "look, I think the general" ?? You think the general?
ramoz 1 hours ago [-]
I wasn't complaining about grammar. I didn't even notice that.
retr0rocket 2 hours ago [-]
[dead]
renewiltord 2 hours ago [-]
Indeed, one day people told me they're running on the LAMP stack, and I said what's that and they said Linux Apache MySQL and PHP and I said "running on the Linux"? Everyone laughed really loud and the people who told me that were run out of the room. I then went to their desks and pissed on it because I'm quite clever that way. They let you do that when you're smart.
ramoz 2 hours ago [-]
Rigghht. If you're trying to make a point, go ahead and be straighter about it.
renewiltord 10 minutes ago [-]
Straighter about it??
cmwelsh 2 hours ago [-]
Can you contribute to our community when writing this stuff? Put it in a blog post, think about it, etc. You’ve been around long enough to know this is trash shit.
renewiltord 10 minutes ago [-]
[dead]
sergiotapia 2 hours ago [-]
One anecdote: I've been using Kimi K2.6 exclusively for code work for about a week now (since it was released) with opencode. I don't really miss claude/codex. Kimi is much faster and cheaper, and good enough for all my use cases writing Elixir and Ruby code.
And it'll only get better/cheaper.