DeepSeek v3.1 is not having a moment
Recent and related:
DeepSeek-v3.1 - https://news.ycombinator.com/item?id=44976764 - Aug 2025 (253 comments)
From the perspective of China, it probably makes sense to train on local chips and try to dethrone Nvidia. I guess this means the PRC thinks AGI isn't around the corner and they can catch up on hardware.
It also seems reasonable to me to think AGI isn't around the corner, given how much current AI technology has failed on all fronts to deliver anything both general and intelligent.
IMO it was exceptionally obvious AGI isn't around the corner, because China is pursuing a long-term AI strategy vs. the US's market-driven, speculative, all-out sprint. When the country that produces a plurality of global AI talent, including the top percentiles, shows (AFAIK) no notable indication of hammering/evangelizing AGI like it's around the corner, it's probably not around the corner. There are also just very few Chinese AI researchers wanking about AGI timelines. There's enough AI talent there that if AGI were imminent they'd rally the CCP to pursue a whole-of-state effort much larger than what they're pursuing now; e.g. the National Natural Science Foundation of China guideline for AI research for 2025 allocates something like 15-20m USD for ~20 projects over the next 3-4 years. That's basically couch change.
That just means Xi doesn't believe AGI is around the corner. I think Xi is correct on this point, having watched Carmack's talk https://www.youtube.com/watch?v=4epAfU1FCuQ last month, but Xi is certainly a fallible human being, and he's not even an AI researcher, so he's even more fallible than other people whose opinions you could consult.
Here's my summary of Carmack's talk from my bookmarks file:
> #video of John Carmack saying video games, even Atari, are nowhere close to being solved by #neural-networks yet, which is why he’s doing Keen Technologies. They’re going to open-source their RL agent! Which they’re demoing playing an Atari 2600 (emulated on a Raspberry Pi) with an Atari 2600 joystick and servos (“Robotroller”), running on just a gamer laptop with a 4090. (No video of the demo, though, just talking head and slides.) He says he’s now using PyTorch just like anyone else instead of implementing his own matrix multiplies. Mentions that the Atari screen is 160×210, which I’d forgotten. They’re using AprilTag fiducials so their camera can find the screen, but patch them into the video feed instead of trying to get the lighting right for stickers. He says the motion-to-photons #latency at Oculus had to be below 20ms to avoid nausea (26'58”). Finding the Atari scores on the screen was surprisingly hard. #retrocomputing
Not even Altman thinks AGI is around the corner. It keeps the hype and the money flowing, though.
Is that why he's talking up Dyson spheres in interviews? The guy is a lunatic and a conman, either completely insane or evil, no other option. Here's the stupid quote:
Sam Altman: I do guess that a lot of the world gets covered in data centers over time.
Theo Von: Do you really?
Altman: But I don’t know, because maybe we put them in space. Like, maybe we build a big Dyson sphere around the solar system and say, “Hey, it actually makes no sense to put these on Earth.”
He seems to be saying something that's obviously correct here. What's your alternative perspective?
Why is that wrong? If like Altman you think that energy is the bottleneck to intelligence, and social and economic power grows with intelligence, then predicting that intelligence will optimize for energy collection seems reasonable. It isn't evil to predict that. And if he is insane to predict it, then I must be insane for not dismissing it.
Cassandra wasn't evil or crazy, she just had bad news.
But there is no proof that more energy == more intelligence. In some areas I am smarter than the best ChatGPT model, and my energy source is Taco Bell double deckers. Clearly there is a lot of low-hanging fruit for efficiency before needing to encompass the entire sun and suck it dry of energy. It's an absurd thing to suggest. It's exactly the type of thing a conman would suggest: something cool, fantastic, and completely impossible to actually implement.
I think you underestimate how much energy is in your food.
1 kcal = 4200 joules
1 watt-hour = 3600 joules
The Taco Bell double decker has 310 kcal = 1.3M joules
1.3M joules = 0.36 kWh
If each AI prompt takes 0.3 Wh, you could do ca. 1200 prompts per double decker. Which is a lot.
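As a sanity check, here's a quick back-of-the-envelope in Python (the 0.3 Wh/prompt figure is the assumed input here, not a measurement):

    # Energy in one Taco Bell double decker, expressed as AI prompts
    KCAL_TO_J = 4184        # joules per food kilocalorie (~4200 as used above)
    WH_TO_J   = 3600        # joules per watt-hour

    taco_j  = 310 * KCAL_TO_J         # ~1.3 million joules
    taco_wh = taco_j / WH_TO_J        # ~360 Wh = ~0.36 kWh

    WH_PER_PROMPT = 0.3               # assumed energy per AI prompt
    print(round(taco_wh / WH_PER_PROMPT))   # ~1200 prompts per taco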
> you could do ca. 1200 prompts per double decker. Which is a lot.
If humans had evolved to do prompts (while retaining everything else that makes human thought human), that number doesn't sound that big.
OTOH, if LLMs had to do everything humans need energy for, that number would be waaay too big for LLMs.
----
Humans don't even have an efficient energy input system. How many of those 1.3M joules actually get assimilated? Silicon is a lot more efficient with energy, because a lot of effort has been put into making it so, and it is fed raw energy. It doesn't need to process food like humans do; humans already did that for it when they captured the energy as electricity.
----
I'm sure there are more ways of making the comparison fairer, but I doubt your parent was trying to prove their claim with such deep research. So let me try another angle: no human can burn through as much energy as the top hosted LLMs do for one prompt, in as little time.
A million people like you can do a million times more things that require intelligence†, and would consume a million times more Taco Bell double deckers. Ergo, in that limited sense, more energy == more intelligence.
"Over time" clearly means that he's talking about the far future, not the next few years when they're cleaning up the low-hanging fruit. Rather than "absurd ... fantastic and impossible" I would describe the Dyson-sphere outcome as inevitable, unless humanity goes extinct within a few centuries. Maybe you thought he meant next March?
In https://epoch.ai/gradient-updates/how-much-energy-does-chatg..., Josh You, Alex Erben, and Ege Erdil estimate that a typical GPT-4o query uses 0.3 watt-hours. The estimate assumes it's a 400-billion-parameter MoE model in which ¼ of the parameters are activated for a given query, so each token requires 200 gigaflops (100 billion multiplies and 100 billion adds, I guess); that a typical query produces 500 output tokens (about a page of text); and that it runs on 1-petaflop/s H100 GPUs, with the overall cluster consuming 1500 watts per GPU at peak, a utilization rate of 10%, and average consumption of 70% of peak power. This works out to 1.05 kilojoules, or about 0.25 kilocalories, roughly a sixteenth of the food energy in one gram of carbohydrates.
So, a 320-kcal Double Decker® Taco™ works out to 1280 GPT-4o queries answered, and a standard 2000 kcal/day diet works out to 8000 GPT-4o queries per day, if we believe You, Erben, and Erdil's estimate. For someone looking for GPT-4o-quality output, if you are producing less than 8000 pages of writing per day, you are less energy-efficient than GPT-4o.
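For what it's worth, here's a minimal Python sketch of that calculation (every input is You, Erben, and Erdil's assumption as summarized above, not a measurement):

    # Rough per-query energy estimate for GPT-4o, following the Epoch AI write-up
    flops_per_token  = 200e9   # 2 flops per active parameter, ~100B active params
    tokens_per_query = 500     # ~1 page of output
    gpu_flops        = 1e15    # H100, ~1 petaflop/s peak
    utilization      = 0.10    # assumed flop utilization rate
    peak_watts       = 1500    # peak cluster power per GPU
    avg_fraction     = 0.70    # cluster averages 70% of peak power

    gpu_seconds = flops_per_token * tokens_per_query / (gpu_flops * utilization)
    joules      = gpu_seconds * peak_watts * avg_fraction   # ~1050 J per query
    print(joules / 3600, joules / 4184)   # ~0.29 Wh, ~0.25 kcal per query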
______
† although there's no guarantee they would, and it's not obvious that they would be a million times more intelligent, or even 10% more intelligent; they might just post a million times more poorly-thought-out sarcastic comments on forums due to their poor impulse control, resulting in long pointless arguments where they insult each other and dismiss obviously correct ideas as absurd because they come from people they hate
If he really thinks the shortest path to building a synthetic brain is to build an entire Dyson sphere I would submit his bottleneck is the algorithm, not energy.
Because we are talking about likely outcomes, not optimizing for one to the exclusion of all else, though. Even if AGI is right around the corner (which is a pretty low-percentage bet these days), cost alone would prevent such an outcome from being likely. Altman knows this, but being reasonable rarely sells.
I think he's just thinking about a longer timeline.
i.e. a prophet for profit
They're already making the robots to run these models, which are the complements they are commoditizing.
This blog is unreadable
This blog is excellent in general, particularly the AI posts. Yes they go extremely deep in places but the author freely admits that they are designed to be read selectively / skimmed in places.
I mean, it’s just a huge blob of text with emojis.
Yup, shame really. Just a bit of better formatting and typesetting could really go a long way there. :)
No space between paragraphs and blockquotes definitely hurts, and the subheading formatting is clearly not right. But the same content is posted on multiple sites. Maybe you'd find e.g. the Substack formatting more readable?
https://thezvi.substack.com/p/deepseek-v31-is-not-having-a-m...
Because it did not top the open-source leaderboards on any benchmark, except maybe the agent one. The hosted versions are not currently cheaper or faster than the other open-source models, either.