LL3M: Large Language 3D Modelers
As someone who has used Blender for ~7 years, with over 1000 answers on Blender Stack Exchange and a total score of 48,000:
This tool is maybe useful if you want to learn Python, in particular the basics of the Blender Python API; I don't really see any other use for it. All the examples given are extremely simple to do; please don't use a tool like this, because it takes your prompt and generates the most bland version of it possible. It really takes only about a day to go through some tutorials and learn how to make models like these in Blender, with solid colors or some basic textures. The other thousands of days are what you would spend on creating correct topology, making an armature, animating, making more advanced shaders, creating parametric Geometry Nodes setups... But simple models like these you can create effortlessly, and those will be YOUR models, the way (roughly, of course) you imagined them. After a few weeks you're probably going to model them faster than the time it takes for prompt engineering. By then your imagination, your skill in Blender and your understanding of 3D technicalities will have improved, and they will keep improving from there. And what will you learn using this AI?
I think meshy.ai is much more promising, but even then I'd only consider using it if I wanted to convert a photo/render into a mesh with a texture properly positioned onto it, to then refine the mesh by sculpting - and sculpting is one of my weakest skills in Blender. BTW I made a test showcasing how meshy.ai works: https://blender.stackexchange.com/a/319797/60486
As someone who has tried to go through Blender tutorials for multiple days, I can tell you, there is no chance I can get close to any of these examples.
I think you might be projecting your abilities a bit too much.
As someone who wants to make and use 3d models, not someone who wants to be a 3d model artist, this tech is insanely useful.
I'm surprised if anything.
All the examples are really just primitives, either extruded in one step or the same thing with maybe 5 of them put together.
I don't want to sound mean but these are reachable with just another day at it. They really are.
>I don't want to sound mean but these are reachable with just another day at it. They really are.
Semi-related, understanding Sketchup took a couple of false starts for me. The first time I tried it, I could not make heads or tails of what I was doing. I must have spent hours trying to figure out how to model a desk, and I gave up. Tried again a year or two later, and it just didn't click.
The third try, a couple years later, it suddenly made sense, and what started as modeling a new desk turned into modeling my room, then modeling the whole house, and now I've got a rough model of my neighborhood. And it's so easy, once you know how - there's obviously a rabbit hole of detail work one can fall down into, but the basics aren't bad.
This is like saying, for 2D art, that line art is just using the pen tool. Sure, anyone can reproduce a single stroke, but figuring out what strokes to make has such a high skill ceiling.
No, the meshes involved are in the same ballpark as children's drawings for 2D art.
I'm sure the most difficult part here is just understanding the Blender UI. Clearly more difficult than picking up a pencil. But a tutorial video should suffice.
For the chair example, you pick a face on the default cube and then use the extrude tool on the left. Now you have a base.
Add 4 more cubes, and do the same. Now you have legs.
Then boolean them.
For the hat? Use a sphere, go to the sculpt tab and go ham.
There are way better ways to do this, of course. But really, there is not such a high degree of skill involved here, and being just a little more patient (one more day of trying) is not that much to ask.
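And if someone prefers to see it as a script: here's a minimal sketch of that same chair recipe in the Blender Python API (bpy). All dimensions and placements are made up purely for illustration.

```python
import bpy

def add_box(location, scale):
    # Add a unit cube and scale it into a box.
    bpy.ops.mesh.primitive_cube_add(size=1.0, location=location)
    obj = bpy.context.object
    obj.scale = scale
    return obj

# Seat: one flattened cube.
seat = add_box(location=(0, 0, 1.0), scale=(1.0, 1.0, 0.1))

# Legs: four thin cubes at the corners.
legs = [add_box(location=(x, y, 0.5), scale=(0.1, 0.1, 1.0))
        for x in (-0.4, 0.4) for y in (-0.4, 0.4)]

# Join everything into one object (a Boolean union would also work).
bpy.ops.object.select_all(action='DESELECT')
for obj in [seat] + legs:
    obj.select_set(True)
bpy.context.view_layer.objects.active = seat
bpy.ops.object.join()
```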
My point is that learning to use the tool is not the part people struggle with. The open-ended nature of creation is what is actually hard. Sure, it may be primitives, but figuring out which primitives are needed, what dimensions they need, and where they should go is not easy. Every time I attempt sculpting, whatever I do turns into an abomination. That's what happens when I go ham. Not everyone with one day of practice is going to be able to deconstruct what they have in their mind into the parts they need to create, or the steps they need to take, to get it to look right.
0.0001% of the population can sculpt 3D and leverage complex 3D toolchains. The rest of us (80% or whatever - the market will be big) don't want to touch those systems. We don't have the time, patience, or energy for it, yet we'd love to have custom 3D games and content quickly and easily. For all sorts of use cases.
But that misses the fact that this is only the beginning. These models will soon generate entire worlds. They will eventually surpass human modeller capabilities and they'll deliver stunning results in 1/100,000th the time. From an idea, photo, or video. And easy to mold, like clay. With just a few words, a click, or a tap.
Blender's days are numbered.
I'm short on Blender, Houdini, Unreal Engine, Godot, and the like. That entire industry is going to be reinvented from scratch and look nothing like what exists today.
That said, companies like CSM, Tripo, and Meshy are probably not the right solutions. They feel like steam-powered horses.
Something like Genie, but not from Google.
> These models will soon generate entire worlds. They will eventually surpass human modeller capabilities and they'll deliver stunning results in 1/100,000th the time. From an idea, photo, or video. And easy to mold, like clay. With just a few words, a click, or a tap.
This is a pretty sweeping and unqualified claim. Are you sure you’re not just trying to sell snake oil?
I'm sure he is just trying to sell snake oil.
I've been predicting this since Deep Dream (which feels like a century ago) and HN loves to naysay.
I claimed three years ago that AI would totally disrupt the porn and film industries and we're practically on the cusp of it.
If you can't see how these models work and can't predict how they can be used to build amazing things, then that's on you. I have no reason to lift up anybody that doubts. More opportunity on the table.
"On the cusp" means nothing. We are on the cusp of AGI, Tesla Autopilot, cryptocurrency taking over, achieving nuclear fusion, and a bunch of other things. Companies don't sell working products anymore, they sell products that are "on the cusp of working".
We have been on the cusp of some things for literal decades.
> I claimed three years ago that AI would totally disrupt the porn and film industries and we're practically on the cusp of it.
Meh. We were on the cusp 5 years ago. Five years later, we're still on the cusp?
Maybe I'm working with a different meaning of "cusp", but to me "On the cusp of $FOO" means that there is no intervening step between now and $FOO.
The reality is that there are uncountable intervening steps between now and "film industry disrupted".
FWIW I'm a 3D modeller (hard surface Blender modelling, ~10yrs) and I've been reading your comments for a while now. Reality wasn't disrupted quite as far as you suggested, most of the naysayers that advised restraint under your comments have largely been proven right. Time and time again, you made enormous claims and then refused to back them up with evidence or technical explanations. We waited just like you asked, and the piper still isn't paid.
Have you ever asked yourself why this revolution hasn't come yet? Why we're still "on the cusp" of it all? Because you can't push a button and generate better pornography than what two people can make with a VHS camera and some privacy. The platonic ideal of pornography and music and film and roleplaying video games and podcasting is already occupied by their human equivalent. The benchmark of quality in every artistic application of AI is inherently human, flawed, biased and petty. It isn't possible to commoditize human art with AI art unless there's a human element to it, no matter how good the AI gets.
There's merit to discussing the technical impetus for improvement (which I'm always interested in discussing), but the dependent variables here seem exclusively social; humanity simply might never have a Beatlemania for AI-generated content.
I don't work in the field but I observe it pretty closely, and comments like this remind me of the people I spoke to in the 1990s who said that Windows and Intel would never replace their Unix workstations.
Right now if I go on LinkedIn most header images on people's posts are AI generated. On video posts on LinkedIn that's a lot less, but we are beginning to see it now. The static image transition has taken maybe 3 years? The video transition will probably take about the same.
There's a set of content where people care about the human content of art, but there is a lot of content where people just don't care.
The thing is that there is a lot of money in generating this content. That money drives tool improvement and those improved tools increase accessibility.
> Have you ever asked yourself why this revolution hasn't come yet?
We are in the middle of the revolution which makes it hard to see.
I hope the walls don't cave in on you. Eyes up. My friends in VFX are adopting AI workflows and they say that it's essential.
> Why OnlyFans May Sell for 75% Less Than It’s Worth [1, 2]
> Netflix uses AI effects for first time to cut costs [3]
Look at all of the jobs Netflix has posted for AI content production [4].
> Gabe Newell says AI is a 'significant technology transition' on a par with the emergence of computers or the internet, and will be 'a cheat code for people who want to take advantage of it' [5]
Jeffrey Katzenberg, the cofounder of DreamWorks [6]:
> "Well, the good old days when, you know, I made an animated movie, it took 500 artists five years to make a world-class animated movie," he said. "I don't think it will take 10% of that three years out from now," he added.
I can keep finding no shortage of sources, but I don't want to waste my time.
I've brushed shoulders with the C-suite at Disney and Pixar and talked at length about this with them. This world is absolutely changing.
The best evidence is what you can already see.
[1] https://www.theinformation.com/articles/onlyfans-may-sell-75...
[3] https://www.bbc.com/news/articles/c9vr4rymlw9o
[4] https://explore.jobs.netflix.net/careers?query=Machine%20Lea...
[5] https://www.pcgamer.com/software/ai/gabe-newell-says-ai-is-a...
[6] https://www.yahoo.com/entertainment/cofounder-dreamworks-say...
Frankly, that is all just speculative, once again. AI is hitting a significant roadblock. Look at how disappointing GPT-5 was. No amount of compute is ever going to live up to the hype in those quotes.
The C-suite who don't realize how wrong they are about AI's potential are going to be facing a harsh reality. And artists will be the first to be hurt by their HYPE TRAIN management style and mindset.
Edit: most of all, the 3D generation from this LL3M model is about the same as the GenAI 3D models from a year ago... and two years ago... A good counterpoint would be Tubi's recently released, mostly AI-generated short films. They were garbage and looked like garbage.
Netflix's foray, if memory serves, was a single scene where a building collapses. Hardly industry-shattering. And 3D modeling and GenAI images/videos are substantially different.
The only consequence they will be facing is being parachuted out with boatloads of money after they have failed to deliver on their magical promises.
Your prediction compresses 24 hours into a single second or a single day of work into a third of a second. How exactly do you expect to be proven right when just the network latency alone will eat a big chunk of that time?
You'll literally be proven wrong simply because the AI will take time to generate things even if the quality of the output is high.
> practically on the cusp of it.
Two Girls One Cusp.
> More opportunity on the table.
Hate to disappoint you, but as the models get better, and eventually deliver the results, you won't have to wait a microsecond until the masses roll in to take advantage.
Blender will just add AI creation/editing
There are probably already a bunch of Blender Add-Ons or extensions that build with AI that are in the approval queue and just being ignored. https://extensions.blender.org/approval-queue/
Why bolt magic rocket boosters onto a horse?
That's like saying we'll add internet browsing and YouTube to Lotus 1-2-3 for DOS.
It's weird, legacy software for a dying modality.
Where we're going, we don't need geometry node editing. The diffusion models understand the physics of optics better than our hand-written math. And they can express it across a wide variety of artistic styles that aren't even photoreal.
The future of 3D probably looks like the video game "Tiny Glade" mixed with Midjourney and Google Genie. And since it'll become so much more accessible, I think we'll probably wind up blending the act of creation with the act of consuming. Nothing like Blender.
> The diffusion models understand the physics of optics better than our hand-written math.
How well will they do with something like creating two adjacent mirrors with 30 degree angle between them with one of them covered with varying-polarized red tinted checkerboard pattern?
I don't know. Diffusion is weird and a little uneven.
They do a better job of fluid sim than most human efforts to date. And that's just one of thousands of things they do better than our math.
Haha, despite the thousands of instances where it DOESN'T simulate correctly. Very specifically chosen, 10,000-shot generated videos show somewhat impressive fluid physics... and even then, it MUST be something it's seen before. Diffusion is in NO way modeling physics in a realistic manner; there is not an infinite amount of training data to show all fluid dynamics...
Now I know you're too far down the hype rabbit hole. Either that, or you lack a cursory understanding of diffusion models.
I think you'll come to realize that the margin between people willing to learn Blender today and people who want to generate models but won't learn how is razor thin.
What's the use case of generating a model if all modelling and game engines are gone?
All these pro-AI framings hinge on the fact that they can't tell AI output apart from human art. That's like saying that because they don't know what an improper bounds check is, the code must be secure. It's just broken logic.
> you'll come to realize
No. The Roblox of this space is going to make billions of dollars.
There's going to be so much money to make here.
So, an AI-generated pseudo-game engine with a majority of users under the age of 13? I'm sure that WILL make a lot of money. Those of us who didn't grow up playing Roblox will find this comparison impossibly stupid.
Somewhat related: I'm still amazed that no one has made a Roblox competitor, as in, a vague social building game that tricks children into wasting money on ridiculous MTXs. Maybe you are right, but I think that taking an already sorry state of affairs, and then removing the only imagination or STEM skills required by giving children access to GenAI... is a really depressing thought.
I kinda meandered with my point lol.
> So, an ai generated psuedo-game engine with a majority of users under the age of 13? I'm sure that WILL make a lot of money. Those of us who didn't grow up playing Roblox will find this comparison impossibly stupid.
> ...with a majority of users under the age of 13? I'm sure that WILL make a lot of money.
> ...will find this comparison impossibly stupid.
I'm ignoring the insinuations here for obvious reasons.
1. Roblox is the newest (note: not necessarily the best) iteration of the genre that Secondlife & (to a limited extent) modded Minecraft servers occupy: An interactive 3D platform that permits user-generated content.
2. Generative models just accelerate their development up to the brick wall of complexity much faster.
> Some what related: im still amazed that no one has made a Roblox competitor
This comment is just the HN Dropbox phenomenon, *again*, only this time from the angle that thinks it's easy to build a "pseudo game-engine" from scratch.
https://news.ycombinator.com/item?id=8863
Few competitors exist because of the moat they have built in making their platform easy to develop on, so much so that kids can use them with little issue.
> , as in, a vague social building game that tricks children into wasting money on ridiculous MTXs.
This part is entirely separate from the technical aspects of the platform. Roblox is a feces-covered silver bar, but the silver bar (their game platform) still exists.
> Maybe you are right, but I think that taking an already sorry state of affairs, and then removing the only imagination or STEM skills required by giving children access to GenAI.... is a really depressing thought.
This is a hyper-nihilistic opinion on children laid bare.
To think that the children (*with the dedication to make a game in the first place*) wouldn't try to learn about debugging the code that the models are spitting out, or that 100% of them would just stop writing their own code entirely, is a cynical viewpoint not worth any attention.
>What's the use case of generating a model if all modelling and game engines are gone?
Because using an LL3M-style technique will probably be cheaper and better (fidelity-, consistency- and art-direction-wise) than generating the entire video/game with a video generation model.
> These models will soon generate entire worlds.
They may. It's hard to expect this when we already see LLMs plateauing at their current abilities. Nothing you've said is certain.
AI will just be cheaper procedural environments
> That entire industry is going to be reinvented from scratch
Hey, I heard that one before! The entire financial industry was supposed to have been reinvented from scratch by crypto.
Well, it kinda did change things up a bit. Being able to receive payments across borders without significant delays or crazy fees is a decent perk. You can hate crypto culture and grifters trying to make a quick buck, but its applications are very real.
Won’t you get taxed on “gains” when you do that and then eventually convert to fiat?
I was considering this path a few years ago but all my research pointed to me being taxed for moving my own money from one country to another. Which would’ve cost significantly more than a good ol’ bank transfer. (I needed the fiat on the other end)
My understanding was that as far as the receiving bank is concerned, the converted crypto would’ve appeared out of an investment/trading platform and needed to be taxed
The bank transfer cost like a couple of bucks anyway so it wasn’t worth the risk of trying the crypto route in the end for me.
If you use stablecoins there will be no gains or losses.
This reminds me of Elon Musk's recent claims on the future of gaming:
This – but in real-time – is the future of gaming and all media
https://x.com/elonmusk/status/1954486538630476111
Just don't slap a release year on this future and I'll be compelled to agree.
The only sculpting example I see is the very first hat. Do you want to tell me you wouldn't be able to sculpt that?
I perfectly understand the time/patience/energy argument and my bias here. But even the Spore (video game) editor, with all its limitations, gives you a similar result to the examples provided, and at least there you are the one giving shape to your work, which gives you more control and your art more soul, and moreover puts you on a creative path where the results keep getting better.
Will AI soon surpass a human modeller? I don't know... I hear so much hype for AI, and I have fallen victim to it myself: I spent quite some time trying to use AI for some serious work and guess what - it works as a search engine. It will give me an ffmpeg command that I could DuckDuckGo anyway, it will give me an AutoHotkey script that I could figure out myself after a quick search, etc. The LLM fails even at the tasks that seem optimal for it - I have tried multiple times to translate movie subtitles with it, and while the translation was better than conventional machine translation, at some point the AI goes crazy and decides to change the order of scenes in the movie - something I couldn't detect until I watched the movie with friends, so it was a critical failure. I described a word, and the AI failed to give me the word I couldn't remember, while a simple search in a thesaurus succeeded. I described what I remembered of a quote, and the AI failed to give me the quote, but my google-fu was enough to find it.
You probably know how to code, and would cringe if someone suggested that you just ask the AI to write the code for a video game without knowing how to code yourself, at least enough to supervise and correct it - and yet you think 3D modelling will be good enough without the intervention of a 3D artist. Maybe, but as someone experienced in 3D I just don't see it, just like I don't see AI making Hollywood movies, even though a lot of people claim it's a matter of years before that becomes reality.
Instead what I see is AI slop everywhere, and I'm sure video games will be filled with AI crap, just like a lot of places were filled with machine-learning translations because Google seriously suggested at its conferences that the translations were good enough (and if someone speaks only English, the Dunning-Kruger effect kicks in).
Sure, eventually we might have AGI and humanity will be obsolete. But I'm not a fan of extrapolating hyperbolic data; one YouTuber estimated that in a couple of decades Earth will be visited by aliens, because by then there won't be enough Earthlings to account for his channel's viewership stats.
100% of the population has all the tools needed, plus ChatGPT for free, to write a novel. Only 0.0001% are able to complete even a short story - they often can't hold a complete and consistent plot in their head.
"AI allows those excluded from the guild" is total BS.
Gut figures: ~85% of creativity comes from skill itself, ~10% or so comes from prior art. And it's all multiplied by willingness (a value in [0, 1]), which for >99.9999% of the population is << 0.0001. Tools just don't change that; they just weigh down the creativity part.
Wrong tutorials. A lot of these models consist of just taking a primitive like a sphere, scaling it, then creating another primitive, scaling it, moving it, so you end up with overlapping hulls ("bad" topology). Then in shading you just create a default material and set its color.
There are models in the examples that require e.g. extrusion (which is literally: select faces, press E, drag mouse).
Some shapes are smoothed/subdivided with a Catmull-Clark Subdivision Surface modifier, which you can add simply by pressing CTRL+2 in "Object Mode" (the digit is the number of subdivisions; basically use 1 to 3, you may set more for renders).
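In script form, that whole recipe is only a few lines of Blender Python - a minimal sketch (a primitive, a flat-colored default material, and a Catmull-Clark Subdivision Surface modifier; the specific values here are arbitrary):

```python
import bpy

# Add a primitive and scale it (overlapping hulls would just be more of these).
bpy.ops.mesh.primitive_uv_sphere_add(radius=1.0, location=(0, 0, 0))
obj = bpy.context.object
obj.scale = (1.0, 1.0, 0.6)

# Default material with a solid color.
mat = bpy.data.materials.new(name="FlatColor")
mat.diffuse_color = (0.8, 0.2, 0.2, 1.0)  # RGBA
obj.data.materials.append(mat)

# Catmull-Clark Subdivision Surface modifier (what CTRL+2 does interactively).
subsurf = obj.modifiers.new(name="Subdivision", type='SUBSURF')
subsurf.levels = 2          # viewport subdivisions
subsurf.render_levels = 3   # render subdivisions
bpy.ops.object.shade_smooth()
```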
Here's a good, albeit old tutorial: https://www.youtube.com/watch?v=1jHUY3qoBu8
Yes, I made some assumptions when estimating that it takes about a day to learn to make models like this: you have a free day to spend entirely on learning, and as a Hacker News user your IQ is above average and you're technically savvy. And one last assumption: you learn the required skills evenly, rather than going deep into the rabbit hole of e.g. correct topology; if you go through something like Andrew Price's doughnut tutorial, it may take more than a day, especially if you play around with the various functions of Blender rather than strictly following the videos - but you will end up making significantly better models than the examples presented, e.g. you will know to inset a cylinder's ngons to avoid the Catmull-Clark subdiv artifacts you can see in the 2nd column of hats.
> this tech is insanely useful.
No, it isn't, but you don't see it, because you don't have enough experience to see it (Dunning-Kruger effect) - this is why I mentioned my experience, not to flex but to point out I have the experience required to estimate the value of this tool.
> No, it isn't, but you don't see it, because you don't have enough experience to see it (Dunning-Kruger effect)
That crosses into personal attack. Please don't do this. You can make your substantive points without it.
It's amazing how little understanding some people with "a gift" for certain skills have.
I play guitar, it's easy and I enjoy it a lot. I've taught some friends to play it, and some of them just... don't have it in them.
Similarly, I've always liked drawing/painting and 3D modeling. But for some reason, that part of my brain is just not there. I just can't do visualization. I've even tried award-winning books (Drawing on the Right Side of the Brain) without success.
Way back in the day I tried 3D modeling with AW Maya, 3D Studio Max and then Blender. I WANTED to convert a sphere into a nice warrior; I was dying to make 3D games: I had the C/C++ part covered, as well as the OpenGL one. But I couldn't model a trash can, even after following all the tutorials and books.
This technology solves that for those of us who don't have that gift. I understand that for people who can "draw the rest of the fking owl" it won't look like much, but darn, it opens up a world for me.
I'm similar, honestly. I've spent countless hours trying to become a good drawer and a good 3D modeler, but I lack the ability to see something clearly in my mind's eye, and it feels like it's always held me back.
The thing is, I've actually worked as a 3D artist for a number of years. Some people even tell me I'm good. I suppose if that's true at all, it's because I've learned to use the computer to do the visualizing for me.
For some other artists, their process seems to be that they first picture a 'target' image in their mind, and then take steps towards that target until the target is reached. That seems impossible to me -- supernatural stuff. I almost don't believe they can really do it.
My process is closer to first finding some reference images, then taking a step in a random direction and asking whether I'm closer or further away from those references. I'm not necessarily trying to copy the references exactly, I'm just trying to match their level of quality. Then I take another random step, and check again. If you repeat this process enough times, you'll edge closer and closer to something that looks good. You'll also develop a vague sense of 'taste', around which random movements tend to produce more favourable results, and which random movements tend to produce more ugly results. It's a painful process, but it's doable.
I guess what I'm trying to say is that the ability to visualize isn't a prerequisite for 3D modeling.
I can agree with this. If someone has some kind of disability, like aphantasia (I don't know if it really applies here, as you can look at a reference image) then perhaps the tool is useful. The thing is, none of the examples presented in this particular AI tool are stuff that require hard 3D-related skills e.g. knowing human anatomy.
I wish I could see you struggling to model a trash can and see if maybe you didn't have too high requirements for the quality of said trash can. After all it's just taking a cylinder, insetting the top face and extruding it down, and the top you can model in the exact same way. The rest is detail that the AI tool in question is terrible at. https://i.imgur.com/xeFrgpP.gif
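For the record, that recipe as a script is about this much - a rough, illustrative Blender Python sketch (the radius, inset and extrude amounts are made up):

```python
import bpy
import bmesh

# Start from a cylinder.
bpy.ops.mesh.primitive_cylinder_add(vertices=32, radius=1.0, depth=2.0)
obj = bpy.context.object

bpy.ops.object.mode_set(mode='EDIT')
bpy.ops.mesh.select_mode(type='FACE')
bpy.ops.mesh.select_all(action='DESELECT')

# Select only the top face (normal pointing up).
bm = bmesh.from_edit_mesh(obj.data)
for f in bm.faces:
    f.select = f.normal.z > 0.9
bmesh.update_edit_mesh(obj.data)

# Inset the rim, then extrude the inner face down to hollow out the can.
bpy.ops.mesh.inset(thickness=0.08)
bpy.ops.mesh.extrude_region_move(
    TRANSFORM_OT_translate={"value": (0.0, 0.0, -1.8)})

bpy.ops.object.mode_set(mode='OBJECT')
```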
Hah! You should have seen me "drawing" a coffee cup that was in front of me at a drawing class: the cup was sitting there, I was seeing it, and supposedly I was drawing what I saw. The teacher came and told me: squint your eyes, draw "lights and shadows". Theoretically, I did that, but my cup just didn't look like the others haha.
The teacher then asked me for my pencil and started making some adjustments to my drawing. The shitty cup just came alive with some touches here and there. All I could ask was HOW??? How did she SEE that?
The book "drawing with the right side of the brain" goes over it: A lot of who are strongly (brain) left-sided see a Cup and "abstract" away the forms, we are constatly drawing "lines" (like, drawing a sticky-figure person,a head is a circle, then body is a line, girl skirt is a triangle, etc) and just cannot actually get past that reasoning in our brain.
I remember getting the same piece of advice from the teacher. Problem is, even before getting it, I was already applying it, being a rare kid experienced in computer graphics. The teacher was just repeating a phrase she heard somewhere, without actual competence to direct me.
The way I see, and I think the way most people see, is that I have subpixels, not distributed in a square grid, small enough and too many to count - but I can see them when I close my eyes. It's somewhat similar to looking at colored noise - something like this: https://i.imgur.com/1P3n80k.gif except you would have to display it on a ridiculously high-resolution display (I don't know, 64k or maybe more) and it would represent just a small fragment of the view.
Of course this unordered constellation of cones can be mapped onto a grid of pixels or a space on paper, so the only problem is that I can't make a measurement in my head, and I need to calibrate my "eye-balling" measurement to figure out where on the paper I should put what I see. I deal with it typically by imagining vertical and horizontal lines to subdivide my view, and then I likewise subdivide the paper.
So I don't really have a problem drawing what I see, the problem I have is the missing technique of how to use a pencil to draw what I actually want to draw.
I think most people work the same way but apparently you don't?
Most of these 3D asset generation tools oversimplify things down to stacking primitives and call it modeling, which skips fundamentals like extrusion, subdivision, and proper topology. If they wanted to make a tool that's actually worthwhile, what do you think the core features should be? It would be great if it enforced clean topology and streamlined subdivision workflows, but given your experience I'm curious what you'd consider essential.
I could probably write a book to answer this question :D Blender has gone the "everything nodes" route, and in particular it created the Geometry Nodes system. I'm very good with geonodes and pretty much specialize in them, and yet I think the system is severely flawed: the nodes in compositing and shading work very well, but in geonodes they are too low-level, and you don't get the easy learning curve of usual node systems (learning geonodes is very hard), while you get all the annoyances of using a very shoddy programming language and having to manage node positioning and fight with noodles...
...And here's where AI comes into play, if AI could be contained within steps like these:
- Input node: describe where the starting data comes from, and the AI automatically loads a file from the hard drive or the Internet, or generates a primitive
- Select node: describe a pattern by which to select elements of the geometry
- Modify Geometry node: perhaps this should be split into multiple nodes, as there are so many ways to modify geometry
- Sample/connect data node: create an attribute and describe its relation to something else, to create an underlying algorithm populating that attribute
- Save node: do you want to output the data through the usual pipeline, or maybe export to a file, or save to a simulation cache?
This way AI could do the low-level stuff that I think it excels at, because this low-level stuff is so repeatable that AI can be well trained on it, while the high-level decision making would remain in the artist's control.
These models could still be useful when rendered. But when animated or in a game probably less so. Maybe as a prototype to get funding and hire an artist.
There are countless troves of CC-licensed assets that would be better suited.
I'd like to see an AI trained to search for an asset!
One of my hobbies is Houdini, which is like Blender. While I agree with you that you can build a nice parameterised model in a few days - if you want to make an entire scene or a short film, you will need hundreds if not thousands of models, all textured and retopologized, and many of them rigged, animated or even simulated.
What this means is that making even a 2 minute short animation is out of reach for a solo artist. Your only option today is to go buy an asset pack and do your best. But then of course your art will look like the asset pack.
AI Tools like this reduce one of the 20+ stages down to something reachable by someone working solo.
> What this means is that making even a 2 minute short animation is out of reach for a solo artist.
Is it truly the duration of the result that consumes effort and the number of people required? What is the threshold for a solo artist? Is it expected that a 2 minute short takes half as much effort/people as a 4 minute short? Does the effort/people scale linearly, geometrically, or exponentially with the duration? Does a 2 minute short of a two entity dialog take the same as a 4 minute short of a monologue?
> Your only option today is to go buy an asset pack and do your best. But then of course your art will look like the asset pack.
What's more valuable? That you can create a 2 minute short solo or that all the assets don't look like they came from an asset pack? The examples shown in TFA look like they were procedurally generated, and customizations beyond the simple "add more vertexes" are going to take time to get a truly unique style.
> AI Tools like this reduce one of the 20+ stages down to something reachable by someone working solo.
To what end? Who's the audience for the 2 minute short by a solo developer? Is it meant to show friends? Post to social media as a meme? Add to a portfolio to get a job? Does something created by skipping a large portion of the 20+ steps truly demonstrate the person's ability, skill, or experience?
> Your only option today is to go buy an asset pack and do your best.
There is a real possibility that the assets generated by these tools will look equally or even more generic, the same way generated images today are full of tells.
> What this means is that making even a 2 minute short animation is out of reach for a solo artist.
Flatland was animated and edited by a single person. In 2007. It’s a good movie. Granted, the characters are geometric shapes, but still it’s a 90 minute 3D movie.
https://en.wikipedia.org/wiki/Flatland_(2007_Ehlinger_film)
Puparia is a gorgeous 2D animated film done by a single person in 2020.
https://en.wikipedia.org/wiki/Puparia
These are exceptional cases (by definition, as there aren’t that many of them), but do not underestimate solo artists and the power of passion and resilience.
There are always exceptions. I think the parent is referring to the many solo artists who would almost be able to make such great movies if not for time constraints or life events, etc. I'm sure there are countless solo artists who made 75% of a great movie and then ran out of time for unforeseeable reasons. Making creation a bit easier allows many more solo artists to create!
Puparia is a 3 minute short film that took a veteran artist 3 years to make. I think you're making OP's point.
As a designer/dev working on AI for customer service tools, who constantly has to remind stakeholders that LLMs aren't creative, aren't good at steering conversations, etc., I wish there was more focus on integrating AI into tools in ways that make work faster, rather than trying to do it all. There's still so much low-hanging fruit out there.
Other than the obvious (IDEs), I wish there were more tools like Fusion 360's AI auto-constraints. It saves so much time on something that is mostly tedious and uncreative. I could see similar integrations for Blender (honestly, the most interesting part of what OP posted is changing the materials... it could save a lot of time spent connecting noodles).
Tedious tasks, like retopologising, UV unwrapping and rigging would be great examples of where AI in tools like Maya and Blender could be really useful.
There is not that much available data for these, so it will take new techniques to get AIs that are truly good.
What if I don't want to spend a few weeks learning Blender? What if I just want to spend a couple of hours and get something that's good enough?
If you think the results on that page are "good enough", then I assure you, as a heavy Blender and GenAI user, that it would take you much less time to get this good at Blender (about Logo-turtle level) than it would to figure out how to run this model yourself locally, with all the headaches attached.
Without question.
Sounds like you're looking for something like an asset store, or open source models.
The best thing you can do is to just make money. I am serious.
The current 3D GenAI isn't that good. And when it eventually becomes good enough, it won't be very cheap to run locally, at least for quite a while. You just need to wait and hoard spare cash. Learning how to use the current models is like trying to get GPT-1 to write code for you.
Then you don't get any positive recognition for the product anyway.
... is like...
What if I don't want to learn guitar? What if I just want to spend a couple of hours and get something that sounds like guitar?
I tend to say in this situation: you can do that. Nobody's stopping you. But you shouldn't expect wider culture to treat you like you've done the work. So what new creative work are you seeking to do with the time you've saved?
I just want to make a fun 3d model of my dog! It's like complaining I haven't taken a professional photography course when I snap photos of my kids.
Yeah but remember this tool is, today, the worst it will ever be.
This kind of work will only improve, and we're in early days for these kinds of applications of LLM tech.
I wish that “it will get better” wasn’t the response every time someone shares actionable advice and thoughtful specific criticism about the state of the art.
You don’t know if it will get better. Even if it does, you don’t know by how much or the time frame. You don’t know if it will ever improve enough to overcome the current limitations. You don’t know if it will take years.
In the meantime, while someone is sitting on their ass for years waiting for the uncertain future of the tool getting better, someone else is getting their hands dirty, learning the craft, improving, having fun, collaborating, creating.
There is plenty of garbage out there where we were promised “it will only get better”, “in five years (eternally five years away) it will take over the world”, and now they’re dead. Where’s the metaverse NFT web3 future? Thrown into a trash can and lit on fire, replaced by the next embarrassment of chatting with porn versions of your step mom.
https://old.reddit.com/r/singularity/comments/1mrygl4/this_i...
> You don’t know if it will get better. Even if it does, you don’t know by how much or the time frame. You don’t know if it will ever improve enough to overcome the current limitations. You don’t know if it will take years.
You are _technically_ correct but if I base my assumptions on the fact that almost all worthwhile software and technology has gotten better over the years, I feel pretty confident in standing behind that assumption.
> In the meantime, while someone is sitting on their ass for years waiting for the uncertain future of the tool getting better, someone else is getting their hands dirty, learning the craft, improving, having fun, collaborating, creating.
This is a pretty cynical take. We all decide where we prioritize our efforts and spend our time in life, and very few of us have the luxury to freely choose where we want to focus our learning.
While I wait for technologies I enjoy but haven't mastered to get better, I am certainly not "sitting on my ass". I am dedicating my time to other necessary things like making a living or supporting my family.
In this specific case I wish I could spend hours and hours getting good at Blender and 3D modelling or animation. Dog knows I tried when I was younger. But it wasn't in the cards.
I'm allowed to be excited at the prospect that technology advancements will make this more accessible and interesting for me to explore and enjoy with less time investment. I also want to "get my hands dirty, learn, improve, have fun, create" but on my own terms and in my own time.
Any objection to that is shitty gatekeeping.
Since you used the term "shitty gatekeeping": to me, your comment reads like the most generic kind of optimism you see everywhere on the internet about everything ever. Shitty optimism.
No one told you that you weren't allowed to be excited, but you took it that way anyway.
Fair enough! My original comment was pretty generic and flippant. That's totally valid.
> almost all worthwhile software and technology has gotten better over the years
You only know if it was worthwhile in hindsight. We aren’t there yet. And “better” is definitely debatable. We certainly do more things with software these days, but it’s a hard sell to unambiguously say it is better. Subscriptions everywhere, required internet access, invasions of privacy left and right, automated rejections, lingering bugs which are never fixed…
Your exact argument was given by everyone selling every tech grift ever. Which is not to say this specific case is another grift, only that you cannot truly judge the long-term impact of something while it is being invented.
> Any objection to that is shitty gatekeeping.
If gatekeeping is what you took from my comment, you haven’t understood it. Which could certainly mean my explanation wasn’t thorough enough. My objection is to the hand-wavy “this will only improve” commentary which doesn’t truly say anything and never advances the discussion, yet is always there. See the “low-hanging fruit” section of my other comment.
> You only know if it was worthwhile in hindsight.
Yes that's because, so far, we haven't been able to see the future. Which is why we base predictions and assumptions on past performance and on lived experience. Sometimes we will be wrong.
You're also arguing in the abstract here while I am speaking about this specific topic of using LLMs to improve 3D modelling tooling.
Are you arguing that neither LLMs or 3D modelling tools are "worthwhile"?
Are you suggesting that improvements that make these tools more accessible, even incrementally, are a bad thing?
Or are you just challenging my right to make assumptions and speculate? I realize that we may not be even on the same page here.
You're also cherry-picking a limited number of examples where software isn't better, and I agree with all those (and never said that software universally gets better), but those examples are a tiny subset of what software does in today's world.
It's starting to feel like you just want to argue.
My opinion is that, broadly speaking, software advancements have improved the world, and will continue to improve the world, both in big and small ways. There will of course be exceptions.
The absolute nanosecond someone suggests using this over skilled labor, the “shitty gatekeeping” roles reverse. I personally care more about the skilled labor than your optimism.
I am in no way implying that these advancements should/would make me better than a skilled professional, nor do I believe they will ever replace true expertise or craft.
These kinds of advancements lower the bar for entry and reduce the effort required to achieve better results, and so make the tools more accessible to more people.
And that is a good thing. The scenario you are implying is laying the choices of people at the feet of the tools, rather than holding the people accountable.
Also: history has, so far, mostly proven that the end result of better tools is better experts, not fewer experts.
> history has, so far, mostly proven that the end result of better tools is better experts
I’m inclined to agree with the general sentiment. However, it is not a given that LLMs are better tools. You don’t really get better at them in the same sense as before; you just type something and pray. The exact same thing you typed may produce exactly what you wanted, an aberration, or something close with subtle mistakes that you don’t have the expertise to fix. Other tools made you better at the craft in general.
> You don’t really get better at them in the same sense as before, you just type something and pray.
I very much disagree with this. I've spent the last 18 months working with LLMs daily at my work (I'm not exaggerating here) and while the models themselves have certainly gotten better, I have personally learned how to extract better results from the tools and improved my proficiency at applying LLMs as part of my overall toolchain and workflow.
LLMs are very much like many other tools in some ways, that the more you learn how to use them, the better results you will get.
I do agree that their non-deterministic nature makes them less reliable in some contexts as well though, but that's a trade-off you work into your approach, just like other trade-offs.
On the other hand, as both an artist and machine learning practitioner, I think most artists are only seeing the surface layer here and have little insight on its derivative, the algorithms and state of research which are advancing the state of the art on a weekly basis. It'll never be obvious that we're at the critical moment because critical phase changes happen all at once, suddenly, out of nowhere.
There is an insane amount of low-hanging fruit right now, and potentially decades or centuries of very important math to be worked out around optimal learning strategies, but it's clear that we do have a very strong likelihood of our ways of life being fundamentally altered by these technologies.
I mean, already, artists are suddenly having to grapple with all sorts of new and forgotten questions around artistic identity and integrity, what qualifies as art, who qualifies as an artist... Generative technology has already made artists begin to question and radically redefine the nature of art, and if it's good enough to do that, I think it's already worth serious consideration. These technologies, even in current form, were considered science fiction or literal magic until very recently.
> I think most artists are only seeing the surface layer here and have little insight on its derivative (…)
> Generative technology has already made artists begin to question and radically redefine the nature of art (…)
So which is it? Are artists not understanding the potential of the technology, or are they so flabbergasted they are redefining the nature of art? It can’t be both.
> There is an insane amount of low-hanging fruit right now
Precisely. What I’m criticising is the generic low-effort response of assuming that being able to pick low-hanging fruit now indicates with certainty that the high-hanging fruit will be picked soon. It doesn’t. As an exaggerated analogy, building the first paper airplane or boat might’ve been fun and impressive, but it was in no way an indication we’d be able to construct rockets or submarines. We eventually did, but it took a very long time and completely different technology.
To really drive the point home, my comment wasn’t specifically about art or LLMs or the user I replied to. What I am against is the lazy hand-wavy extrapolation which is used to justify anything. As long as you say it’s five years away, you can shut down any interesting discussion.
> These technologies, even in current form, were considered science fiction or literal magic up until very recently.
I don’t recall science fiction or magical stories—not any that weren’t written as cautionary tales or about revolution, anyway—which had talking robots which were wrong most of the time yet spoke authoritatively; convinced people to inadvertently poison or kill themselves and others; were used for spam and disinformation on a large scale; and accelerated the concentration of power for the very few at the top. Not every science fiction is good. In fact, plenty of it is very very bad. There’s a reason the Torment Nexus is a meme.
> So which is it? Are artist not understanding the potential of the technology, or are they so flabbergasted they are redefining the nature of art? It can’t be both.
Different people have different relationships with generative technologies and have different experiences. It can be both at the same time, because there are a lot of people out there, and there are many artists with both technical and artistic interests.
> What I’m criticising is the generic low-effort response of assuming that being able to pick low-hanging fruit now indicates with certainty that the high-hanging fruit will be picked soon.
I mean if we can have both more compute and more efficient compute due to those low-hanging fruits and Moore's law, it's reasonable to assume that things can get much better to the point of being economical, useful or even essential for a new generation of people, even if we don't see any massive shifts in the current paradigm.
> I don’t recall science fiction or magical stories—not any that weren’t written as cautionary tales or about revolution, anyway—which had talking robots which were wrong most of the time yet spoke authoritatively
Because people had an understandably naive understanding of what AI progress might look like, or the timescales involved.
> Not every science fiction is good. In fact, plenty of it is very very bad
I'm not sure what this has to do with anything. In general, I don't understand or appreciate your hostile tone and would like for this conversation to take a more positive tone or for us to just drop it.
> Generative technology has already made artists begin to question and radically redefine the nature of art,
Everyone else still knows what art is. The only reason artists grapple with it is because it's existential for them. Artists think of themselves as the gatekeepers of art but the only thing that qualified them for that (in their minds) was the ability to produce it.
Now that everyone is producing generic entry level art (what most artists do anyway), they are losing their identity. The "What is art?" conversation isn't about art, it's about gatekeeping.
> Everyone else still knows what art is.
What is art?
> Artists think of themselves as the gatekeepers of art
> The "What is art?" conversation isn't about art, it's about gatekeeping
Can you answer that question without yourself being a gatekeeper? Your post definitely comes off as judgemental and gatekeeping about what art is. But what if you're wrong? The people pondering what it may be certainly seem to have a more open mind, being more considerate of all the ways that expression and significance can be found.
LLMs aren't suited to this, just like they aren't suited to generating images (different models do the hard work, even when you're using an LLM interface).
I agree with the parent comment. This might be neat to learn the basics of blender scripting, but it's an incredibly inefficient and clumsy way of making anything worthwhile.
That's fair, and perhaps a different kind of multi-modal model will emerge that is better at learning and interacting with UIs..
Or maybe applications will develop new interfaces to meet LLMs in the middle, sort of how MCP servers are a very primitive version of that for APIs..
Future improvements don't just have to be a better version of exactly what it is today, it can certainly mean changing or combining approaches.
Leaving AI/LLM aside, 3D modeling and animation tech has drastically evolved over the years, removing the need for lots of manual and complicated work by automating or simplifying the workflow for achieving better results.
Right.
This is like training an AI on being an Excel expert, and then ask it to make Doom for you: You're gonna get some result, and it will be impressive given the constraints. It's also going to be pure dog shit that will never see the light of day other than as a meme.
> Yeah but remember this tool is, today, the worst it will ever be.
True. And if it stops being a path forward because a better approach is found (which is more likely than not), then this is also the best it will ever be.
It's a neat tool. But i suspect it can only ever be as good as Blender. Of course I could be wrong.
Python is Turing-complete, so everything is possible, but the tool has a long way to go.
Hammers have existed for thousands of years and still can't do my laundry.
Let’s be honest most are just looking to get acquired. Then enshittified.
I think there’s a really interesting point here that even if a model is capable of planning and reasoning, part of the skillset of a creator is that asking for the right thing requires an understanding of how that model is creating its artifacts.
And in 3D, you won’t be able to do that without understanding, as an operator, what you’d want to ask for. Do you want this specific part to be made parametrically in a specific way for future flexibility, animation, or rendering? When? Why? And do these understandings of techniques give you creative ideas?
A model trained solely on the visual outcome won’t add these constraints unless you know to ask for them.
Even if future iterations of this technology become more advanced and can generate complex models, you need to develop a skillset to be able to gauge and plan around how they fit into your larger vision. And that skillset requires fundamentals.
The article is about language models. Language models are not capable of planning or reasoning.
It depends on the definitions of these words. Basically, AI pretends to be a human, it pretends to think, it pretends to plan and reason. In a few last years we've pushed this pretending quite far.
I don't know a lot about 3D modeling, but I can see that the objects created by this AI are way too high poly, which would be bad for performance if used e.g. in a game. But it still looks like a great prototyping tool to me, especially if you want to express an idea in your head to an actual 3D designer, in the same way UX designers can show a prototype to developers with Claude code now, instead of trying to repro an idea with Figma.
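For what it's worth, poly count is also one of the easier things to fix downstream; a minimal, illustrative Blender Python sketch (the ratio is arbitrary) is just a Decimate modifier applied to the generated mesh:

```python
import bpy

# Assume the generated/imported high-poly mesh is the active object.
obj = bpy.context.object
print(f"faces before: {len(obj.data.polygons)}")

# Collapse-style decimation; keep ~20% of the faces (value is arbitrary).
dec = obj.modifiers.new(name="Decimate", type='DECIMATE')
dec.ratio = 0.2
bpy.ops.object.modifier_apply(modifier=dec.name)

print(f"faces after: {len(obj.data.polygons)}")
```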
Of course. It's a research paper, not a product.
> I don't really see other usage of this
My hot-take: this is the future of high-fidelity prompt-based image generation, and not diffusion models. Cycles (or any other physically based renderer) is superior to diffusion models because it is not probabilistic, so scene generation via LLM before handing to off to a tool leads to superior results, IMO - at least for "realistic" outputs.
Of course no one knows the future, but I think it's very plausible that the future of films/games (especially games) tech resembles something like this:
1. Generate something that looks good in 2D latent space
2. Generate a 3D representation from the 2D output
3. Next time the same scene is shown on the screen, reuse information from step 2 to guide step 1
That's an interesting idea! I'm thinking step 2 might be inserting 3d foreground/hero objects in front of a 2d background/inside a 2d worldbox
> My hot-take: this is the future of high-fidelity prompt-based image generation and not diffusion models
Why are those two the only options?
> Why are those two the only options?
I made no such claim. The only thing I declared is my belief in the superiority of PBR over diffusion models for a specific subset of image-generation tasks.
I also clearly framed this as my opinion, you are free to have yours.
> also clearly framed this as my opinion, you are free to have yours.
Yes, thank you for your generosity.
You very clearly framed your opinion as a dual choice ("this instead of that", "rather than", "either or"). The most natural way to read your comment is that way: one or the other.
That's the way the English language works. If you meant something else, you should have said something else.
You may have missed my first three words, which did the level-setting. In fact, the first word should have been enough to signal subjectivity; that's just how the English language works.
Hot take: a piece of commentary, typically produced quickly in response to a recent event, whose primary purpose is to attract attention
I’m going to start calling responses like these “death rattles”
There’s something uniquely spirited about narrowly focused experts feeling threatened by “AI” and responding like this, wherein some set of claims is made (“AI will never be able to [insert your specialty]” or “This can’t do [the specific edge case I’m an expert at]”)
except for the root claim:
“This threatens both my economic stability because it will shift my market, and my social status because I have devoted 100,000 hours to practicing and expertise so that I can differentiate myself because it’s part of my identity”
> feeling threatened by “AI”
Why the quotes there? I do feel threatened by AI, but not economically, not in the context of this article at least.
> AI will never be able to
I said no such thing.
> This can’t do the
I also didn't say that; I didn't focus on any particular thing it can't do. In general I don't voice my opinion; I think here the urge was just too strong because the examples were so crude that I had to say: if you think those are impressive, you can literally do it after a day's worth of learning. I only flexed my expertise to give that statement some weight and motivate people to really try.
> “This threatens both my economic stability because it will shift my market, and my social status because I have devoted 100,000 hours to practicing and expertise so that I can differentiate myself because it’s part of my identity”
I'm an OSHA inspector/instructor. I don't think what I learned in 3D will go to waste even if AI will actually start creating good 3D models (I wouldn't bet against it).
I was playing with Aseprite (pixel editor) the other day. You can script it with Lua, so I asked Claude to help me write scripts that would create different procedurally-generated characters each time they were run. They were reproducible with seeds and kind of resembled people, but very far from what I would consider to be high quality. It was a fun little project and easily accessible, though.
If you're interested in that, check out the guys over at pixellab.ai
They have an Aseprite plugin that generates pretty nice looking sprites from your prompts.
I've been looking for a good pixel-art AI for a while. Most things I tried look okay, but not stunning. If anyone has had good experience with an AI tool for that, I'd be grateful for a link.
Very encouraging results. The spatial intelligence of LLMs was very bad a year ago. I spent quite some time trying to make them write stories in which objects are placed up and down, left and right, front and back; they always got hopelessly confused.
I asked GPT which is the most scriptable CAD software; its answer was FreeCAD. Blender is not CAD software as far as I understand; the user cannot make measurements like in FreeCAD.
Unfortunately FreeCAD's API is a bit scattered and not well organized, so GPT has trouble remembering, searching for, and retrieving the relevant functions. Blender is a lot more popular, with more code on the internet, and it performs much better.
OpenSCAD?
OpenSCAD doesn't seem bad at all, but I'm not thrilled about using a custom language to write scripts for it. I found a Rust crate which might help.
I haven't yet managed to write Rust that interoperates with FreeCAD or Blender, so if it's easier to target OpenSCAD from Rust I might settle on OpenSCAD. I'll have to experiment a bit to find which of the three is most compatible with Rust.
Would it be possible to write a script for CAD that could do measurements?
Measurements are printed out and given to construction workers, usually along some axis. People who lay the bricks take a top view of the house with dimensions along the y axis. People who build the doors and windows take a side view with dimensions along the x axis. And so on.
Blender cannot do that as far as I understand.
Something like that for example: [1]
Looks like it can
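For basic dimension readouts, the Blender Python API does seem to expose enough to script this yourself; here is a minimal sketch (the object name "Wall" is just a placeholder, not something from the linked add-on):

    # Minimal sketch: reading per-axis dimensions of an object via the
    # Blender Python API (bpy). Run inside Blender's scripting workspace.
    import bpy

    obj = bpy.data.objects["Wall"]          # placeholder object name
    x, y, z = obj.dimensions                # world-space bounding-box size per axis

    print(f"width  (x): {x:.3f} m")
    print(f"depth  (y): {y:.3f} m")
    print(f"height (z): {z:.3f} m")

A dedicated add-on would still be needed for proper annotated blueprints, but per-axis sizes like these are scriptable out of the box.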
I may start using Blender if that's the case. I was waiting for some kind of success modeling 3D shapes using code, and automatically generating the code with LLMs, for quite some time.
Before you trash on the 3d model quality, just think about the dancing baby and early pixar animations. This is incredible. I can't wait to be able to prompt my llm to generate a near-ready 3d model that all I have to do is tweak, texture, bake, and export.
LLMs are language models. Meshes aren't language. Yes, this can generate Python to create simple objects, but that's not how anyone actually creates beautiful 3D art. Just like no one is hand-writing SVG files to create vector art.
LLMs alone will never make visual art. They can provide you an interface to other models, but that's not what this is.
This is of course true, but have you ever seen Inigo Quilez's SDF renderings? It's certainly not scalable, but it sure is interesting
That's fine. I'm happy to define "visual art" as things LLMs can't do, and use LLMs only for the 3d modelling tasks that are not "visual art".
Such tasks can be "not making visual art", but that doesn't mean they aren't useful.
I know that, I was making a statement about how you can.
Not exactly sure what your point is. If an LLM can take an idea and spit out words, it can spit out instructions (just like we can with code) to generate meshes, or boids, or point clouds or whatever. Secondary stages would refine that into something usable and the artist would come in to refine, texture, bake, possibly animate, and export.
In fact, this paper is exactly that. Words as input, code to use with Blender as output. We really just need a headless Blender to spit it out as glTF and it's good to go to the second stage.
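Headless export is already doable today; a rough sketch (the output path is a placeholder), run via `blender --background --python export_gltf.py`:

    # export_gltf.py -- rough sketch of exporting the current scene to glTF
    # from headless Blender:  blender --background --python export_gltf.py
    import bpy

    # (Generated modelling code would run here, e.g. adding and editing meshes.)

    bpy.ops.export_scene.gltf(filepath="/tmp/asset.glb")  # .glb = binary glTF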
If you have an artist, can't you just talk to her about what you want and then she makes the model and all the rest of it? I don't really understand what you gain if you pay for an LLM, make a model with it, and then give it to an artist.
If you knew how an art pipeline worked, you would. An artist is usually one in an array of artists that completes a model. The pipeline starts with concept artists (easily AI now), then turnaround (AI is hit or miss here), modeling (this phase), texturing (could be another artist), baking (normally a texture artist's job, but depending on material complexity and whether it's for film, a technical artist's), and rendering if needed.
Then you have sub specialties. Rigging, animation, texturing, environments, props, characters, effects.
It’s a fascinating process.
I too can't wait until every last bit of experience gained through trillions of hours of human experimentation is compiled into statistical models and monetized without ever paying a cent to the people who made it possible.
Notable here I think is the agent workflow - as LLMs continue to gain 3D world understanding, they're going to be useful in a variety of circumstances. Taking the human out of the loop for error checking and bug fixing is going to be useful, even if it's just a background process that experts like Etherlord87 in this thread get a little bug-fixing / suggestion / pop-up help from. Meanwhile, being able to programmatically instrument this stuff is super useful, and will keep getting more useful.
Large token models are coming for everything, because everything can be made a token.
The detour via language here is not needed; these models can speak geometry more and more fluently.
This seems like a great observation. A lot of negative reactions to AI-generated data seem to come from the limitations of working through language, which denies users good creative input.
Right, like word2vec blew everyone's mind back in the day, but 3D models have always existed in a "vector space".
This is the direction I've been harping on about to all my friends, in theory at least: API-first creative software will win out. After Effects has a decent JS API, DaVinci Resolve has Python and Lua; you can script your way to a decent starting point with any of these tools. They even have well-featured rollback for transactions committed during scripting. We need a generic MCP for the scripting environments in many desktop apps, and screen capture for the ones which take multimodal input.
This looks like AI trash. Anyone who works in 3D can spot it instantly. The models are riddled with choices no sane artist would ever make, on top of churning out a grotesque polygon count that only reinforces how sloppy it is.
Does that run LLM generated Python code in Blender on your machine? That alone is a no for me. Python is not sandboxed, it can just read or delete all your files.
This is at "cute" level of useful I feel. A few more iterations though and this will get interesting.
Looks like a fun toy. That can already be useful. I'm thinking of games that don't even have to leave the prototype stage, e.g. Roblox, or actually just better prototyping in general. Even if it can't produce anything sufficiently good yet (depends on the game and audience; look at Minecraft), if it's fun to tinker with, that's enough to be useful. If it improves, that will certainly be more exciting, but it already looks useful.
This is amazing. Solo game dev will actually become solo.
I’ve been saying for a long time, “gaming is just too pro-social and well adjusted, we need more isolation and introversion in gaming!”
I hope we adopt some kind of "This product uses AI generated content" label for games so I can avoid them forever :)
That's when people just lie about using AI. The thing about AI is that it is only obvious when it is obvious.
So I guess we just accept that everything is bad forever then, because too many people are happy to lie and cheat and there are no mechanisms in society to deal with it.
Steam has this
Haaa you sound jealous :-) Pity you’re not smart enough to benefit
I have tried to use Blender and have given up several times. My only constant use of Blender is creating animated titles in OpenShot.
Anything that simplifies using advanced tools is useful.
I'm not a modeler but I've tried it a few times. For me, modeling is a pain that I need to deal with to solo-dev a 3d game project. I would think about using something like this for small indie projects to output super low-poly base models, which I could then essentially use as a scaffold for my own finer adjustments. Saving time is better than generating high-poly masterpieces, for me at least.
You can also use LLMs to write python code for FreeCAD. It kinda works, if you coach it through the whole process in baby steps.
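For anyone curious, a minimal sketch of what that FreeCAD scripting looks like (run in FreeCAD's Python console; the document, object names, and dimensions here are just illustrative):

    # Minimal sketch: creating a parametric box via FreeCAD's Python API.
    import FreeCAD as App

    doc = App.newDocument("Enclosure")
    box = doc.addObject("Part::Box", "Base")
    box.Length = 80   # mm
    box.Width = 50    # mm
    box.Height = 20   # mm
    doc.recompute()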
I think it would be better to train an LLM on, and get it to generate, a standard exchange format; COLLADA or glTF could be worth trying.
What I always wanted in video games where you can craft weapons was the idea that you can combine duct tape with wood and fishing lures and it creates something the designer didn't think about :)
Zelda Breath of the Wild attempts to approach this with an interesting interface.
Tears of the Kingdom?
That's not really possible. You need fish and fishing mechanics, you can't just create a game where fishing happens to be possible without anyone having thought about it.
You would basically need a life simulator for that, with rules like fish exist, fish live in water, fish chase other organic things in the water, a worm attracts fish, the hook makes the fish stick if it bites, do you see what I'm saying? You can't just have all this stuff happening organically. It's possible to have systems where the possibilities are so vast that the dev can't consider all of them, like spore or path of exile or whatever, lots of games with lots of options for customization. But you can't just have a real world simulator, the real world is too complex to simulate, let alone in real time.
yeah well, it's all coming
Sure man, maybe in a few decades or centuries, who knows. I'll continue to concern myself with what's actually possible and not worry much about what doesn't seem possible.
So soon enough, everyone will be able to vibecode game assets and players will be able to create their own character designs on-the-fly? Sweet although I also feel for designers as a profession.
Don’t worry. Delusional optimists will eventually learn that not everything can be handwaved away with “it’ll get better,” especially when “it” is already the most expensive and most-funded software project in history, and still sucks in most contexts.
Designers, developers, producers—anyone involved in the supply chain.
https://zoo.dev/design-studio is better in every conceivable way
Considering most SOTA LLMs are also multimodal/vision models, could they get better results if the LLM also got visual feedback?
Also surprisingly good at OpenSCAD if you keep reminding it not to assign shapes to variables.
What is that Python code? Is it a Python library or is this the Blender API?
That's the Blender Python API, backed by the embedded Python interpreter that ships with each Blender install.
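To make that concrete, here is a small illustrative snippet of what bpy code typically looks like (not code from the paper; the object name is made up):

    # Typical shape of Blender API (bpy) code: add a primitive, then tweak it.
    import bpy

    bpy.ops.mesh.primitive_cube_add(size=1.0, location=(0.0, 0.0, 0.5))
    seat = bpy.context.active_object
    seat.name = "ChairSeat"
    seat.scale = (1.0, 1.0, 0.1)  # flatten the cube into a seat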
Author here. AMA!
Could this produce a 3D model of a plastic case that will perfectly fit a PCB board? Could it be improved to also produce the CAD files for a PCB board?
How would you go about making the LLM generate objects that are easier to 3D print (or manufacture)? Things like not using too much material, reducing the need for supports, and maybe checking whether or not the produced part would be tough enough.
Are there datasets where an LLM learns to interact with slicers, finite element analysis software, or even real-world 3D-printed objects, to allow the model to do some quality assessment?
Interesting idea. Our framework of having multiple different agents with different roles could play well with this. You could create an agent that checks for certain criteria, gives feedback to the coding agent. For example, you could build an agent that evaluates the toughness of the current 3D asset and suggests fixes. I like the idea of incorporating additional experts to solve different tasks!
I like the idea of LLMs collaborating like this a lot: planning, critiquing, verifying, coding, etc. I think that's a very general and powerful approach. How did you end up with that structure, and what did you try first? What are the downsides? How do the component agents communicate, just JSON?
The agents communicate through different paths. First, there's a "big boss" orchestrator that decides who speaks next. The outputs from all agents (including the code from the coding agent) are put into a shared context that each agent can draw from. Practically speaking, to make this happen we use the AutoGen framework.
We slowly started building more and more agents. Everything we tried just worked (kinda amazing). We first started by trying to incorporate visual understanding via VLMs. Then we slowly added more and more agents, and the BlenderRAG gave a huge boost.
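For readers unfamiliar with AutoGen, here is a rough sketch of what a group-chat orchestration like that can look like. The agent names, system prompts, and llm_config below are invented placeholders, not the paper's actual configuration, and it assumes the classic pyautogen-style API:

    # Rough sketch: an AutoGen group chat with a planner, a Blender coder,
    # and a critic. All names/prompts are illustrative placeholders.
    import autogen

    llm_config = {"model": "gpt-4o"}  # placeholder model settings

    planner = autogen.AssistantAgent("planner",
        system_message="Break the modelling request into steps.",
        llm_config=llm_config)
    coder = autogen.AssistantAgent("blender_coder",
        system_message="Write Blender bpy code for the current step.",
        llm_config=llm_config)
    critic = autogen.AssistantAgent("critic",
        system_message="Point out problems with the plan or the code.",
        llm_config=llm_config)
    user = autogen.UserProxyAgent("user", human_input_mode="NEVER",
        code_execution_config=False)

    chat = autogen.GroupChat(agents=[user, planner, coder, critic],
                             messages=[], max_round=8)
    manager = autogen.GroupChatManager(groupchat=chat, llm_config=llm_config)
    user.initiate_chat(manager, message="Model a simple wooden chair.")

The group-chat manager plays the "big boss" role of deciding who speaks next, and the shared message history acts as the common context each agent can draw from.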
I am not a Blender user, but my strong suspicion is that the output of these things does not match what you'd be able to do in very short order by learning a bit of Blender.
One of my suspicions about these "can we make an LLM do something that isn't text?" projects is that underpinning it is something that isn't to do with AI at all.
Instead it's that a lot of specialist programmers really really loathe GUI paradigms for anything, consider text interfaces inherently superior, think the job of a GUI is only to simplify tasks and hide complexity, and so think all complex GUIs that are not immediately intuitive are categorical failures.
In rejecting learning GUI tools they rule out the possibility that GUI interfaces support paradigms text cannot, and they rule out the possibility that anyone who has deep skills in a particular GUI knows anything more than what all the switches and buttons do, when a Blender user is very evidently engaging in a kind of abstract thought similar to what programming involves.
It is much the same with FreeCAD. Does the FreeCAD GUI still suck in several places? Yes. It was confusing and annoying until I learned that it is not trying to hide complexity. The inherent complexity is in the problem domain. But a programmer with a bit of maths and logic knowledge can, IMO, easily learn it. And then you discover that the FreeCAD UI is what it is because it is a set of tools designed by CAD-focussed programmers that attempts little to no magic, and suddenly you are using the tools to solve your own problems.
In short a lot of these projects have a whiff of "but I don't wannnnna learn a GUI". The LLM or AI generator offers a way to defer properly learning the tools or techniques that are not so difficult to learn, and so attracts a lot of attention.
I think architectures like this will be the key to AGI: a constellation of specialized AI-model microservices, not all necessarily LLMs, working in tandem to formulate output from input.
I think it's clear that the human brain is modular, and that at least one module in the human brain shares similarities to the LLM. So the key is really to build the other modules and interconnect everything.
Something I don't understand is why there is such effort put into crafting generators for all kinds of content, but not the same level of interest in crafting consumers of it: AIs that will play your games, watch your movies, and tell you what a good job "you" did. If it's about money, you don't have a chance, because the monopolistic middlemen will hoard all the space (e.g. Steam will generate the PC games that get popular), and whatever space is left will quickly be saturated by an overabundance of generated content. Given that the scarcity of attention is only going up, the next logical step is to generate the "people" who will consume the content: bots that craft reviews about what they "saw" and how they "felt", have different "personalities", and constantly interact with each other.
I think that kind of multi-modal work is ongoing but not as advanced as text-based LLMs are today.
All these kinds of generators are LLM-via-text-proxy in the sense that people are using LLM's excellent text generation properties to generate via scripting interfaces in various tools.
It is remarkable that Transformer-based feedforward models can be appropriated for a whole lot, though at some point it should be considered a stretch to call such models "language models" when they are so heavily adapted to geometric systems.
Who's working on the same thing for rigging/animating/uv?
I want a Blender plugin that assists me; I don't want to click to generate a model.
This could have been the POV-Ray or VRML renaissance.
But this generation is too young to remember those.
Can it do the pelican on a bicycle?
I've had surprising success with meshy.ai as part of a workflow to go from images my friends want to good 3D models. The workflow is:
1. Have GPT-5, or really any image model (Midjourney retexture is also good), convert the original image to something closer to a matte rendered mesh, i.e. remove extraneous detail and any transparency / other confusing volumetric effects
2. Throw it into meshy.ai's image-to-3D mode, select the best result, or maybe return to step 1 with a different simplified image style if I don't like the results
3. Pull it into Blender and make whatever mods I want in mesh editing mode, e.g. specific fits and sizing to assemble with other stuff, add some asymmetry to an almost-symmetric thing (the model has strong symmetry priors, and turning them off in the UI doesn't really turn them off), or model on top of the AI'd mesh to get a cleaner one for further processing.
The meshes are fairly OK structure wise, clearly some sort of marching cubes or perhaps dual contouring approach on top of a NeRF-ish generator.
I'm an extremely fast mechanical CAD user and a mediocre Blender artist, so getting an AI starting point is quite handy to block out the overall shape and let me just do edits. E.g. a friend wanted to recreate a particular statue of a human; tweaking some T-posed generic human model into the right pose and proportions would have taken "more hours than I'm willing to give him for this", i.e. I wouldn't have done it, but with this workflow it was 5 minutes of AI and then an hour of fussing in Blender to go from the solid model to the curvilinear wireframe style of the original statue.
> 1. […] convert the original image to something closer to a matte rendered mesh […]
Sounds interesting. Do you have any example images like that you could share? I understand the part about making transparent surfaces not transparent, but I'm not sure what the whole image looks like after this step.
Also, would you be willing to share the prompt you type to achieve this?
It works if you just plainly describe what you're looking for. I write a new prompt for different images, just something like "re-render this as a matte untextured 3d model, remove all details except geometric form".
GPT-5 is a text-only model. ChatGPT still uses 4o for images.
The naming is very confusing. I thought the underlying model was gpt-image-1 in the API, but transparently presented as part of the same chat model in the UI?