In December 2024, during the frenzied adoption of LLM coding assistants, we became aware that such tools tended—unsurprisingly—to produce Go code in a style similar to the mass of Go code used during training, even when there were newer, better ways to express the same idea. Less obviously, the same tools often refused to use the newer ways even when directed to do so in general terms such as “always use the latest idioms of Go 1.25.” In some cases, even when explicitly told to use a feature, the model would deny that it existed. [...] To ensure that future models are trained on the latest idioms, we need to ensure that these idioms are reflected in the training data, which is to say the global corpus of open-source Go code.
robviren•Feb 17, 2026
I have run into that a lot, which is annoying. Even though all the code compiles because Go is backwards compatible, it all looks so different. Same issue for Python, but in that case the API changes lead to actual breakage. For this reason I find Go to be fairly great for codegen: the stability of the language is hard to compete with, and the standard lib is a powerful enough tool to support many, many use cases.
HumblyTossed•Feb 17, 2026
The use of LLMs will lead to homogeneous, middling code.
awesome_dude•Feb 17, 2026
I'm not sure if that's a criticism or praise - I mean, most people strive for readable code.
candiddevmike•Feb 17, 2026
LLM generated code reminds me of perl's "write-only" reputation.
awesome_dude•Feb 17, 2026
In all honesty I've only used LLMs in anger with Go, and come away (generally speaking) happy with what it produced.
coldtea•Feb 17, 2026
Does it really? Because I see some quite fine code. The problem is assumptions, or missing side effects when the code is used, or getting stuck in a bad approach "loop" - but not code quality per se.
cedws•Feb 17, 2026
It does. I’ve been writing Go for long enough, and the code that LLMs output is pretty average. It’s what I would expect a mid level engineer to produce. I still write code manually for stuff I care about or where code structure matters.
Maybe the best way is to do the scaffolding yourself and use LLMs to fill the blanks. That may lead to better structured code, but it doesn’t resolve the problem described above where it generates suboptimal or outdated code.
Code is a form of communication and I think good code requires an understanding of how to communicate ideas clearly. LLMs have no concept of that, it’s just gluing tokens together. They litter code with useless comments while leaving the parts that need them most without.
saghm•Feb 17, 2026
You might even say that LLMs are not capable of understanding a brilliant language but we want to use them to build good software. So, the language that we give them has to be easy for them to understand and easy to adopt.
bee_rider•Feb 17, 2026
Do LLMs generate code similar to middling code of a given domain? Why not generate in a perfect language used only by cool and very handsome people, like Fortran, and then translate it once the important stuff is done?
pklausler•Feb 17, 2026
This might work if Fortran were portable, or if only one compiler were targeted.
munk-a•Feb 17, 2026
Middling code should not exist. Boilerplate code should not exist. For some reason we're suddenly accepting code-gen as SOP instead of building a layer of abstraction on top of the too-onerous layer we're currently building at. Prior generations of software development would see a too-onerous layer and build tools to abstract to a higher level, this generation seems stuck in an idea that we just need tooling to generate all that junk but can continue to work at this level.
kimixa•Feb 17, 2026
LLMs have always been great at generating code that doesn't really mean anything - no architectural decisions, the same for "any" program. But only rarely does one see anyone question why we need to generate "meaningless" code in the first place.
munk-a•Feb 17, 2026
This gets to one of my core fears around the last few years of software development. A lot of companies right now are saddling their codebases with pages and pages of code that does what they need it to do but of which they have no comprehension.
For a long time my motto around software development has been "optimize for maintainability" and I'm quite concerned that in a few years this habit is going to hit us like a truck in the same way the off-shoring craze did - a bunch of companies will start slowly dying off as their feature velocity slows to a crawl and a lot of products that were useful will be lost. It's not my problem, I know, but it's quite concerning.
nobleach•Feb 17, 2026
But Go culture promulgates this practice of repeating boilerplate. In fact this is one of the biggest confusion points of new gophers. "I want to do a thing that seems common enough, what library are you all using to do X?". Everyone scoffs, pushes up their glasses and says, "well actually, you should just use the standard library, it's always worked just fine for me". And the new gopher is confused because they really believe that reinventing the wheel is an acceptable practice. This is what leads to using LLMs to write all that code (admittedly, it's a fine use of an LLM).
meowface•Feb 17, 2026
For a few years, yeah. Eventually it will probably lead to the average quality of code being considerably higher than it was pre-LLMs.
shoo•Feb 17, 2026
middling code, delivered within a tolerable time frame, budget, without taking excessive risk, is good enough for many real-world commercial software projects. homogeneous middling code, written by humans or extruded by machines, is arguably even a positive for the organisation: lots of organisations are more interested in delivery of software projects being predictable, or having a high bus-factor due to the fungibility of the folks (or machines) building and maintaining the code, rather than depending upon excellence.
munk-a•Feb 17, 2026
PHP went through a similar effort a while back to just clear out places like Stackoverflow of terrible out of date advice (e.g. posts advocating magic_quotes). LLMs make this a slightly different problem because, for the most part, once the bad advice is in the model it's never going away. In theory there's an easier to test surface around how good the advice it's giving is but trying to figure out how it got to that conclusion and correct it for any future models is arcane. It's unlikely that model trainers will submit their RC models to various communities to make sure it isn't lying about those specific topics so everything needs to happen in preparation of the next generation and relying on the hope that you've identified the bad source it originally trained on and that the model will actually prioritize training on that same, now corrected, source.
miki123211•Feb 17, 2026
This is one area where reinforcement learning can help.
The way you should think of RL (both RLVR and RLHF) is the "elicitation hypothesis[1]." In pretraining, models learn their capabilities by consuming large amounts of web text. Those capabilities include producing both low and high quality outputs (as both low and high quality outputs are present in their pretraining corpora). In post-training, RL doesn't teach them new skills (see e.g. the "Limits of RLVR"[2] paper). Instead, it "teaches" the models to produce the more desirable, higher-quality outputs, while suppressing the undesirable, low-quality ones.
I'm pretty sure you could design an RL task that specifically teaches models to use modern idioms, either as an explicit dataset of chosen/rejected completions (where the chosen is the new way and the rejected is the old), or as a verifiable task where the reward goes down as the number of linter errors goes up.
I wouldn't be surprised if frontier labs have datasets for this for some of the major languages and packages.
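To make the two options above a bit more concrete, here is a rough sketch in Go. Everything in it is an assumption for illustration: the `PreferencePair` type, the `LintReward` name, and especially the `1/(1+n)` shape, which is just one arbitrary function that goes down as the number of linter findings goes up. It is not how any lab is known to do this.

```go
package main

import "fmt"

// PreferencePair is a hypothetical chosen/rejected example:
// Chosen uses the newer idiom, Rejected uses the older one.
type PreferencePair struct {
	Prompt   string
	Chosen   string
	Rejected string
}

// LintReward maps a count of linter findings to a reward in (0, 1].
// The 1/(1+n) shape is an arbitrary illustrative choice; any function
// that decreases as findings increase would match the description.
func LintReward(findings int) float64 {
	if findings < 0 {
		findings = 0 // treat nonsense counts as "no findings"
	}
	return 1.0 / float64(1+findings)
}

func main() {
	pair := PreferencePair{
		Prompt:   "Loop n times and call work(i).",
		Chosen:   "for i := range n { work(i) }",
		Rejected: "for i := 0; i < n; i++ { work(i) }",
	}
	fmt.Println("chosen:  ", pair.Chosen)
	fmt.Println("rejected:", pair.Rejected)

	for _, n := range []int{0, 1, 3} {
		fmt.Printf("findings=%d reward=%.2f\n", n, LintReward(n))
	}
}
```

In a real setup the findings count would presumably come from running a linter or modernizer over the model's completion; that part is left out here.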
I believe you absolutely could... as the model owner. The question is whether Go project owners can convince all the model trainers to invest in RL to fix their models and the follow up question is whether the single maintainer of some critical but obscure open source project could also convince the model trainers to commit to RL when they realize the model is horribly mistrained.
On Stack Overflow, data is trivial to edit and the org (previously, at least) was open to requests from maintainers to update accepted answers to provide more correct information. Editing is trivial and cheap to carry out for a database - for a model, editing is possible (less easy but doable), expensive, and a potential risk to the model owner.
BiraIgnacio•Feb 17, 2026
I definitely see that with C++ code
Not so easy to "fix", though. Or so I think. But I do hope still, as more and more "modern" C++ code gets published
Groxx•Feb 17, 2026
They're particularly bad about concurrent go code, in my experience - it's almost always tutorial-like stuff, over-simplified and missing error and edge case handling to the point that it's downright dangerous to use... but it routinely slips past review because it seems simple and simple is correct, right? Go concurrency is so easy!
And then you point out issues in a review, so the author feeds it back into an LLM, and code that looks like it handles that case gets added... while also introducing a subtle data race and a rare deadlock.
Very nearly every single time. On all models.
Jyaif•Feb 17, 2026
> a subtle data race and a rare deadlock
That's a language problem that humans face as well, which golang could stop having (see C++'s Thread Safety annotations).
awesome_dude•Feb 17, 2026
You should be using rust... mm kay :\
kbolino•Feb 17, 2026
Go has a pretty good race detector already, and all it (usually) takes to enable it is passing the -race flag to go build/test/run/etc.
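For anyone who hasn't tried it, a minimal sketch of the kind of bug the detector is meant to catch. The function names are made up for illustration. `CountUnsafe` has an unsynchronized counter and is deliberately never called from `main`; if you do call it and run with `go run -race`, the detector should flag it. `CountSafe` is the same loop with a mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// CountUnsafe increments a shared counter from many goroutines with no
// synchronization. This is a data race and the result is not reliable.
func CountUnsafe(workers int) int {
	var wg sync.WaitGroup
	n := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			n++ // unsynchronized read-modify-write
		}()
	}
	wg.Wait()
	return n
}

// CountSafe does the same work but guards the counter with a mutex.
func CountSafe(workers int) int {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		n  int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			n++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return n
}

func main() {
	fmt.Println("safe:", CountSafe(1000)) // prints "safe: 1000"
}
```

Note the detector only reports races that actually happen during a run, so it depends on tests or workloads that exercise the concurrent paths.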
brightball•Feb 17, 2026
Good use case for Elixir. Apparently it performs best across all programming languages with LLM completions and its concurrency model is ideal too.
Claude 4.6 has been excellent with Go, and truly incompetent with Elixir, to the point where I would have serious concerns about choosing Elixir for a new project.
hbogert•Feb 17, 2026
Shouldn't you have concerns picking Claude 4.6 for your next project if it produces subpar Elixir code? Cheap shot perhaps, but I have a feeling exotic languages will remain more exotic longer now that LLM-aided development is becoming the norm.
dakolli•Feb 17, 2026
I'd prefer we start nuking the idea of using LLMs to write code, not help it get better. Why don't you people listen to Rob Pike? This technology is not good for us. It's a stain on software and the world in general, but I get it, most of y'all yearn for slop. The masses yearn for slop.
throw432196•Feb 17, 2026
I totally agree. I read threads like this and I just can’t believe people are wasting their time with LLMs.
whattheheckheck•Feb 17, 2026
The masses yearn to not have to fiddle with bs for rent and food
kiernanmcgowan•Feb 17, 2026
It's tooling like this that really makes golang an excellent language to work with. I had missed that rangeint addition to the language but with go fix I'll just get that improvement for free!
Real kudos to the golang team.
jjice•Feb 17, 2026
There have been many situations where I'd rather use another language, but Go's tooling is so good that I still end up writing it in Go. So hard to beat the built-in testing, linting, and incredible compilation.
iamcalledrob•Feb 17, 2026
Absolutely.
The Go team has built such trust with backwards compatibility that improvements like this are exciting, rather than anxiety-inducing.
Compare that with other ecosystems, where APIs are constantly shifting, and everything seems to be @Deprecated or @Experimental.
lowmagnet•Feb 17, 2026
I just searched for `for` loops with `:=` within and hand-fixed them. I found a few forms of the for loops and where there was a high number, I used regexp.
This tool is way cooler, post-redesign.
retrodaredevil•Feb 17, 2026
I think tooling that can modify your source code to make it more modern is really cool stuff. OpenRewrite comes to mind for Java, but nothing comes to the top of my mind for other languages. And heck, I only recently learned about OpenRewrite, and I've been writing Java for a long time.
Even though I don't like Go, I acknowledge that tooling like this built right into the language is a huge deal for language popularity and maturity. Other languages just aren't this opinionated about build tools, testing frameworks, etc.
I suspect that as newer languages emerge over the years, they'll take notes from Go and how well it integrates stuff like this.
pjmlp•Feb 17, 2026
Java and .NET IDEs have had these capabilities for years now; even when Eclipse was the most used one, there were the tips from Checkstyle and other similar plugins.
Also IDE tooling for C#, Java, and many other languages; JetBrains' IDEs can do massive refactorings and code fixes across millions of lines of code (I use them all the time), including automatically upgrading your code to new language features. The sibling comment is slightly "wrong" — they've been available for decades, not mere years.
These can be applied across the whole project with one command, rewriting however many problems there are.
Also JetBrains has "structural search and replace" which takes language syntax into account, it works on a higher level than just text like what you'd see in text editors and pseudo-IDEs (like vscode):
For modern .NET you have Roslyn analyzers built in to the C# compiler which often have associated code fixes, but they can only be driven from the IDE AFAIK. Here's a tutorial on writing one:
> but nothing comes to the top of my mind for other languages
"cargo clippy --fix" for Rust, essentially integrated with its linter. It doesn't fix all lints, however.
loevborg•Feb 17, 2026
Does anyone have experience transforming a typescript codebase this way? Typescript's LSP server is not powerful enough and doesn't support basic things like removing a positional argument from a function (and all call sites).
Would jscodeshift work for this? Maybe in conjunction with claude?
nitnelave•Feb 17, 2026
Rust has clippy nagging you with a bunch of modernity fixes, and sometimes it can autofix them. I learned about a lot of small new features that make the code cleaner through clippy.
Arifcodes•Feb 17, 2026
The self-service analysis tools angle is the most underrated part of this. Being able to write custom fixers scoped to your own codebase solves a real pain point.
We maintain a large Go monorepo and every internal API migration turns into a grep+sed adventure. Half the time someone misses edge cases, the other half the sed pattern breaks on multiline. Having an AST-aware rewriter that understands Go's type system is a massive upgrade over regex hacks.
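As a rough illustration of why the AST route survives the multiline case that trips up sed, here is a small sketch using only the standard `go/parser`, `go/ast`, and `go/format` packages. The `store.Get` to `store.Lookup` rename and the helper names are hypothetical. It also only matches on syntax (an identifier named `store`), so unlike a type-aware fixer it would be fooled by a local variable with the same name.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/format"
	"go/parser"
	"go/token"
)

// matchCalls returns the selector of every call written as pkg.name(...).
func matchCalls(file *ast.File, pkg, name string) []*ast.SelectorExpr {
	var found []*ast.SelectorExpr
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if id, ok := sel.X.(*ast.Ident); ok && id.Name == pkg && sel.Sel.Name == name {
			found = append(found, sel)
		}
		return true
	})
	return found
}

// CountCalls reports how many pkg.name(...) calls appear in src.
func CountCalls(src, pkg, name string) (int, error) {
	file, err := parser.ParseFile(token.NewFileSet(), "input.go", src, 0)
	if err != nil {
		return 0, err
	}
	return len(matchCalls(file, pkg, name)), nil
}

// RenameCall rewrites pkg.oldName(...) to pkg.newName(...) and returns
// the gofmt-formatted source.
func RenameCall(src, pkg, oldName, newName string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, sel := range matchCalls(file, pkg, oldName) {
		sel.Sel.Name = newName
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// The call is split across lines on purpose.
	src := "package demo\n\nfunc load() {\n\tstore.Get(\n\t\t\"users\",\n\t\t42,\n\t)\n}\n"
	out, err := RenameCall(src, "store", "Get", "Lookup")
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Print(out)
}
```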
The -diff preview flag also makes this practical for CI. Run go fix -diff, fail if output is non-empty. That alone could replace a bunch of custom linters people maintain.
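A minimal sketch of that CI gate, written in Go to keep it in one language. It assumes `go fix -diff ./...` prints a diff to stdout and prints nothing when there is nothing to fix, as described above; I have not checked what exit code the command uses when it finds diffs, so the sketch treats any command error as a failure too. `HasPendingFixes` is a made-up helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"os/exec"
)

// HasPendingFixes reports whether the diff output contains anything
// other than whitespace.
func HasPendingFixes(diff []byte) bool {
	return len(bytes.TrimSpace(diff)) > 0
}

func main() {
	cmd := exec.Command("go", "fix", "-diff", "./...")
	cmd.Stderr = os.Stderr // pass tool errors through to the CI log
	out, err := cmd.Output()
	if err != nil {
		// Assumption: any error from the command should also fail the build.
		os.Stdout.Write(out)
		fmt.Fprintln(os.Stderr, "go fix failed:", err)
		os.Exit(2)
	}
	if HasPendingFixes(out) {
		os.Stdout.Write(out)
		fmt.Fprintln(os.Stderr, "go fix -diff reported pending changes")
		os.Exit(1)
	}
}
```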
1-more•Feb 17, 2026
We have this with our frontend code through elm-review. There are a great many rules for it with fixes, and we write some specifically for our app too. They then run pre-push so you get feedback early that you need to fix things.
The real key: there's no ignore comment as with other linters. The most you can do is run a suppress command so that every file gets its current number of violations of each rule recorded in JSON, and then you can only ever decrease those numbers.
nzoschke•Feb 17, 2026
Go and its long-established conventions and tools continue to be a massive boon to my agentic coding.
We have `go run main.go` as the convention to boot every app's dev environment, with support for multiple work trees, central config management, a pre-migrated database and more. Makes it easy and fast to dev and test many versions of an app at once.
[1] https://www.interconnects.ai/p/elicitation-theory-of-post-tr...
[2] https://limit-of-rlvr.github.io
https://autocodebench.github.io/
https://lwn.net/Articles/315686
Here's a random example:
https://www.jetbrains.com/help/rider/ConvertToPrimaryConstru...
https://www.jetbrains.com/help/idea/structural-search-and-re...
https://www.jetbrains.com/help/idea/tutorial-work-with-struc...
https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/t...
"cargo clippy --fix" for Rust, essentially integrated with its linter. It doesn't fix all lints, however.
See https://github.com/housecat-inc/cheetah for the shared tool for this.
Then of course `go generate`, `go build`, `go test` and `go vet` are always part of the fast dev and test loop. Excited to add `go fix` into the mix.