175 points by nand2mario May 23, 2026

9 Comments

mettamageMay 23, 2026
For me, this is peak Hacker News. I am happy I took the hard courses at uni to understand a post like this. I’m also happy that HN was there to stimulate this thinking at the time (2015). Even if I now don’t really do anything with my humble knowledge of low level programming, every time it feels consciousness-enriching. And it’s an awesome feeling.

For people that don’t have access to a uni, I recommend nand2tetris.org

morphleMay 23, 2026
Just building your own microprocessor from gates is an easier way to learn about designing microcode and understanding how processors work(ed). But it can't hurt to study a few simple old designs like RISC or Transputer. The 80386 is on the other side of that spectrum, needlessly complicated because they wanted to be backwards compatible with an old bad design.

There certainly is no need to go to university to learn chip design. Watching a few Alan Kay talks [3] or browsing Bitsavers computer designs [4] are good starting points.

We made an easier way (than FPGA) to simulate and convert your gate level design into transistors on a chip (for less than $200 in 2026). We call it Morphle Logic [1].

Eventually you grow into making the largest, fastest, and cheapest supercomputer with wafer scale integration [2].

[1] https://github.com/fiberhood/MorphleLogic/blob/main/README_M...

[2] https://www.youtube.com/watch?v=vbqKClBwFwI

[3] https://www.youtube.com/watch?v=f1605Zmwek8

[4] http://bitsavers.informatik.uni-stuttgart.de/pdf/xerox/alto/...

joleyjMay 23, 2026
> needlessly complicated because they wanted to be backwards compatible with an old bad design.

It's not really needless complication if there is a reason for the complication. Obviously in this case the need to be backward compatible with an old design made the implementation more complicated than if they didn't need to do that. There were very, very strong business reasons why backward compatibility was a design requirement.

deskamessMay 23, 2026
Do you know if nand2tetris covers/uses microcode?
drivers99May 23, 2026
It doesn’t. I posted a reply to the same comment before I saw your question. Even the books I mentioned didn’t really get into it. I tried a search for some that did and ran across Constructing a Microprogrammed Computer by O.J. Mengali which looks interesting. It says it has you implement the microcode for 4 different architectures. I’m going to check it out.
mettamageMay 23, 2026
Ah that's a shame. I had a computer systems course at uni where we were playing around with the microcode from the MIC-1 created by Tanenbaum. I sort of figured that Nand2Tetris just had that in it.
drivers99May 23, 2026
I did nand2tetris a couple times, but it emphasizes simplicity in every level of abstraction. That in itself is an amazing lesson and has been an inspiration, but that also means it skips things like microcode. In college (in the 1990s) I took an EE class as part of my CS degree that went through how an 8086-like[0] CPU is made, a lot like nand2tetris but without necessarily making each part an assignment. It did cover how microcode worked where there was an internal program counter that stepped through a table of control words whose bits directly orchestrated each controllable piece of the CPU. We each got an instruction to implement on a simulator that the teacher had made previously. (I got DEC, decrement.)

In a way I guess the instructions in nand2tetris are the microcode. The bits of the instructions directly control the hardware with the first bit choosing 2 instruction types, so there’s only 1 step of code per instruction, unlike with microcode where an instruction can have any number of microcode steps.
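To make the "bits directly control the hardware" point concrete, here is a minimal sketch of splitting a 16-bit nand2tetris (Hack) instruction into its fields. The field layout is the published Hack format as I understand it; the function name and dictionary keys are my own, not anything from the course tools.

```python
def decode_hack(word: int) -> dict:
    """Split a 16-bit Hack instruction into its fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("Hack instructions are 16 bits wide")
    if (word >> 15) == 0:
        # A-instruction: the low 15 bits are a constant loaded into register A.
        return {"type": "A", "value": word & 0x7FFF}
    # C-instruction layout: 1 1 1 a c1..c6 d1 d2 d3 j1 j2 j3
    return {
        "type": "C",
        "a": (word >> 12) & 0x1,     # selects A or M as the ALU's second input
        "comp": (word >> 6) & 0x3F,  # six ALU control bits
        "dest": (word >> 3) & 0x7,   # which of A, D, M store the result
        "jump": word & 0x7,          # jump condition bits
    }


# D=A assembles to 1110110000010000
print(decode_hack(0b1110110000010000))
```

Every field maps onto a control line in a single step, which is why there is no step counter or control-word table anywhere in the design.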

In Ben Eater’s series of videos building an 8-bit CPU on breadboards he has ROMs that are indexed by the opcode (4 bits of the instruction) + a step counter to determine the control word. The ROM stands in for what could be done with sufficiently complicated logic gates. I like it as a next step on the hardware side as you get hands on experience with electronics and having to troubleshoot it.
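A rough sketch of that ROM lookup, for anyone who wants to see the indexing spelled out. The signal names, the two instructions, and the 3-bit step counter are illustrative assumptions, not the actual control words from the videos.

```python
# Hypothetical control-signal bits; a real design defines its own set.
MI, RO, II, CO, CE, IO, AI = (1 << n for n in range(7))

# The first two steps fetch the instruction and are shared by every opcode.
FETCH = [CO | MI, RO | II | CE]

MICROCODE = {
    0x0: FETCH,                       # NOP: nothing after the fetch
    0x1: FETCH + [IO | MI, RO | AI],  # LDA addr: load A from memory
}

STEP_BITS = 3  # assumed step counter width (up to 8 steps per instruction)


def build_rom():
    """Fill a ROM image addressed by (opcode << STEP_BITS) | step."""
    rom = [0] * (16 << STEP_BITS)
    for opcode, words in MICROCODE.items():
        for step, word in enumerate(words):
            rom[(opcode << STEP_BITS) | step] = word
    return rom


def control_word(rom, opcode, step):
    """Look up the control word for a 4-bit opcode at a given step."""
    return rom[((opcode & 0xF) << STEP_BITS) | (step & 0x7)]


rom = build_rom()
print(bin(control_word(rom, 0x1, 2)))  # third step of the assumed LDA
```

Unused steps stay zero, meaning no control line is asserted for the rest of that instruction's cycle.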

It’s disappointing how it only has 16 bytes of RAM so you can’t really build higher levels of abstraction like you can with nand2tetris. But at that point you could (I should) either redo it with a better design (and put it on PCBs) or move on to the 6502 project, and then since that puts together a timer, CPU, ROM, RAM, I/O, UART, etc. mentally group those together and move on to microcontrollers that already have them together.

Anyone interested in reading about how a CPU could be made out of logic gates could also read Code by Charles Petzold (moves slower, recently updated) and/or Pattern on the Stone by Danny Hillis (moves faster).

Edit: I just checked Code (2nd edition) and that uses a 4 bit cycle counter and hard logic gates to determine what to do each cycle. But then it uses an array of diodes for part of the logic. Would that be considered microcode?

[0] there were classes that covered more advanced (pipelined) CPUs in another CS class but not at quite a low level where you felt like you could make one yourself

bmenrighMay 23, 2026
The black box analysis needed to decode this is incredibly hard but also incredibly fun and rewarding to pull off. Very impressive work.
liendolucasMay 23, 2026
> ...they mentioned that it would be interesting to get high resolution images of the 80386 die and try to extract the microcode from it.

Can someone explain how the microcode can be reconstructed from a high resolution image of the die? I'm really curious, what's the process? Is the output some sort of Verilog? Does the process involve recognizing each and every transistor and modeling a circuit from that? I'm fascinated that something like this is possible at all...

dborehamMay 23, 2026
The microcode is in a ROM. It's a regular structure where a 1 looks different to a 0.
jdblairMay 23, 2026
Yes, literally this. No verilog decode, just looking for signals in the image of a 1 vs. a 0. For example, a 1 may be the existence of a transistor at a particular intersection of wiring.
liendolucasMay 23, 2026
So what you actually need is a program that navigates through the huge image of the die and detects whether the structure it is looking at is a 1 or a 0? At the fundamental level, is this a cross between machine learning and image processing?
bri3dMay 23, 2026
Yes, exactly. Historically you would make some simple image processing software that will align the grid and then look for properties at each specific bit position. Usually die shots are highly imperfect (the delayering usually leaves some artifacts or damage) so frequently merging multiple scans is important as well. Travis Goodspeed has a neat tool for this workflow at https://github.com/travisgoodspeed/maskromtool and the blog mentions John McMaster’s bitract: https://github.com/SiliconAnalysis/bitract although I think most people working on these projects usually just one-off it as the mentioned Discord users in the blog post eventually did.
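As a very simplified sketch of the "align the grid, then test each bit position" idea, plus merging several scans: this assumes the image is already aligned and given as rows of grayscale values, that a dark cell means 1 (the polarity depends on the chip and the imaging), and all function and parameter names are made up for the example. It is not how maskromtool or bitract work internally.

```python
def read_bits(image, x0, y0, pitch_x, pitch_y, rows, cols, cell=2, threshold=128):
    """Sample a regular grid; a cell is 1 when its mean brightness is below threshold."""
    bits = []
    for r in range(rows):
        row_bits = []
        for c in range(cols):
            top, left = y0 + r * pitch_y, x0 + c * pitch_x
            pixels = [image[y][x]
                      for y in range(top, top + cell)
                      for x in range(left, left + cell)]
            mean = sum(pixels) / len(pixels)
            row_bits.append(1 if mean < threshold else 0)
        bits.append(row_bits)
    return bits


def majority_vote(scans):
    """Merge several bit grids of the same shape, cell by cell."""
    rows, cols = len(scans[0]), len(scans[0][0])
    return [[1 if 2 * sum(s[r][c] for s in scans) > len(scans) else 0
             for c in range(cols)]
            for r in range(rows)]


# Tiny synthetic "die shot": four 2x2 cells, dark on the diagonal.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 0, 0],
    [255, 255, 0, 0],
]
print(read_bits(image, 0, 0, 2, 2, rows=2, cols=2))
```

With an odd number of scans the vote can never tie, which is one reason to image the same region three or five times.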

More modern devices are of course more difficult due to layers, feature size, and less visually obvious ROM bit designs.

Anyway, the impressive part of this project was really understanding the undocumented microcode assembly language through inference and trace following; the 1s and 0s look like they were the easy part!

photochemsynMay 23, 2026
The full workflow seems to look something like this, with the added complications relative to the 8086 microcode being that the 80386 microcode acts as an orchestration layer on top of hardwired engines, programmable logic arrays, and fault/protection redirection. The 8086 microcode does all that algorithmically, reusing the same hardware instead of having dedicated transistors.

1. Extract the ROM bits.
2. Determine physical-to-logical bit ordering.
3. Identify microinstruction boundaries.
4. Infer field boundaries.
5. Associate fields with hardware destinations (check with die tracing).
6. Decode instruction-dispatch programmable logic arrays.
7. Associate x86 instructions with microcode entry points.
8. Infer repeated idioms: moves, ALU ops, termination, calls, tests, redirects.
9. Decode accelerator protocols.
10. Validate against known architectural behavior.
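Step 2 is the easiest one to illustrate in code. A minimal sketch, assuming each physical row holds exactly one word and the only scrambling is a fixed column permutation; real ROMs often interleave several words per row, and the column order below is invented.

```python
def to_logical_words(physical_rows, column_order):
    """Reorder each physical row's bits and pack them into an integer, MSB first."""
    words = []
    for row in physical_rows:
        value = 0
        for col in column_order:
            value = (value << 1) | row[col]
        words.append(value)
    return words


# Hypothetical case: the columns are simply mirrored on the die.
physical = [[1, 1, 0, 0]]
print(to_logical_words(physical, column_order=[3, 2, 1, 0]))
```

In practice the permutation is itself something you infer, for example by trying candidates until known instruction behavior starts to make sense.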

electrolyMay 23, 2026
I helped out on this image-to-bits transcription, doing manual verification of the automated work. I did the whole thing by hand: I sliced the ROM images into strips that excluded parts of the image that don't encode bits, used my tablet and stylus to manually place a black dot on every 1 bit, then wrote a trivial program that detected the presence or absence of the black dot in each cell. From my perspective, the ROM is organized like a series of "ladders" where the 1 bits are missing legs of the ladder, and I was placing dots on the missing legs. I compared my results with the ML output and manually re-checked each bit where we disagreed.

http://brianluft.com/images/2026/05/386_microcode_bits.jpg -- my fully annotated result. I was working from a higher-quality PNG; this is highly compressed because it's a big image.
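The cross-check between two transcriptions described above could be as small as this. It is a sketch with made-up names, assuming both transcriptions are already stored as equally sized grids of 0s and 1s; it is not the program actually used on this project.

```python
def disagreements(manual, automated):
    """Return (row, col) for every cell where the two bit grids differ."""
    same_shape = len(manual) == len(automated) and all(
        len(m) == len(a) for m, a in zip(manual, automated))
    if not same_shape:
        raise ValueError("grids must have the same shape")
    return [(r, c)
            for r, (row_m, row_a) in enumerate(zip(manual, automated))
            for c, (m, a) in enumerate(zip(row_m, row_a))
            if m != a]


manual = [[1, 0], [0, 1]]
automated = [[1, 1], [0, 1]]
print(disagreements(manual, automated))  # cells to re-check by eye
```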

drob518May 23, 2026
Right. And the best way to think about microcode is as code for a wacky, custom VLIW processor that implements the programmer-level x86 (in this case) instruction set. Various fields in the microcode send signals to different parts of the processor to activate them, routing values along internal busses and between registers, functional units and memory to cause the processor to execute the x86 instructions.
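A toy illustration of that "wide word made of fields" view. The 16-bit layout and field names below are invented for the example and have nothing to do with the real 80386 microinstruction format.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    shift: int  # position of the field's least significant bit
    width: int  # number of bits in the field


# Hypothetical 16-bit microinstruction layout, for illustration only.
LAYOUT = [
    Field("src_reg", 12, 4),
    Field("dst_reg", 8, 4),
    Field("alu_op", 4, 4),
    Field("mem_read", 3, 1),
    Field("mem_write", 2, 1),
    Field("next", 0, 2),
]


def decode_microinstruction(word):
    """Slice one micro-word into named fields, each driving a different unit."""
    return {f.name: (word >> f.shift) & ((1 << f.width) - 1) for f in LAYOUT}


print(decode_microinstruction(0x12A9))
```

Every field is acted on in the same cycle, which is what makes the VLIW comparison apt: one word tells several units what to do at once.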
ddtaylorMay 23, 2026
Here's a video of some guys delayering the chips for the Nintendo 64 lockout mechanism. It's pretty in-depth and it goes over a lot of different ways they do this.

https://youtu.be/HwEdqAb2l50?si=VFLed64PZvpCHfy1

liendolucasMay 23, 2026
Thanks for sharing, will definitely watch it!
LevitatingMay 23, 2026
Just look at the images[1].

> The photo above shows part of the microcode ROM. Under a microscope, the contents of the microcode ROM are visible, and the bits can be read out, based on the presence or absence of transistors in each position.

[1]: https://www.righto.com/2020/06/a-look-at-die-of-8086-process...

trollbridgeMay 23, 2026
I checked reenigne's blog a few days ago. "Hmm, nothing posted since 2020. Oh well."

It's especially fun seeing his blog going back 33 years.

kgwxdMay 23, 2026
Maybe the hit counter increment was the inspiration for the post.
whentMay 23, 2026
Where's the hit counter? Mind pointing me to it? Can't find it anywhere in TFA.
ChrisClarkMay 23, 2026
He's making a joke. As in, "the site is so old, it probably still has a hit counter."
p1eskMay 23, 2026
Here’s a great book explaining microprogramming from ground up: https://www.amazon.com/Computation-Structures-Optical-Electr...

Easy to find a free pdf

yukIttEftMay 23, 2026
If you put this into an emulator, would it boot linux?
DweditMay 23, 2026
Meanwhile the original ARM didn't use any microcode at all.
danborn26May 23, 2026
This is an incredible piece of reverse engineering. Seeing the actual microcode implementation helps demystify how these older processors handled complex operations.
LevitatingMay 23, 2026
I wonder if an OpenFlexure[1] would be able to get such images.

[1]: https://openflexure.org/projects/microscope/

kiddicoMay 23, 2026
I'm absolutely going to make one of those