The microcode is in a ROM. It's a regular structure where a 1 looks different to a 0.
Yes, literally this. No Verilog decode, just looking for signals in the image of a 1 vs. a 0. For example, a 1 may be the existence of a transistor at a particular intersection of wiring.
Right. And the best way to think about microcode is as code for a wacky, custom VLIW processor that implements the programmer-level x86 (in this case) instruction set. Various fields in the microcode send signals to different parts of the processor to activate them, routing values along internal busses and between registers, functional units and memory to cause the processor to execute the x86 instructions.
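
To make the "fields send signals to different parts of the processor" idea concrete, here is a minimal sketch of slicing one microcode word into named fields. The field names, positions, and widths are made up for illustration; they are not the real 80386 (or 8086) layout.

```python
from typing import Dict, Tuple

# Hypothetical layout, NOT the real x86 microcode format:
# field name -> (lowest bit position, width in bits).
FIELDS: Dict[str, Tuple[int, int]] = {
    "src": (0, 5),      # which register drives the internal bus
    "dst": (5, 5),      # which register latches the bus value
    "alu_op": (10, 4),  # operation selected in the ALU
    "next": (14, 2),    # sequencing control (continue, jump, end, ...)
}


def decode_word(word: int, fields: Dict[str, Tuple[int, int]] = FIELDS) -> Dict[str, int]:
    """Split a microcode word into its fields by shifting and masking."""
    return {
        name: (word >> low) & ((1 << width) - 1)
        for name, (low, width) in fields.items()
    }


# Build a word from known field values, then decode it again.
example = (2 << 14) | (9 << 10) | (3 << 5) | 17
print(decode_word(example))
```

Each field value would then select a bus source, a destination latch, an ALU operation and so on, which is why it reads like a very wide VLIW instruction.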
So what you actually need is a program that navigates through the huge image of the die and detects whether the structure it is looking at is a 1 or a 0? At the fundamental level, is this a cross between machine learning and image processing?
Yes, exactly. Historically you would write some simple image processing software that aligns the grid and then looks for properties at each specific bit position. Die shots are usually highly imperfect (the delayering tends to leave some artifacts or damage), so merging multiple scans is frequently important as well. Travis Goodspeed has a neat tool for this workflow at https://github.com/travisgoodspeed/maskromtool and the blog mentions John McMaster’s bitract: https://github.com/SiliconAnalysis/bitract although I think most people working on these projects just write a one-off tool, as the Discord users mentioned in the blog post eventually did.
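
A rough sketch of that kind of one-off tool, assuming the image is already aligned so that bit cells sit on a regular grid, and assuming a 1 simply looks brighter than a 0 (real ROMs may be the other way around or need a different feature entirely). All function and parameter names here are my own, not taken from maskromtool or bitract.

```python
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # grayscale pixels, row-major, 0..255


def sample_bits(img: Image, origin_x: int, origin_y: int,
                pitch_x: int, pitch_y: int, rows: int, cols: int,
                radius: int = 1, threshold: int = 128) -> List[List[int]]:
    """Average a small window around each grid intersection and threshold it."""
    bits = []
    for r in range(rows):
        cy = origin_y + r * pitch_y
        row_bits = []
        for c in range(cols):
            cx = origin_x + c * pitch_x
            window = [img[y][x]
                      for y in range(cy - radius, cy + radius + 1)
                      for x in range(cx - radius, cx + radius + 1)]
            mean = sum(window) / len(window)
            row_bits.append(1 if mean >= threshold else 0)
        bits.append(row_bits)
    return bits


def merge_scans(scans: Sequence[Sequence[Sequence[int]]]) -> List[List[int]]:
    """Majority vote per bit across several scans of the same ROM.

    A tie (possible with an even number of scans) resolves to 0 here;
    in practice you would want to flag those bits for manual review.
    """
    rows, cols = len(scans[0]), len(scans[0][0])
    return [[1 if 2 * sum(s[r][c] for s in scans) > len(scans) else 0
             for c in range(cols)]
            for r in range(rows)]
```

With three scans, a bit damaged in one of them is still recovered as long as the other two agree.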

More modern devices are of course more difficult due to layers, feature size, and less visually obvious ROM bit designs.

Anyway, the impressive part of this project was really understanding the undocumented microcode assembly language through inference and trace following; the 1s and 0s look like they were the easy part!

The full workflow seems to look something like the list below. The added complication relative to the 8086 is that the 80386 microcode acts as an orchestration layer on top of hardwired engines, programmable logic arrays, and fault/protection redirection, whereas the 8086 microcode does all of that algorithmically, reusing the same hardware instead of having dedicated transistors.

1. Extract the ROM bits.
2. Determine physical-to-logical bit ordering.
3. Identify microinstruction boundaries.
4. Infer field boundaries.
5. Associate fields with hardware destinations (check with die tracing).
6. Decode instruction-dispatch programmable logic arrays.
7. Associate x86 instructions with microcode entry points.
8. Infer repeated idioms: moves, ALU ops, termination, calls, tests, redirects.
9. Decode accelerator protocols.
10. Validate against known architectural behavior.
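
Steps 2 and 3 in the list above can be sketched roughly as follows, assuming the simplest possible case: each physical ROM row holds exactly one microinstruction, and the only unknowns are the column permutation and the polarity. Real ROMs often interleave several words per row, so treat the mapping here as a hypothetical placeholder that would have to be worked out from the die.

```python
from typing import List, Sequence


def to_words(matrix: Sequence[Sequence[int]],
             logical_from_physical: Sequence[int],
             invert: bool = False) -> List[int]:
    """Turn a physical bit matrix into logical words.

    logical_from_physical[p] is the logical bit position (0 = LSB) stored
    in physical column p. Set invert=True if a visible feature means 0.
    """
    words = []
    for row in matrix:
        word = 0
        for phys, bit in enumerate(row):
            if invert:
                bit ^= 1
            word |= bit << logical_from_physical[phys]
        words.append(word)
    return words


# Two 4-bit rows where physical column 0 is the most significant bit.
rom = [[1, 0, 0, 1],
       [0, 1, 1, 0]]
print(to_words(rom, [3, 2, 1, 0]))               # [9, 6]
print(to_words(rom, [3, 2, 1, 0], invert=True))  # [6, 9]
```

In practice you would try candidate permutations and polarities until the output shows structure, such as repeated idioms or plausible jump targets, which is where steps 4 onward take over.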