The backend is closed source, but it all runs on AWS Lambda/DynamoDB/API Gateway and is written in Rust. Getting the compiler running in a Lambda was an adventure of its own.
rybosome•Jun 28, 2026
I’d be interested in hearing more detail on that. I’m actually surprised you were able to get the compiler; I assumed it would be expensive and proprietary.
Retr0id•Jun 28, 2026
I'd love it if there was some way to contribute to ongoing game decompilation projects, with a similarly streamlined web interface - it's something I'd be willing to dedicate some brain time to every so often, but setting up the toolchain etc. feels too much like work.
By the way, I was able to "cheat" on the second lesson with
That's what decomp.me is for. When I'm stuck on a function in my own projects I usually set it up on there and link it in the codebase so anyone can pick it up. Sometimes I like to browse the front page and hope I know enough to silently match somebody else's function (usually stays as a hope though...)
jackpriceburns•Jun 28, 2026
decomp.me is also a great tool! The playground section of the site allows you to turn the code into a decomp.me scratch.
I also use the objdiff wasm on the frontend for the assembly diffing. I don't see much point in reinventing the wheel and these tools are already great, so I'll deffo be leaning on them when I can
jackpriceburns•Jun 28, 2026
I was thinking of something similar as well, perhaps a section of this site after you've completed the course where we show functions from popular decomp projects that aren't 100% matched, and you can match it. Doing so will magic up a PR or something.. It's a great idea!
As for cheating, the community calls this a fake match. I don't check that the code you submit conforms to what I expect, I only check if the assembly matches. You can do interesting things where you do a series of bit shifts and bit masks, and you can replicate an equality operator `a == b` or a low clamp `x < 0 ? 0 : x`. I'm not sure if I'll lock this down or not. For people who have accounts, I can see their submissions, so I think I'll play it by ear and see what happens. If it looks like people are constantly fake matching, I can look at tweaking the lessons or locking it down more.
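A minimal sketch of what such a fake match might look like in C. The function names are made up for illustration and this is not the site's lesson code. The "plain" versions are what a lesson would presumably expect; the "bits" versions compute the same results with shifts and masks. Whether a given compiler emits identical assembly for both forms is not guaranteed and depends on the compiler version and flags.

```c
#include <stdint.h>

/* Straightforward forms, as a lesson would presumably expect them. */
int32_t clamp_low_plain(int32_t x) { return x < 0 ? 0 : x; }
int is_equal_plain(uint32_t a, uint32_t b) { return a == b; }

/* Low clamp without a comparison: build a mask that is all ones when
 * x is negative, then use it to clear every bit of x. */
int32_t clamp_low_bits(int32_t x) {
    uint32_t ux = (uint32_t)x;
    uint32_t mask = 0u - (ux >> 31);   /* 0xFFFFFFFF if x < 0, else 0 */
    return (int32_t)(ux & ~mask);
}

/* Equality without ==: for d = a ^ b, the top bit of (d | -d) is set
 * exactly when d is nonzero, so shifting it down and flipping it
 * yields 1 for equal inputs and 0 otherwise. */
int is_equal_bits(uint32_t a, uint32_t b) {
    uint32_t d = a ^ b;
    return (int)(((d | (0u - d)) >> 31) ^ 1u);
}
```

Both pairs agree on every input, which is why checking only the assembly cannot tell them apart once the compiler happens to produce the same instructions.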
saturn8601•Jun 28, 2026
Damn this is next level. Congratulations on your achievements!
When Fable was around I thought I'd test it by taking an old piece of Windows software from the late 90s/2000s (ModPlug Player) and seeing how well it could convert it to being a native Mac application.
I was blown away at how it got 85% of the way there in one prompt. Things such as writing a PE extractor, recovering the complete skin, menu tree, full accelerator table, all dialogs, and then it delved into the registry value names as well. Some more prompts got it to 99% (I was happy with that and stopped).
I then took an old 1999 DOS demoscene and yet again it did wonderful magic and got me a native Mac build.
I dropped everything I was doing and just started going through all these old apps that I couldn't easily enjoy since I'm on a Mac. It got to the point where I was losing sleep over it (was just so excited).
The fun ended when I was stopped mid-project with the Fable ban. Opus just does not compare and essentially killed all the enthusiasm after the nth failure of it to complete the task.
It made me realize that between the efforts of the RE community and the emerging capabilities of these frontier models, we could in the future be living in a renaissance of open computing, if we want it: any software we see on the market could be forever remixed, tailored to our uses, and completely open.
I don't know how the business and legal side will deal with this. There needs to be new frameworks and ways of thinking about this stuff.
I'm just happy that hopefully no code will ever be lost to the sands of time ever again.
jackpriceburns•Jun 28, 2026
AI is being used in many retro game decomp projects!
One of the reasons I went down the path of learning decomp myself was because AI had hit a wall. Matching decomp is quite a bit harder than just normal decomp, as even simple things like using an if/else instead of a ternary actually change the assembly. AI did an amazing job of getting to 95% matches on nearly all functions, but once it got to that tail end, it started to struggle quite a lot and would often just claim "it's impossible". So that's when I pivoted and started learning actual decomp myself so that I could prompt AI better and finish off the Star Fox Adventures decomp!
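The if/else versus ternary point is easy to try out. The two hypothetical functions below are semantically identical, yet a compiler may order branches or allocate registers differently for each, so only one form may match a given target. Which one matches, if either, depends on the compiler version and flags, so treat this as a sketch rather than a claim about any particular toolchain.

```c
/* Same logic written with a conditional expression... */
int max_ternary(int a, int b) {
    return a > b ? a : b;
}

/* ...and with an explicit if/else. Behaviour is identical, but the
 * generated instructions are not required to be. */
int max_if_else(int a, int b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}
```

Compiling both and diffing the output is a quick way to see whether the compiler at hand treats them differently.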
ducktective•Jun 28, 2026
Before the advent of LLMs, ML was used in upscaling the assets and pre-rendered backgrounds of the first 3 classic Resident Evil games: https://www.reshdp.com
jonhohle•Jun 28, 2026
I say this every time it comes up, but polluting a decomp project with AI generated code is risky, imho. What makes decomp legal (in the US) is that it’s a creative transformation performed by a human and the resulting copyright of the code that just happens to compile to the same binary is owned by the person doing the decompiling.
USPTO and court precedent is leaning heavily toward LLM output not being transformative on its own, making it mechanical, and no longer fair use and in violation of copyright. This puts a legal gray cloud on a project where most contributors couldn’t defend themselves if a rights holder goes after it, and there’s a high likelihood that they would succeed. On the other hand there’s enough case law protecting human decompilation that even the most litigious game companies don’t go after decomp projects that have historically been done by humans.
(I’m not a lawyer, I’m not your lawyer, this is not legal advice, etc., etc.)
asiekierka•Jun 28, 2026
Does being a creative transformation strip it of derivative-work status? Personally, I'd liken the process of decompilation to that of translating a book from one language to another - the copyright on the original work does not become void merely because the process of translation requires extensive creativity.
Nicalis and Take-Two have both gone after decompilation projects, also. In particular, Nicalis has gone after a decompilation of Cave Story, but not a black box reimplementation of the same, while Take-Two ended up suing a decompilation developer (albeit settled out of court). However, in some jurisdictions, even clean reimplementations have failed - see Tetris v. Xio.
(I am not a lawyer either, etc etc, but that's my understanding)
jonhohle•Jun 28, 2026
The RE3 devs were distributing binaries. This is known to be an issue. The source code is theirs, binaries mixed with other copyrighted content is not. They also allegedly violated a EULA, but I haven’t looked closely into that.
CSE2 was distributing binaries as well.
So was SM64 decomp and Nintendo told them to stop, they did and continued to share their source code.
Tetris v. Xio is unrelated to reverse engineering or decompilation.
paavohtl•Jun 28, 2026
> The source code is theirs, binaries mixed with other copyrighted content is not.
Distributing binaries should not matter. If the binary is just compiled from the source code, the binary is just as (non-)infringing as the source code.
> They also allegedly violated a EULA
Meaningless. EULAs are not the law.
charcircuit•Jun 28, 2026
These decomp projects are already violating copyright by distributing the decompiled source code. Using LLMs is less risky than sharing the code.
koala_man•Jun 28, 2026
Matching decomp would require the same compiler and flags as the original game, right? How is that determined?
StilesCrisis•Jun 28, 2026
Experimentally, I think. There's only a few dozen options, and you can winnow it down to just a few pretty quickly.
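A rough sketch of that experimental winnowing, under loud assumptions: the candidate flag strings are placeholders, and `pretend_build_matches` is a stand-in for the real step of compiling a known function with those flags and diffing the result against the retail binary.

```c
#include <stddef.h>
#include <string.h>

/* Callback type: returns nonzero when building with `flags` reproduces
 * the original output. A real implementation would run the compiler. */
typedef int (*build_matches_fn)(const char *flags);

/* Try each candidate in order and return the first one that matches,
 * or NULL when none of them do. */
const char *first_matching_flags(const char *const *candidates, size_t count,
                                 build_matches_fn build_matches) {
    for (size_t i = 0; i < count; i++) {
        if (build_matches(candidates[i])) {
            return candidates[i];
        }
    }
    return NULL;
}

/* Stand-in check for illustration only: pretends the original game
 * was built with "-O2". */
int pretend_build_matches(const char *flags) {
    return strcmp(flags, "-O2") == 0;
}
```

In practice the same loop would also iterate over candidate compiler versions, since the version matters as much as the flags.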
ducktective•Jun 28, 2026
Makes one wonder, why should anyone embark on learning this intricate and time-consuming art of reverse engineering when LLMs are on their way to automate it in seconds...
hsuduebc2•Jun 28, 2026
You can say this about basically anything software-related these days. Yet when a human is in the loop, insight is still needed.
nosioptar•Jun 28, 2026
This is cool as hell.
On the first lesson, it tells me there's a target on "the right". There isn't anything to the right, I've no clue where to look.
jackpriceburns•Jun 28, 2026
Are you on mobile? You'll need to switch to the code/review tab to see. I think mobile support is a bit funky, I'll look at fixing that as soon as I can!
nosioptar•Jun 28, 2026
I switched to desktop view, still couldn't tell where I was supposed to be looking.
OsrsNeedsf2P•Jun 28, 2026
Dumb question about reverse engineering binaries: is there a way to only do it piecemeal? I'm eventually waiting for LLMs and harnesses to get good enough to reverse engineer BFME (old Lord of the Rings game that still has an active modding community), but it's a multi GB sized game that would have to be done in bite-sized pieces.
Basically; can you reverse engineer in bite sized pieces, and recompile/customize their behavior, without needing to do it all at once?
jonhohle•Jun 28, 2026
Most decomp projects (that I know of) are Ship of Theseus style projects where the minimum unit is a function, give or take alignment requirements and quirks of the compiler. On the MIPS side, tools like Splat and SPIM can help identify function and even source file boundaries, generate inline ASM C files[0], and write linker scripts to build a matching binary. You can then go through and replace the ASM functions one at a time until you just have C left.
Interesting when you mention Ship of Theseus, I never thought of that but I wonder if that is where the name “Ship of Harkinian” comes from?
jevndev•Jun 28, 2026
It has a double meaning actually! The Ship of Theseus reference like you noted, and the “Harkinian” part being the name of the king of Hyrule from the CD-i games. One of his lines is “Enough! My ship sails in the morning” [0] so the project is also a reference to his actual ship. (Referenced in the project's FAQ [1])
Have you tried? I haven't tried anything huge but I've had LLMs decompile SNES ROMs for me.
paavohtl•Jun 28, 2026
Yes, quite easily. It requires some setup, but the basic idea is that you create a DLL and a simple loader program which injects it into your target process. You can then use a hooking library like MinHook to replace individual functions with your own implementations. If the target application is in C++, you can additionally do vtable hooking and replace functions even more easily (though it will always be a combination of the two techniques).
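To make the vtable idea concrete, here is a portable analogue in plain C. It is only a sketch: it does not use MinHook or DLL injection, every name is hypothetical, and the explicit `struct unit_vtable` stands in for the table a C++ compiler would generate. In a real process the table usually sits in read-only memory, so the page protection has to be changed before a slot can be overwritten.

```c
#include <stddef.h>

struct unit;
typedef int (*take_damage_fn)(struct unit *self, int amount);

/* Explicit stand-in for a C++ vtable: one slot per virtual function. */
struct unit_vtable {
    take_damage_fn take_damage;
};

struct unit {
    struct unit_vtable *vtable;
    int health;
};

/* The "original game" behaviour. */
static int original_take_damage(struct unit *self, int amount) {
    self->health -= amount;
    return self->health;
}

static struct unit_vtable game_vtable = { original_take_damage };

/* Saved original slot, so the replacement can still call through. */
static take_damage_fn saved_take_damage = NULL;

/* Replacement: halve incoming damage, then defer to the original. */
static int hooked_take_damage(struct unit *self, int amount) {
    return saved_take_damage(self, amount / 2);
}

/* Overwrite the slot once, remembering what was there before. */
void install_hook(struct unit_vtable *vt) {
    if (saved_take_damage == NULL) {
        saved_take_damage = vt->take_damage;
        vt->take_damage = hooked_take_damage;
    }
}

struct unit make_unit(int health) {
    struct unit u = { &game_vtable, health };
    return u;
}

/* Dispatch through the table, as a virtual call would. */
int unit_take_damage(struct unit *u, int amount) {
    return u->vtable->take_damage(u, amount);
}
```

Because every object of the type shares one table, patching a single slot changes the behaviour of all of them without touching the original function's code.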
not_a9•Jun 28, 2026
There’s also fun stuff like VEH hooking and SLAT hooking, though SLAT hooks are not very useful in this case.
Retr0id•Jun 28, 2026
Most of those GB are probably data rather than executable code, it might not be quite as bad as you're imagining.
sciencejerk•Jun 28, 2026
I recently heard that the Super Mario 64 (N64) modding community reverse engineered the game enough to recreate more-or-less accurate C code that can be compiled into binaries to execute on many popular target architectures.
Have you managed to get beloved games into modifiable C code? Or is it more common to invest a lot of work to document assembly language functions? I know some old assembly but have no idea what is involved at a high level. Maybe you explain in your lessons?
Also, how do folks obtain binaries? Presumably unless there is a source code breach or vulnerability, source never gets exposed, is that correct?
xahrepap•Jun 28, 2026
Mario 64 was byte for byte decompiled to C. It was helped by using the Debug symbols accidentally(?) compiled into the final version of the game.
Otherwise they reference rips of the original game.
Randomno•Jun 28, 2026
> It was helped by using the Debug symbols accidentally(?) compiled into the final version of the game.
Don't think this was the case, what helped was it being compiled without optimizations.
jackpriceburns•Jun 28, 2026
Most games will have been written in a higher level language first (like C or C++) and then compiled into assembly. With matching decomp, we write C, compile it to assembly, and see how much it matches the retail assembly. Using this we can write C that we theorise is almost identical to how it would have been written originally.
Some things are lost during the compilation process (like comments, function names, etc.). Names we have to assign manually, and it's a long process! But yes, the goal is to have C at the end, and once you have C you can recompile using a different compiler and target any architecture you like.
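The "see how much it matches" step can be sketched as a naive positional comparison. This assumes fixed-width 4-byte instructions, as on the GameCube's PowerPC CPU, and the function names are hypothetical. Real diffing tools such as objdiff are smarter about relocations and shifted instructions; this only shows the basic idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count positions where the two instruction streams hold the same word. */
size_t count_matching_words(const uint32_t *target, const uint32_t *candidate,
                            size_t count) {
    size_t matched = 0;
    for (size_t i = 0; i < count; i++) {
        if (target[i] == candidate[i]) {
            matched++;
        }
    }
    return matched;
}

/* Integer match percentage. Truncating division means 100 is returned
 * only when both functions have the same length and every word is equal. */
unsigned match_percent(const uint32_t *target, size_t target_len,
                       const uint32_t *candidate, size_t candidate_len) {
    assert(target_len > 0);
    size_t common = target_len < candidate_len ? target_len : candidate_len;
    size_t longest = target_len > candidate_len ? target_len : candidate_len;
    size_t matched = count_matching_words(target, candidate, common);
    return (unsigned)(matched * 100 / longest);
}
```

A function counts as matched only at 100; anything lower means at least one instruction word still differs from retail.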
teaearlgraycold•Jun 28, 2026
I suspect LLMs would do a good first pass with function naming.
jackpriceburns•Jun 28, 2026
They do an alright job, but they deffo need human-in-the-loop for the best results. I wrote on my blog about how I wrote an MCP to communicate with the Dolphin emulator. Allows Claude Code to set breakpoints, read the memory, write the memory, etc. It was a super fun way to work with Claude and has made naming functions/structs/fields/etc much easier! https://jpb.dev/blog/dolphin-debugger-mcp/
bottlepalm•Jun 28, 2026
I like decomp, but it makes me nervous. Like, how safe is it to decompile a game and publish it to, like, GitHub with all the symbols, addresses, etc.?
charcircuit•Jun 28, 2026
The risk is low assuming you respect a potential take down notice that comes in.
dezgeg•Jun 28, 2026
Quite safe in practice, even Nintendo games have had no issues. GTA 3 / Vice City decompilers did get sued though, but IIRC mainly because they did not comply with DMCA requests at all.
soxfox42•Jun 28, 2026
Seems like a cool idea, but I can't even complete the first task. The compiler service seems to be broken, since in both lessons and the playground I just get "Could not write source: No space left on device (os error 28)".
jackpriceburns•Jun 28, 2026
Just deploying a fix for this right now! Didn't expect this to be so popular haha
oneshtein•Jun 28, 2026
Dumb question: can LLMs be used to reverse-engineer firmware blobs or binary-only drivers for Linux, to create open-source drivers, for example, for unsupported smartphones?
LelouBil•Jun 28, 2026
That is already happening for old games, and while they usually run on simpler CPUs than modern ones, I don't see why this couldn't be possible for binary Linux drivers.
The toolchain would also be easier to match, unless they were using some proprietary compiler you can't get your hands on.
Just look up how they match the toolchain, and find an agent harness to do decompilation.
I wonder if doing this kind of stuff with more recent software will cause more legal problems though. I am not really sure of the legal status of the resulting code.
tudelo•Jun 28, 2026
Most likely, this (reverse engineering) is one of the numerous things these LLM companies target. You can also assume all of the internet has been slurped up into any frontier model. That doesn't mean what you want will be a one-shot prompt though...
realusername•Jun 28, 2026
As I'm working in this area, I can say that the biggest problem isn't binary only drivers but spotty mainline support of core features (provided that we're talking about Android phones, not iPhones or old Windows mobile stuff)
RossBencina•Jun 28, 2026
From where you're standing, what does the path to improving mainline support of core features look like? Aside from sheer volume of work are there other major challenges or opportunities? In the past I've hacked around adding stuff to Armbian kernels, but navigating a path to mainline seemed like a whole other job.
realusername•Jun 28, 2026
It really depends on your device, but some SoCs have very poor mainline support and are essentially living in a forked repo with nothing sent to lkml; others have partial support in mainline but with some of the core features being broken.
Drivers can also be partially implemented / buggy depending on the device. (And I'm not talking about closed source stuff, even things like usb or touch screen drivers).
I would say that the major blocker is that nobody really cares about these devices and the task itself to mainline them is gigantic.
Most SoCs have 3 or maybe 4 contributors maximum and the phonedev kernel mailing list gets a new patch only every 2 days.
The whole phone mainlining community is probably less than 60 people.
supaflybanzai•Jun 28, 2026
Is there something similar to learn ARM decompiling?
jackpriceburns•Jun 28, 2026
I'm not aware of anything that's similar to this for ARM.. but in the future there is certainly room to add this. For now I'm focusing on getting the current set of lessons up to scratch as a lot of them need improvement. Then I think I want to add C++ as a language and allow you to change which version of MWCC you can run
dataflow•Jun 28, 2026
> If even 1 instruction or bit is off, that's a fail.
Does this assume having access to the exact version of the compiler used, or can it be done with a different compiler in practice?
And do you care about things like binary layout, or just instruction match? (Does that ever matter in practice?)
dezgeg•Jun 28, 2026
Yes, exact version of the original compiler is required.
Generally bit-for-bit equivalence to the original executable is expected. However I think for some cases where the original executable included debug info (eg. PS2 ELFs) then the unused-at-runtime sections need not match.
HiPhish•Jun 28, 2026
Love the idea! Assembly has been on my forever-list to eventually learn. I have worked through the warmup exercises, but I doubt I'll have the time to continue much past that. A few points of note:
- Not a fan of the purple theme, it screams "AI-generated". It's not a deal breaker, you can keep it if you have more important concerns, but just something to point out
- It would be nice to have a "Chapter 0" for a primer on assembly syntax. Does not have to be interactive, a few toy examples I can work out on paper would be good enough
- Maybe I just haven't seen it, but it would be nice to have a reference of all the various instructions. Your lessons explain them well enough, but I would like to have a list of all of them at a glance so I can look up instructions from earlier chapters later.
allan_s•Jun 28, 2026
Is "the layout screams AI-generated" the new Bootstrap? I.e. what people use to quickly convey their ideas (bad or good).
Ironically, for myself at least, the less it looks AI, the more time I've spent prompting for it to not look AI.
Klonoar•Jun 28, 2026
The purple theme seems fine and appropriate to me, given the GameCube’s color scheme - and some of the site feels almost Slippi-esque so there’s a bit of that angle too.
jackpriceburns•Jun 28, 2026
- The purple theme was a conscious choice, not an AI's decision. I wanted to match the colour theme of the GameCube. You can see the current colour palette here: https://decomp-academy.dev/brand/brand-sheet (this will likely go through changes in the near future), but I'd like to stick to the purple theme as a nod to the GameCube
- I've just added some new lessons before the first lesson to explain assembly syntax!
- I believe there are around 200-odd instructions, and a lot of them I'm not sure you'll ever see, but I can certainly look at adding something like this
eunos•Jun 28, 2026
Feasible to have light theme?
jackpriceburns•Jun 28, 2026
Will look at adding this!
StilesCrisis•Jun 28, 2026
This was interesting, but actually trying to contribute to decomp.me was still really hard! I found a lot of code that seemed perfect except for instructions slightly out of order, or dead pop statements after the logical end of the function. I wasn't able to actually fix anything :(
Also, I wish there were a guide about how to start from nothing on a new GC game. That's more interesting to me than putting the finishing polish on a decomp project that already "works" functionally.
jackpriceburns•Jun 28, 2026
There will deffo be a lesson on how to set up a decomp project from scratch! This is far from finished, there's opportunity to add so much more. I'd argue that I'd rather teach the user the basics of matching first before diving into setting up a project, as setting up the splits.txt or symbols.txt might be quite a leap for a beginner. Feel free to keep checking in once a week and the lesson will appear :)
bpavuk•Jun 28, 2026
I'd enjoy having something like this but for x86/64 :)
jackpriceburns•Jun 28, 2026
The lessons taught in this can be applied to x86/64, but I can deffo look at adding different instruction sets in the future
__alexander•Jun 28, 2026
Hi OP, I would be careful if you profit from this project in any way. Nintendo is not a company that takes reverse engineering of their hardware or games lightly. You will probably hear from their lawyers next week :(
jackpriceburns•Jun 28, 2026
No plans to profit at all from this!
rowanG077•Jun 28, 2026
That's awesome! I have been playing with the idea of doing a Tales of Symphonia decomp. More than 10 years ago I started the high-res texture pack, which the community has now carried further than I ever did. But it would be totally awesome to be able to mod the game further and run it on other consoles without the godawful framerate downgrade.
not_a9•Jun 28, 2026
Interesting exercise. Why does 1/1 matching to original asm matter, though? Maintaining same timing as original game?
Philpax•Jun 28, 2026
Being absolutely sure that your code behaves the same as the original game, so that you have a strong baseline to go from.
StilesCrisis•Jun 28, 2026
Having a baseline "this builds XYZ exactly as it shipped" is just the strongest possible guarantee that it's a fully accurate decomp with no surprises or bugs. Obviously you can have an interesting, useful decomp that works either way but it's much harder to prove that it's perfectly faithful.
nativeforks•Jun 28, 2026
The browser-first approach is a bigger deal than it sounds. Every time I've looked at reverse engineering, I got stuck somewhere between "install this ancient compiler" and "patch this SDK". Being able to just open a tab and start experimenting removes a huge amount of friction.
I gave up at https://decomp-academy.dev/lesson/workflow-what-matching-mea... when I was presented with a wall of LLM-flavoured text
0 - for example: https://github.com/Xeeynamo/sotn-decomp/blob/master/src/boss...
[0] https://youtu.be/JmxGLo_itEY?is=x85epFYBcPeRDxxh
[1] https://www.shipofharkinian.com/faq