Hacker News Clone

Zerostack – A Unix-inspired coding agent written in pure Rust(crates.io)

220 pointsby gidellavMay 16, 2026

17 Comments

throwa356262•May 16, 2026

"RAM footprint: ~8MB on an empty session, ~12MB when working"

I like this, Claude Code is using multiple gigabytes, which is really annoying on lowend laptops

tecoholic•May 16, 2026

Yes. Just this fact is going to make a lot of people try it out.

marknutter•May 16, 2026

Isn't that because of the context window size?

SatvikBeri•May 16, 2026

The context window has nothing to do with RAM usage and even if it did, a million tokens of context is maybe 5mb.

vlovich123•May 17, 2026

It has nothing to do with local RAM usage. But a million tokens of LLM context is decidedly not 5mb.

The rough estimate is 2 * L * H_kv * D * bytes per element

Where:

* L = number of layers * H_kv = # of KV heads * D = head dimension * factor of 2 = keys + values

The dominant factor here is typically 2 * H_kv * D since it’s usually at least 2048 bytes. Per token.

For Llama3 7B youre looking at 128gib if you’re context is really 1M (not that that particular model supports a context so big). DeepSeek4 uses something called sparse attention so the above calculus is improved - 1M of context would use 5-10GiB.

But regardless of the details, you’re off by several orders of magnitude.

tujux•May 17, 2026

Pretty sure we're talking about the output text, not the tensors.

bluegatty•May 17, 2026

'A million tokens of context' is literally Terrabytes of KV cache VRAM on very expensive Nvidia silicon - on the model.

On the Agent, yes, the context window does relate to RAM, because the 'entire conversational history' is generally kept in memory. So ballpark 1M 'words' across a bunch of strings. It's not that-that much.

Claude Code is not inneficient because 'it's not Rust' - it's just probably not very efficiently designed.

Rust does not bestow magical properties that make memory more efficient really.

A bit more, but it's not going to change this situation.

'Dong it in Rust' might yield amazing returns just because the very nature of the activity is 'optimization'.

gidellav•May 16, 2026

Hi, I'm the developer of zerostack! No, the memory footprint is not beacuse of the context window size: on my benchmarks, with a 128k context loaded, and it jumped from 8MB (without any chat/context loaded) to 11MB.

The reasons why the memory footprint of zerostack are:

- Rust, and not JS/Python, so no interpreters/VMs on top

- Load-as-needed, so we only allocate things like LLM connectors when needed

- `smallvec` used for most of the array usage of the tool (up to N items are stored in stack)

- `compactstring` used for most of the string usage of the tool (up to N chars are stored in stack)

- `opt-level=z` to force LLVM to optimize for binary size and not for performance (even tho we still beat both in TTFT and in tool use time opencode)

- heavy usage of [LTO](https://en.wikipedia.org/wiki/Interprocedural_optimization#W...)

SwellJoe•May 16, 2026

The context window is not on your system. It's on the server with the model. There may be some local prompt caching, of some sort, but you're not locally hosting the context unless you're also locally hosting the model.

bluegatty•May 17, 2026

Chat history is kept locally, generally you have to send the 'whole history' to the model 'each turn'.

messh•May 17, 2026

The memory footprint is great, it allows finally running these coding agents in extra small instances -- say x1 on shellbox.dev

chrisweekly•May 17, 2026

Hmm, if they're this small something like smolmachines (like shellbox, but free and local) might be a great fit.

all2•May 17, 2026

I'm building an agent framework in golang and it is extremely light weight. Startup time is under 1/2 second, and RAM usage is really low. I have a 12 year old laptop and it happily runs without slowing down.

There's no reason what is essentially a string concat engine should be slow on any hardware, including old hardware.

rel•May 17, 2026

I've been trying to migrate over the zed and think they're Agent Client Protocol[1] is pretty neat, I wonder how much memory pressure Claude Code exerts if it is going through that mechanism instead

1: https://zed.dev/acp

esperent•May 17, 2026

Are you sure you don't have an LSP plugin or something running?

hparadiz•May 16, 2026

this is what I've been waiting for

a low level language. please no more scripting language TUIs!

iknowstuff•May 16, 2026

Isn’t codex in rust?

schaefer•May 16, 2026

There has been no reason to wait... Codex is written in rust.

-- So is deepseek-tui.

hparadiz•May 16, 2026

Forgot to add an open source qualifier. I use codex lol

andxor•May 16, 2026

Codex is also opensource.

hparadiz•May 16, 2026

I don't really want something owned by a company for my local stuff. I'd prefer it be small and minimalistic. Maybe in the future I'll change my mind and it will be more like a browser but for now I wanna keep it small and local.

gidellav•May 17, 2026

Thanks! I don't think that the only advantages are being open and lightweight, but you can actually find some more interesting features such as Ollama support, integrated Prompts (in order to compete with superpowers), git worktrees integration, and so on

nine_k•May 16, 2026

Rust, a language with affine types, generics, lifetimes, deep static analysis, hygienic macros, etc is not low-level. It's nearly as high-level as Haskell (without HKTs though).

It just does not rely on GC and allows to manage resources efficiently. This efficiency is partly due to its being so high-level.

gidellav•May 16, 2026

While I agree on the fact that it allows to manage resources efficiently, I don't agree on the fact the efficency derives from it being high-level; from a purely tecnical standpoint, i could skim off 2-3MB from the memory footprint by writing the code in pure C, as there are some unused parts of Rust's std that cannot be removed without recompiling std.

This is obv only a technical talk, as writing an AI TUI in pure C would be rather... ehhh

nine_k•May 17, 2026

That's why I said "part of its efficiency". Rust can do RAII, can optimize things more aggressively because of no aliasing ever in safe code, and because of known lifetimes, it can offer fearless concurrency™. Rust can also support highly optimized data representations (see how Optional works, or other ADTs, etc) which languages like Haskell, to say nothing of Python, cannot offer because of GC and boxing.

Lower-level languages like Zig or even Go, to say nothing of C, lack many of the high-level language features that power this efficiency.

onlyrealcuzzo•May 17, 2026

Agreed, Rust is way more expressive than people give it credit for.

sergiotapia•May 16, 2026

Given agent harnesses affect so much of the performance of models, it would be great to see some kind of benchmark on how this tool performs compared to claude/codex/opencode/pi etc.

gidellav•May 16, 2026

Hi! While I didn't try any agent benchmark, I already though of this possible issue, and I tried to approach it on two different levels:

1. The tools that are given to the agent are almost the same to the one defined in Opencode, except for Skills and Subagents (both features not implemented in zerostack)

2. Zerostack is prompt-based, so that it ships with a set of .md files, stored in ~/.config/zerostack/prompt, and that can be selected from the TUI in order to activate different 'agents': as you can see from the README, it is designed to contain the most important feautres of superpower + Claude's front-end design + git worktree support and Ralph Wiggum loops (both as integrated features)

esafak•May 17, 2026

It's been said before, but it is important to prospective users, so it bears repeating: screenshots and benchmarks, please; it helps users decide whether to invest time in it. The ability to transfer settings from other agents would be great too.

gidellav•May 17, 2026

1. I will add some screenshots tomorrow

2. As said before, there are no benchmarks right now, but it is good enough for me, so I hope it's good enough for y'all :)

3. Transfering settings from other agents is out-of-scope for a minimalstic coding agent, but the idea is that, apart from MCP server, the rest might just force you to learn how zerostack works, because of design choices such as not having Skills or having certain specialized tools integrated (worktrees and loops).

hiAndrewQuinn•May 16, 2026

The codebase was small enough that I handed it over to DeepSeek v4 Flash in Pi to skim through for any risky business, and I didn't find anything concerning. Nice work.

gidellav•May 16, 2026

Thanks! Funny enough, a good chunk of the coding was done by Deepseek v4 Flash, while I hand-wrote a couple of the TUI logic, as deepseek kept failing on certain cursor-moving logic, and I fully managed the memory optimization process (as you can read on another comment I left, it both a set of compiler optimizations and usage of certain Rust crates in order to leverage more efficient data structures).

hiAndrewQuinn•May 16, 2026

Taking notes and comparing this against my own (non coding agent) Rust TUI project, thank you! I'm new to Rust so this is a helpful baseline.

gidellav•May 17, 2026

No problem, happy to help!

kadoban•May 16, 2026

> I handed it over to DeepSeek v4 Flash in Pi to skim through for any risky business

Doesn't prompt injection make that a rather flimsy investigation?

koito17•May 17, 2026

Since the OP stated they used DeepSeek V4 Flash for generating a lot of the code, I decided to check whether there were any outdated dependencies. In my experience, with Rust projects, if you do not instruct models (even Claude 4.7 Opus) to use `cargo add` instead of manually editing the Cargo.toml, you will almost certainly get out-of-date dependencies added to your project.

Manually checking the dependencies used by this project, I was pleased to see they are all the latest version. That doesn't mean there are no issues lurking in transitive dependencies, of course.

As for getting an LLM to review the code, I think we can get all opinionated very fast. For instance, when I was eyeballing the code, some of the enum methods converting to/from strings made me think "this could've been a single #[derive] with strum." That would make the code in provider.rs a lot more concise, at the cost of importing one crate (with no dependencies!)

Lastly, for fun, I decided to get DeepSeek V4 Pro (with Max thinking) to "audit" the codebase. The output mentioned no obvious signs of hidden telemetry, but it did note that the project sets the panic handler to "abort", which I have strong opinions on... Presumably the OP wanted to avoid linking against libunwind to save a few kilobytes of binary size, but now you have a binary that immediately aborts and doesn't give the user a stacktrace of what just crashed. I would rather have a ~50 KiB larger binary if it means getting useful debug info during a panic. Additionally, if there are async tasks that panic, they can't be recovered to display a generic error message; instead the whole process just aborts.

hiAndrewQuinn•May 17, 2026

Hidden telemetry was my big concern, yes; the abort thing wasn't caught as a security thing by DeepSeek V4 Flash but it was mentioned by Claude 4.7 Opus (I wanted to compare and contrast here), and Flash brought it up later when I asked it about performance tuning.

`cargo add` tip is very helpful, I had a hunch this happened in my own Rust project and I think you just filled in the missing piece for me there.

vlovich123•May 17, 2026

To me panic=abort is much safer security as it means you’re unlikely to enter weird states due to incorrectly handled unwinding. The only attack vector is a DOS attack which is a short term thing that’s easily rectified.

gidellav•May 17, 2026

Hi, nice comment!

1. I had experience not only with wrong versions selected by the agents, but also weird crates (ex. choosing a crate with 10 github stars when a more complete and more supported one was available), reason why now I always choose the dependencies and then I let the agent work.

2. Yes, some of the provider code could be made using macros, I am just lazy... But thanks for the tip! I will save it for later.

3. No telemetry, and it can be checked thanks to the fact that there are no HTTP calls outside of the MCP implementation (via rmcp) and LLM connectors (via rig)

4. Yes, i set panic handler to 'abort', thinking that I would've get a nice size decrease: i yet have to experience a panic on this project, but I will revert it to default behavior if the binary size saving is really so small

5. While it is async, the entire project runs on one thread (as expressed in the main.rs with ```#[tokio::main(flavor = "current_thread")]```), as it allows for a nice ~8MB memory saving (so, 50% off) and no real performance loss, being such a simple tool.

---

P.S. Just switched back to default settings for panic handler

360MustangScope•May 16, 2026

Funny this comes out today. I was just about to start to write one in rust. It's amazing having opencode slowly leak memory and end up becoming 6gbs on a large project and then get slower and slower.

Will check this out! Seems cool!

gidellav•May 17, 2026

Yes! This project derived from an OOM killer activation that happened on my old laptop beacuse i had more than 2 opencode instances open together with Firefox...

khimaros•May 16, 2026

i built something with a similar philosophy here: https://github.com/khimaros/airun -- it is intended to be piped and redirected. it discovers skills, AGENTS and prompt templates from Claude Code, Pi.dev, OpenCode and others. no TUI, but does have a basic tool calling loop

$ airun -q -p 'output a shell command for linux to display the current time. output only the command with no other code fencing or prose' | airun -q -s 'review the provided shell command, determine if it is safe, run it only if it is safe, and then summarize the output from the command' --permissions-allow='bash:date *'

gidellav•May 17, 2026

While I think that the core philosohpy is the same, i'd like to ask: why adding features like Skills and prompt templates?

I personally decided to not implement Skills and instead using a prompt library approach, where certain .md are used to fully replace the system prompt, in order to allow for an approach similar to Skills with ~100 LoC dedicated to this system.

c-hendricks•May 17, 2026

Aren't skills fairly easy to share, and can contain more than one file?

desireco42•May 17, 2026

Prompts as well... he might be on to something here, can't say as I didn't try it yet

Skills are just prompts

hedgehog•May 17, 2026

Most of mine have code in them. That's most of the value.

afzalive•May 17, 2026

Isn't the key thing with skills that the description is used to match them from a prompt that doesn't mention them?

Would a prompt library do that too?

frio•May 17, 2026

Thanks, I've been tooling away in my spare time on my own version of this -- both to get a deeper understanding of agents (everyone suggests writing your own) and to help learn Rust. I'd like to retain `pi`'s configurability though, the ability to self-mutate and generate new tools is incredibly useful, particularly because I don't think any of these things should have access to arbitrary code execution through `bash` (of course, if they have access to, say, `edit` and `cargo run` they still have arbitrary code exec, but...) (so I tend to generate tools on the fly when I encounter something the no-bash agent needs to do).

gidellav•May 17, 2026

I actually though about this issue, but while Pi can have this script-like environment thanks to the fact that it's based on an interpreted language (TypeScript), Rust has its own limitation as a compiled language.

I decided to allow for customization in a different way:

1. The prompt library (~/.config/hypernova/prompts/) acts as a simpler alternative to Skills, with the built-in prompts that should replace superpowers + Claude's frontend-design

2. Compile-time features; things that might make the agent more bloated can be disabled when you decide to compile zerostack

3. Clean code; code that's short and easy to read, you can just throw zerostack on its own source code in order to build a custom fork if your necessity can't be satisfied. Good features could also be adopted by the main version.

4. Permission mode; as you can see in the README, there was lots of concern around the permission model, and I landed on a 4-mode system that goes from "Restrictive" (no commands) to "YOLO" (whatever the agent wants to do" + custom regex patterns for allow/ask/deny permission on 'bash' calls. In your case, you just need to run `zerostack -R` to force all tools to ask for permission.

(Also, there is a work-in-progress features for programmable agents, but that's yet to be announced)

frio•May 17, 2026

I've been trying to use `Deno` underneath `Rust` so that the tools can still be written in Typescript and thus self-mutated without the compilation step (but I can still try to do clever things with V8 Isolates or similar). It's been an ugly experiment so far; I'm vaguely thinking a simpler model would be to just define a binary "API" and run tools by exec-ing binaries.

gidellav•May 17, 2026

I have to be honest and tell you that try to load such an heavy runtime as a scripting layer is not a great idea; at the same time I can tell you that I am working on another Rust project where I also needed scripting, and after three attempts I landed on rhai (https://rhai.rs/) (https://rhai.rs/book).

You might find it nice for pretty much all use cases except for high-performance scripting (so, if you are not try to build the entire logic entirely in rhai, you are going to be fine).

frio•May 17, 2026

Yeah, it's been a bit of a dead end. I didn't want the heavy runtime but felt it was worth disproving after experimenting rather than ruling out off the bat. Even before getting it running, the dependency list alone was pretty discouraging, especially given the storm of supply chain attacks these days.

Rhai looks nice, I'll take a look, thanks! And good luck with Zerostack.

slopinthebag•May 17, 2026

I was just going to suggest rhai. It's simple enough LLMs can easily write it with a little context, and you control the entire API so you can sandbox effectively without needing to resort to hacks with a JS interpreter etc.

jswny•May 17, 2026

Why not WASM?

frio•May 17, 2026

Unfamiliarity and I believe it requires a compile step. I’m at least familiar with Typescript and Deno so being able to embed them was an appealing idea :)

BillStrong•May 17, 2026

Have you thought about Zig? If you limit it to CompTime, isn't that just a scripting language that happens to be compiled to binary?

praveer13•May 17, 2026

I’ve been doing the same thing in zig haha.

inciampati•May 17, 2026

> Integrated Ralph Wiggum loops: looping capabilities for long-horizon tasks

Imo, this shouldn't be embedded in the executor layer. Orchestration should handle this.

gidellav•May 17, 2026

I get you, but when I decided to follow a no-skills approach (as in, no agent's Skills used), I had to decide what:

1. Couldn't be built only using prompts

2. Couldn't be built only using MCP servers

3. Would have improved my UX experience (as i hope, your UX experience).

From those three conditions, I chose integrated git worktrees and loops

brcmthrowaway•May 17, 2026

!RemindMe 6 months

andrew_kwak•May 17, 2026

Been hearing a lot about Rust lately. I'm curious how Zerostack handles concurrency compared to more traditional Unix tools. Anyone tried it for something CPU-intensive?

usernametaken29•May 17, 2026

Now make it into an IntelliJ plugin which has proper access to the search index. I’ll pay for it. For Christs sake it’s insane JetBrains hasn’t figured this out yet

nullorempty•May 17, 2026

I think this is such an opportunity for JetBrains. I talked to them about this at AWS Re-Invent, strangely, they could really see how strong of a position they are in if only they paid attention to the right thing!

usernametaken29•May 17, 2026

They even have this already, Junie, but of course the plugin version cannot use BYOK….

parhamn•May 17, 2026

I (somewhat jokingly) wrote one recently too... https://github.com/pnegahdar/nano in under 200 lines. Repl, sessions, non-interactive, approvals, etc

The smarter the models get the less the harnesses matter (outside of devx).

Maybe one day I'll run it through swebech.

mgfist•May 17, 2026

I like it

freakynit•May 17, 2026

So freaking cool..in just 200 (190 actually) lines.

I also wrote one by myself last week (just for fun and learning). It works, including integration with configured mcpServers (like you do in most coding agents). Wrote about the whole step-by-step process and what is needed at what step and why: https://nb1t.sh/building-a-real-agent-step-by-step/

slopinthebag•May 17, 2026

I love these. Coding agents aren't very difficult to build, it's a TUI + tools + getting a nice agent loop working. The hardest part seems to be supporting all of the different providers and model quirks. What is interesting is seeing the experimentation: some provide tons of tools, others provide a single python interpreter and have the agent use tools via sandboxed python scripts, others use minimal tools and lean on bash. Personally I want a harness that gives a ton of control to the user to let them steer the LLM, less agent and more augmentation. Maybe I'll have to build it myself. If anyone has ideas, let me know.

afzalive•May 17, 2026

Pi.dev is pretty good in giving tons of control to the use and has extensions that you can easily build.

Although people are complaining about its RAM usage in this thread, I haven't bothered to check how much RAM it uses.

theusus•May 17, 2026

I absolutely like this. Pi becomes sluggish after installing a couple of extensions. I myself was trying to port Pi to Rust but it was consuming too much tokens.

Is there any API like Pi so that I can create extensions.

esperent•May 17, 2026

[delayed]

choopachups•May 17, 2026

dude, im actually in disbelief how long we put up with the pile of shit that is claude code.

noodletheworld•May 17, 2026

Are agent harnesses the new web framework?

Everyone wants to write one, building a new one is easy to start with, but tough to get to “prod ready” and the landscape is littered with failed attempts?

Certainly feels like it.

This is really good though; works well and at least has a clearly articulated raison d'être.

deagle50•May 17, 2026

Looks promising, is OpenAI subscription support planned?