153 points by thunderbongApr 4, 2026

26 Comments

sgbealApr 4, 2026
> json_extract returns native types. json_extract(data, '$.id') returns an integer if the value was stored as a number. Comparing it to a string silently fails. Always CAST(json_extract(...) AS TEXT) when you need string comparison.

More simply:

    sqlite> select typeof('{a:1}'->>'a') ;
    ╭──────────────────────╮
    │ typeof('{a:1}'->>... │
    ╞══════════════════════╡
    │ integer              │
    ╰──────────────────────╯
vs:

    sqlite> select typeof('{a:1}'->'a') ;
    ╭──────────────────────╮
    │ typeof('{a:1}'->'a') │
    ╞══════════════════════╡
    │ text                 │
    ╰──────────────────────╯
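
The same behaviour can be checked from application code. This is a minimal sketch using Python's built-in sqlite3 module, assuming the linked SQLite build includes the JSON functions; the sample document and the `scalar` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def scalar(sql, params=()):
    """Run a query and return the single value it produces."""
    return conn.execute(sql, params).fetchone()[0]

doc = '{"a": 1}'  # hypothetical sample document

# json_extract yields an INTEGER for a JSON number...
extracted_type = scalar("SELECT typeof(json_extract(?, '$.a'))", (doc,))

# ...so comparing it to the string '1' is simply false, not an error.
raw_match = scalar("SELECT json_extract(?, '$.a') = '1'", (doc,))

# Casting to TEXT first makes the string comparison succeed.
cast_match = scalar("SELECT CAST(json_extract(?, '$.a') AS TEXT) = '1'", (doc,))

print(extracted_type, raw_match, cast_match)
```

The uncast comparison does not raise; it just evaluates to false, which is why the mismatch is easy to miss.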
yokuzeApr 4, 2026
> The technical fix was embarrassingly simple: stop pushing to main every ten minutes.

Wait, you push straight to main?

> We added a rule — batch related changes, avoid rapid-fire pushes. It's in our CLAUDE.md (the governance file that all our AI agents follow):

> Avoid rapid-fire pushes to main — 11 pushes in 2h caused overlapping Kamal deploys with concurrent SQLite access.

Wait, you let _Claude_ push your e-commerce code straight to main which immediately results in a production deploy?

crabmusketApr 4, 2026
Patient: doctor, my app loses data when I deploy twice during a 10 minute interval!

Doctor: simply do not do that

pavel_lishinApr 4, 2026
Doctor: solution is simple, stop letting that stupid clown Pagliacci define how you do your work!

Patient: but doctor,

pjc50Apr 7, 2026
pAIgliacci: as a large language model, I am unable to experience live comedy.
rcakebreadApr 7, 2026
tensegristApr 7, 2026
i hate to be so blunt but look around the site and then tell me you're surprised
bombcarApr 7, 2026
Hey, Apple still takes their store down during product launches!
pstuartApr 7, 2026
I assumed that it was to ensure that the announced products were revealed in a controlled manner rather than because they aren't able to do updates to their product listings as a regular thing.
bombcarApr 7, 2026
My reading of the tea leaves is it started out as the latter and continues as the former as part of the “mystique”.
xnorswapApr 7, 2026
I'm fairly confident they let it write the blog post too.
simonwApr 7, 2026
"Not as a proof of concept. Not for a side project with three users. A real store" - suggestion for human writers, don't use "not X, not Y" - it carries that LLM smell whether or not you used an LLM.
xnorswapApr 7, 2026
And that's just the opening paragraph, the full text is rounded off with:

"The constraint is real: one server, and careful deploy pacing."

Another strong LLM smell, "The <X> is real", nicely bookends an obviously generated blog-post.

These335Apr 7, 2026
You're absolutely right, this was an AI post
chasilApr 7, 2026
This is the actual problem:

"Kamal runs blue-green deploys — it starts a new container, health-checks it, then stops the old one. During the switchover, both containers are running. Both mount ultrathink_storage. Both have the SQLite files open."

WAL mode requires shared access to System V IPC mapped memory. This is unlikely to work across containers.

In case anybody needs a refresher:

https://en.wikipedia.org/wiki/Shared_memory

https://en.wikipedia.org/wiki/CB_UNIX

https://www.ibm.com/docs/en/aix/7.1.0?topic=operations-syste...

Retr0idApr 7, 2026
> This is unlikely to work across containers.

Why not?

simonwApr 7, 2026
Thanks for this, the anecdote with the lost data was very concerning to me.

I think you're exactly right about the WAL shared memory not crossing the container boundary. EDIT: It looks like WAL works fine across Docker boundaries, see https://news.ycombinator.com/item?id=47637353#47677163

I don't know much about Kamal but I'd look into ways of "pausing" traffic during a deploy - the trick where a proxy pretends that a request is taking another second to finish when it's actually held in the proxy while the two containers switch over.

From https://kamal-deploy.org/docs/upgrading/proxy-changes/ it looks like Kamal 2's new proxy doesn't have this yet, they list "Pausing requests" as "coming soon".

chasilApr 7, 2026
You might consider taking the database(s) out of WAL mode during a migration.

That would eliminate the need for shared memory.

Retr0idApr 7, 2026
> I think you're exactly right about the WAL shared memory not crossing the container boundary.

I don't, fwiw (so long as all containers are bind mounting the same underlying fs).

hedoraApr 7, 2026
It would explain the corruption:

https://sqlite.org/wal.html

The containers would need to use a path on a shared FS to set up the SHM handle, and, even then, this sounds like the sort of thing you could probably break via arcane misconfiguration.

I agree shm should work in principle though.

PunchyHamsterApr 7, 2026
Not how SQLite works (any more)

> The wal-index is implemented using an ordinary file that is mmapped for robustness. Early (pre-release) implementations of WAL mode stored the wal-index in volatile shared-memory, such as files created in /dev/shm on Linux or /tmp on other unix systems. The problem with that approach is that processes with a different root directory (changed via chroot) will see different files and hence use different shared memory areas, leading to database corruption. Other methods for creating nameless shared memory blocks are not portable across the various flavors of unix. And we could not find any method to create nameless shared memory blocks on windows. The only way we have found to guarantee that all processes accessing the same database file use the same shared memory is to create the shared memory by mmapping a file in the same directory as the database itself.
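
A quick way to see the files the quote describes, as a rough sketch: switch a scratch database to WAL mode and list its directory while a connection is still open. The temporary directory and the `demo.sqlite3` name are placeholders, and a local (non-network) filesystem is assumed.

```python
import os
import sqlite3
import tempfile

# Hypothetical scratch location; any local directory would do.
workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "demo.sqlite3")

# isolation_level=None keeps the connection in autocommit mode.
conn = sqlite3.connect(db_path, isolation_level=None)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("INSERT INTO t VALUES (1)")

# While a connection is open, the WAL and its index sit beside the database.
siblings = sorted(os.listdir(workdir))
print(mode, siblings)

conn.close()
```

The `-shm` file is the mmapped wal-index, so any process that opens the same database path on the same filesystem maps the same file.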

simonwApr 7, 2026
I just tried an experiment and you're right, WAL mode worked fine across two Docker containers running on the same (macOS) host: https://github.com/simonw/research/tree/main/sqlite-wal-dock...

Could the two containers in the OP have been running on separate filesystems, perhaps?

Retr0idApr 7, 2026
Perhaps they're using NFS or something - which would give them issues regardless of container boundaries.
jmullApr 7, 2026
I dug into this limitation a bit around a year ago on AWS, using a sqlite db stored on an EFS volume (I think it was EFS -- relying on memory here) and lambda clients.

Although my tests were slamming the db with reads and writes, I didn't induce a bad read or write using WAL.

But I wouldn't use experimental results to override what the sqlite people are saying. I (and you) probably just didn't happen to hit the right access pattern.

hedoraApr 7, 2026
Pausing requests then running two sqlites momentarily probably won’t prevent corruption. It might make it less likely and harder to catch in testing.

The easiest approach is to kill sqlite, then start the new one. I’d use a unix lockfile as a last-resort mechanism (assuming the container environment doesn’t somehow break those).
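
A lockfile guard along those lines might look like the sketch below. It assumes a Unix host where `flock` works on the volume both containers share; the lock path and the `try_lock` helper are hypothetical.

```python
import fcntl
import os
import tempfile

def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle

lock_path = os.path.join(tempfile.mkdtemp(), "app.lock")  # hypothetical path

first = try_lock(lock_path)    # the "old" process takes the lock
second = try_lock(lock_path)   # the "new" process is refused while it is held

first.close()                  # closing the file releases the lock
third = try_lock(lock_path)    # now the lock can be taken
```

The new container would refuse to open the database until `try_lock` succeeds, which only helps if both containers see the same lock file.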

simonwApr 7, 2026
I'm saying you pause requests, shut down one of the SQLite containers, start up the other one and un-pause.
gcrApr 7, 2026
The SQLite documentation says in strong terms not to do this. https://sqlite.org/howtocorrupt.html#_filesystems_with_broke...

See more: https://sqlite.org/wal.html#concurrency

Retr0idApr 7, 2026
They tell you to use a proper FS, which is largely orthogonal to containerization.
jmullApr 7, 2026
WAL relies on shared memory, so while a proper FS is necessary, it isn't going to help in this case.
fauigerzigerkApr 7, 2026
Why does it not help if both containers can mmap the same -shm file?
merbApr 7, 2026
btw nfs that is mentioned here is fine in sync mode. However that is slow.
PunchyHamsterApr 7, 2026
> WAL mode requires shared access to System V IPC mapped memory.

Incorrect. It requires access to mmap()

"The wal-index is implemented using an ordinary file that is mmapped for robustness. Early (pre-release) implementations of WAL mode stored the wal-index in volatile shared-memory, such as files created in /dev/shm on Linux or /tmp on other unix systems. The problem with that approach is that processes with a different root directory (changed via chroot) will see different files and hence use different shared memory areas, leading to database corruption."

> This is unlikely to work across containers.

I'd imagine sqlite code would fail if that was the case; in the case of k8s at least, mounting the same storage to 2 containers in most configurations causes K8S to co-locate both pods on the same node, so it should be fine.

It is far more likely they just fucked up the code and lost data that way...

voidfuncApr 7, 2026
Ooh new historical Unix variant I had never heard of.. neat!
chasilApr 7, 2026
AIX is still supported and sold, so quite current?

Some that I used that are gone... Ultrix (MIPS), Clix, Irix, SunOS 4, SCO OpenServer, TI System V.

https://en.wikipedia.org/wiki/Ultrix

https://en.wikipedia.org/wiki/Intergraph

nxobjectApr 7, 2026
NeXTstep? (Leaving aside fun spitballing about whether Tahoe is morally OPENSTEP 26, and whether it was NeXT that actually bought Apple for negative $400 million...)
littlestymaarApr 7, 2026
> Wait, you let _Claude_ push your e-commerce code straight to main which immediately results in a production deploy?

Yikes. Thank you I'm not going to read “Lessons learned” by someone this careless.

66yatmanApr 7, 2026
The issue wasn’t caused by the AI but by their lack of architectural knowledge
burnt-resistorApr 7, 2026
I suspect they don't wear helmets or seatbelts either. Sigh. The "I'm so proud and ignorant of unnecessarily risky behaviors" meme is tiring.

The Meta dev model, where reviewed diffs merge into main (rebase style) after automated tests run, is pretty good.

Also, staging and canary, gradual, exponential prod deployment/rollback approaches help derisk change too.

Finally, have real, tested backups and restore processes (not replicated copies) and ability to rollback.

leosanchezApr 4, 2026
> Backups are cp production.sqlite3 backup.sqlite3

I use gobackup[0] as another container in my compose.yml file, which can back up to multiple locations.

[0]: https://gobackup.github.io/

hedoraApr 7, 2026
Does cp actually work on live sqlite files? I wouldn’t expect it to, since cp does not create a crash-consistent snapshot.
sgbealApr 7, 2026
> Does cp actually work on live sqlite files? I wouldn’t expect it to, since cp does not create a crash-consistent snapshot.

cp "works" but it has a very strong possibility of creating a corrupt copy (the more active the db, the higher the chance of corruption). Anyone using "cp" for that purpose does not have a reliable backup.

sqlite3_rsync and SQLite's "vacuum into" exist to safely create backups of live databases.
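
As a rough illustration of the "vacuum into" route, here is a sketch in Python, assuming SQLite 3.27 or newer (the release that introduced `VACUUM INTO`). The file names and the `orders` table are placeholders.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "production.sqlite3")   # hypothetical names
snapshot_path = os.path.join(workdir, "snapshot.sqlite3")

live = sqlite3.connect(live_path, isolation_level=None)
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
live.execute("INSERT INTO orders (total) VALUES (4200)")

# VACUUM INTO writes a transactionally consistent copy to a new file.
# The target file must not already exist.
live.execute("VACUUM INTO ?", (snapshot_path,))

snapshot = sqlite3.connect(snapshot_path)
copied = snapshot.execute("SELECT total FROM orders").fetchall()
print(copied)
```

Unlike cp, the copy is taken inside a read transaction, so writers on other connections cannot leave it half-updated.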

faangguyindiaApr 4, 2026
I have a busy app; I just deploy to a canary and use the load balancer to move 5% of traffic to it. I observe how it reacts and then roll out the canary changes to all.

How hard and complex is it to roll out Postgres?

pezh0reApr 4, 2026
Not hard at all - geerlingguy has a great Ansible role and there are a metric crapton of guides pre-AI/2022 that cover gardening.
cadamsdotcomApr 4, 2026
The fix appears to be nicely asking the forgetful, unreliable agent to please (very closely, pretty please!) follow the deploy instructions (and also please never hallucinate or mess up, because statistics tells us an entity with no long term memory and no incentive to get everything right will do the job right 99.99999999% of the time, which is good enough to run an eshop) and not deploy too often per hour.

With one simple instruction the system (99.9999% of the time) gains the handy property that “only” two processes end up with the database files open at once.

Thanks for the vibes!

devmorApr 7, 2026
I have to work with agents as a part of my job and the very first thing I did when writing MCP tools for my workflow was to ensure they were read only or had a deterministic, hardcoded stopgap that evaluates the output.

I do not understand the level of carelessness and lack of thinking displayed in the OP.

mywittynameApr 7, 2026
Even just having the agent write scripts to disk and run those works wonders. It keeps the agent from having to rebuild a script for the same tasks, etc.
devmorApr 7, 2026
That too! Every time the agent does something I didn't intend, I end up making a tool or process guidance to prevent it from happening again. Not just add "don't do that" to the context.
politelemonApr 4, 2026
> embarrassingly simple

This is becoming the new overused LLM goto expression for describing basic concepts.

NatfanApr 4, 2026
llm generated article.

please consider writing it yourself. quirks in human writing are infinitely more interesting than a next-token-predicted 500 word piece

NewsaHackOApr 7, 2026
But then how would they get people to buy their $99 AI CEO package?
pullshark91Apr 7, 2026
Huh, and here I thought it was a joke...
NewsaHackOApr 7, 2026
Maybe it is, didn't really look into it.
jmullApr 4, 2026
Redis, four dbs, container orchestration for a site of this modest scope… generated blog posts.

Our AI future is a lot less grand than I expected.

ramon156Apr 7, 2026
How else will you get all those resume entries! (/j)
add-sub-mul-divApr 7, 2026
Ironically, AI de-skilling results in a robust-sounding resume.
jszymborskiApr 4, 2026
The LLM prose is a grating read. I promise, you'd do a better job yourself.
littlestymaarApr 7, 2026
Given how dumb their workflow is (let Claude Code push directly to production without supervision) I'm not so sure.
infamiaApr 4, 2026
SQLite has a ".backup" command that you should always use to back up a SQLite DB. You're risking data loss/corruption using "cp" to back up your database as prescribed in the article.

https://sqlite.org/cli.html#special_commands_to_sqlite3_dot_...
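
For anyone doing this from application code rather than the CLI, Python's sqlite3 module exposes the same online backup API through `Connection.backup`. A minimal sketch, with made-up file names and a made-up `orders` table:

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "production.sqlite3")  # hypothetical names
backup_path = os.path.join(workdir, "backup.sqlite3")

source = sqlite3.connect(source_path)
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
source.execute("INSERT INTO orders (total) VALUES (4200)")
source.commit()

# Connection.backup copies page by page through SQLite's online backup API,
# so the result stays consistent even if other connections are writing.
target = sqlite3.connect(backup_path)
source.backup(target)
target.close()

# Reopen the copy and verify it before trusting it as a backup.
check = sqlite3.connect(backup_path)
integrity = check.execute("PRAGMA integrity_check").fetchone()[0]
rows = check.execute("SELECT total FROM orders").fetchall()
print(integrity, rows)
```

Running `PRAGMA integrity_check` on the copy is cheap insurance; a backup that has never been opened is only a guess.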

anonzzziesApr 7, 2026
Yeah, using cp to back up sqlite is a very bad idea. And yet, unless you know this, this is what Claude etc will implement for you. Every friggin' time.
chasilApr 7, 2026
It's fine if you run the equivalent of "init 1" first.

Does your OS have a single-user mode?

BartjeDApr 7, 2026
The bottom part of the article mentions they use .backup - did they add that later or did you miss it?
BartjeDApr 7, 2026
The post now says they changed it due to feedback from Hacker news. All good.
qingcharlesApr 7, 2026
"I know about the .backup command, there's no way I'm using cp to back up the SQLite db from production."

Oh.

Guess I know what I'm fixing before lunch. Thank you :)

warmwafflesApr 7, 2026
Yes, especially if you are using a WAL.
rogerbinnsApr 7, 2026
Related, there is also sqlite3_rsync that lets you copy a live database to another (optionally) live database, where either can be on the network, accessed via ssh. A snapshot of the origin is used so writes can continue happening while the sqlite3_rsync is running. Only the differences are copied. The documentation is thorough:

https://sqlite.org/rsync.html

crazygringoApr 7, 2026
> Would We Choose SQLite Again? Yes. For a single-server deployment with moderate write volume, SQLite eliminates an entire category of infrastructure complexity. No connection pool tuning. No database server upgrades. No replication lag.

These are weird reasons. You can just install Postgres or MySQL locally too. Connection pool tuning certainly isn't anything you have to worry about for a moderate write volume. You don't ever need to upgrade the database if you don't want to, since you're not publicly exposing it. There's obviously no replication lag if you're not replicating, which you wouldn't be with a single server.

The reason you don't usually choose SQLite for the web is future-proofing. If you're totally sure you'll always stay single-server forever, then sure, go for it. But if there's even a tiny chance you'll ever need to expand to multiple web servers, then you'll wish you'd chosen a client-server database from the start. And again, you can run Postgres/MySQL locally, on even the tiniest cheapest VPS, basically just as easily as using SQLite.

xnorswapApr 7, 2026
Yeah, it's weird "they" don't consider any middle ground between SQLite and replicated postgres cluster.

Locally running database servers are massively underrated as a working technology for smaller sites. You can even easily replicate it to another server for resiliency while keeping the local performance.

talkingtabApr 7, 2026
This. Spinning up Postgresql is easy once you know how. Just as SQLITE3 is easy once you know how. But I can find no benefit from not just learning postgres the first time around.
kaibeeApr 7, 2026
They're using AI Agents to do it in either case and using docker. There was no reason to choose SQLite.
kaibeeApr 7, 2026
Yeah, a PG Docker container is basically magic. I too went down a rabbit-hole of trying to set up a write-heavy SQLite thing because my job is still using CentOS6 on their AWS cluster (don't ask). Once I finally got enough political capital to get my own EC2 box I could put a PG docker container on, so much nonsense I was doing just evaporated.
NewEntryHNApr 7, 2026
It's a spectrum. Installing Postgres locally is not 100% future-proofing since you'll still need to migrate your local Postgres to a central Postgres. Using SQLite is not 0% future-proofing since it's still using the SQL standard.

If the only argument for a piece of tech in comparison to another one is "future-proofing", that's pretty much acknowledging the other one is simpler to set up and maintain.

crazygringoApr 7, 2026
> It's a spectrum.

For web servers specifically, no, SQLite is not generally part of that spectrum. That makes as much sense as saying that in a kitchen, you want a spectrum of knives from Swiss Army Knives to chef's knives. No -- Swiss Army Knives are not part of the spectrum. For web servers, you do have a wide spectrum of database options from single servers to clusters to multi-region clusters, along with many other choices. But SQLite is not generally part of that spectrum, because it's not client-server.

> since you'll still need to migrate your local Postgres to a central Postgres

No you don't. You leave your DB in-place and turn off the web server part. Or even if you do want to migrate to something beefier when needed, it's basically as easy as copying over a directory. It's nothing compared to migrating from SQLite to Postgres.

> since it's still using the SQL standard.

No, every variant of SQL is different. You'll generally need to review every single query to check what needs rewriting. Features in one database work differently from in another. Most of the basic concepts are the same, and the basic syntax is the same, but the intermediate and advanced concepts can have both different features and different syntax. Not to mention sometimes wildly different performance that needs to be re-analyzed.

> that's pretty much acknowledging the other one is simpler to set up and maintain.

No it's not. What logic led you there...? They're basically equally simple to set up and maintain, but one also scales while the other doesn't. That's the point.

The main advantage of SQLite has nothing to do with setup and maintenance, but rather the fact that it is file-based and can be integrated into the binary of other applications, which makes it amazing for locally embedded databases used by user-installed applications. But these aren't advantages when you're running a server. And it becomes a problem when you need to scale to multiple webservers.

pullshark91Apr 7, 2026
OMG, you just killed it.
runakoApr 7, 2026
Have run PG, MySQL, and SQLite locally for production sites. Backups are much more straightforward for SQLite. They are running Kamal, which means "just install Postgres" would also likely mean running PG in a container, which has its own idiosyncrasies.

SQLite is not a terrible choice here.

crazygringoApr 7, 2026
> Backups are much more straightforward for SQLite.

Not sure how? All of them can be backed up with a single command. But if you want live backups (replication) as opposed to daily or hourly, SQLite is the only one that doesn't support that.

nop_slideApr 7, 2026
I still haven't figured out a good way to do blue/green sqlite deploys on fly.io. Is this just a limitation of using sqlite or using Fly? I've been very happy with sqlite otherwise, rather unsure how to do a cutover to a new instance.

Anyone have some docs on how to cutover gracefully with sqlite on other providers?

wolttamApr 7, 2026
You accept downtime. That's the limitation of SQLite.

Or you use some distributed SQLite tool like rqlite, etc

nop_slideApr 7, 2026
I'm personally fine with a little bit of downtime for my particular small app. I'm just surprised there's not a more detailed story around deploying sqlite in a high availability prod environment given its increased popularity and coverage over the last few years. Especially surprising with Rails (my stack) going full "sqlite-first".
jmullApr 7, 2026
I don't know if it's just me, but this whole post seems to have time traveled forward from about 3-4 days ago.

It's not just a repost. The thread includes a comment I made at the time which is now from "1 hour ago".

Makes me wonder if it's an honest bug or someone has hacked the hacker news front page to sell their t-shirts, mugs, and AI starter kits.

Retr0idApr 7, 2026
It's an artefact of the "second chance pool" mechanism.
worksonmineApr 7, 2026
Interesting choice to change the time of the comment, a deja-vu can be weird enough without staring at a comment with a recent timestamp.
trelliumDApr 7, 2026
Could have used Firebird embedded: also a simple deployment like SQLite, but with better concurrency and a more complete system, and a tad faster.
NicoJuicyApr 7, 2026
If Nico send him an email. The AI CEO should take his offer.
mt42orApr 7, 2026
NIH syndrome, almost mental health issues.
jp0001Apr 7, 2026
I took three weeks off from tech, read books from last century, and travelled Europe. Coming back, reading LLM generated content and code feels like nails on a chalkboard. Taste, it does not have taste.
PunchyHamsterApr 7, 2026
It is so tiring...
literallyroyApr 7, 2026
It’s strange how easy it is to spot.
66yatmanApr 7, 2026
Just use a 4gb server and install Postgres
PunchyHamsterApr 7, 2026
> Yes. For a single-server deployment with moderate write volume, SQLite eliminates an entire category of infrastructure complexity. No connection pool tuning. No database server upgrades. No replication lag.

None of these is needed if you run sqlite sized workloads...

I like SQLite but right tools for right jobs... though the data loss is most likely a code bug

MagicMoonlightApr 7, 2026
Slopcoded article for a Slopcoded website
heikkilevantoApr 7, 2026
I use SQLite for a small hobby project, fine for that. Wanted to read the article to see why I should not, but it attacked me with a "subscribe" popup, so I stopped there. The comments here seem to be based on daydreaming about scaling to a lot of users who need 24/7 uptime, which is not always the case.
kristiandupontApr 7, 2026
SQLite is a rock solid piece of software that offers a great value prop: in-process database. For locally running apps (desktop or mobile), this makes perfect sense.

However, I genuinely don't see the appeal when you are in a client/server environment. Spinning up Postgres via a container is a one-liner and equally simple for tests (via testcontainers or pglite). The "simple" type system of SQLite feels like nothing but a limitation to me.

adobrawyApr 7, 2026
If the problem is excessive deployments via GitHub Actions, why not use concurrency control on GitHub Actions ( https://docs.github.com/en/actions/how-tos/write-workflows/c... ) instead of relying on agent randomness and the hope that it won't make the same mistake again? Am I missing something?
rienbdjApr 7, 2026
A well designed system shouldn’t drop orders?

If you perform at least once processing then use Stripe idempotency keys you avoid such issues?
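
The idea can be sketched locally without touching Stripe at all: give each order a key, and let a uniqueness constraint turn a redelivered message into a no-op. Everything below (table, column names, the `record_order` helper, the key format) is hypothetical and only illustrates the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " idempotency_key TEXT PRIMARY KEY,"
    " amount_cents INTEGER NOT NULL)"
)

def record_order(key, amount_cents):
    """Insert the order once; a redelivery with the same key changes nothing."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO orders (idempotency_key, amount_cents)"
            " VALUES (?, ?)",
            (key, amount_cents),
        )
    return cur.rowcount == 1  # True only for the first delivery

first = record_order("order-17", 4200)   # first delivery is stored
retry = record_order("order-17", 4200)   # at-least-once retry is ignored
count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, retry, count)
```

With at-least-once delivery the sender can retry freely, because the receiver decides whether the work has already been done.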

mattrighettiApr 7, 2026
I see tons of articles like this, and I have no doubt sqlite proved to be a great piece of software in production environments, but what I rarely find discussed is that we lack tools that enable you to access and _maintain_ SQLite databases.

It's so convenient to just open Datagrip and have a look at all my PostgreSQL instances; that's not possible with sqlite AFAIK (not even SSH tunnelling?). If something goes wrong, you have to SSH into the machine and use raw SQL. I know there are some cool front-end interfaces to inspect the db but it requires more setup than you'd expect.

I think that most people give up on sqlite for this reason and not because of its performance.

simonwApr 7, 2026
I have a project to help with that:

    uvx datasette data.db

That starts a web app on port 8001 that looks like this:

https://latest.datasette.io/fixtures

siruwastakenApr 7, 2026
Am I the only one finding this article highly suspect? It seems like the errors made are so basic, i.e. using the wrong SQL dialect for the db system in use, and their orders were apparently only at 17?