This seems great in concept, and totally infeasible.
But if anyone can do it, unicode seems like a great candidate.
Does anyone have reason for more optimism?
hobofan•Feb 16, 2026
Care to explain why you think it's infeasible? Then one could provide targeted counter-optimism ;)
I don't see what's infeasible about it. It doesn't seem too different from .po files (gettext catalogs) meshed with hooks for post-processing as one would see in e.g. Handlebars, both of which have individually found great adoption.
bmn__•Feb 16, 2026
> why you think it's infeasible?
GP based his opinion on the assumption that this spec is new and that no implementations of it exist.
zbraniecki•Feb 16, 2026
ICU4C and ICU4J have implementations. We also have a JS polyfill and will be working on ICU4X impl this quarter.
junon•Feb 16, 2026
Unicode consortium already manages a ton of language specs. If there's any group of folks I'd trust to understand languages (natural or otherwise), it's them.
Cthulhu_•Feb 16, 2026
This is the one. Think of all the "misconceptions developers have about X" lists; I trust Unicode to have encountered (if not written) all of them. The people behind Unicode are thorough.
I've been using this format for almost 10 years, and I only see increasing adoption. Why would I be pessimistic?
BoppreH•Feb 16, 2026
The meeting notes in the repo were a nice surprise. Overall it looked great, striking a good balance.
.input {$var :number maximumFractionDigits=0}
.local $var2 = {$var :number maximumFractionDigits=2}
.match $var2
0 {{The selector can apply a different function to {$var} for the purposes of selection}}
* {{A placeholder in a pattern can apply a different function to {$var :number maximumFractionDigits=3}}}
Oof, that's a programming language already. And new syntax to be inevitably iterated on. I feel like we have too many of those already, from Python f-strings to template engines.
I wish it'll at least stay small: no nesting, no plugins, no looping, no operators, no side effects or calls to external functions (see Log4J).
silvestrov•Feb 16, 2026
English has just singular and plural: one car, two cars, three cars (and zero cars).
Some languages have more variations. E.g. Czech, Slovene and Russian have 1, 2-4 and 5 as different cases.
Personally I think the syntax is too brittle. It looks too much like TeX code and it has the Lisp-like deal with lines ending with too many } braces.
I would separate it into two cases: simple strings with just simple interpolation, and then a fuller markup language, more like a simplified XML.
I often wonder this myself, this really should be a standard by now.
hobofan•Feb 16, 2026
I can't speak for the status quo, but for at least the first ~5 years (so until 3 years ago when I last attempted to use it), the JS implementation of Fluent was a mess. Constant issues with incomplete API, wrong TS typings (which at that point were external) and build/bundling issues to the point where we opted for a homebrew solution.
I imagine that I probably wasn't the only one driven away by that (and I gave it many attempts!).
creshal•Feb 16, 2026
The standard is, for better or worse, gettext; it's good enough that any attempt to replace it runs into the problem that people can't agree on how much better an alternative needs to be to be worth migrating to; so you get a constant churn that so far hasn't seen any clear winner.
Cthulhu_•Feb 16, 2026
Feels like it's That XKCD page; there were standards like gettext, then web development came along and a load of people (...present company included) rediscovered localization and pluralization through trial, error, half-building one's own localization library, then the JS world reinvented it, etc etc etc.
zbraniecki•Feb 16, 2026
We are targeting MF2.0 for inclusion in the JavaScript stdlib (ECMA-402).
And later maybe with its own format into DOM for DOM L10n.
hobofan•Feb 16, 2026
There seems to be a strong overlap of people behind both projects, so that likely explains the similarities.
Wow, the in-browser preview is excellent. I first assumed it was just a demonstration and appreciated it very much, but then I realized it was live-editable and was blown away.
bmn__•Feb 16, 2026
Looking for an expert who knows both libintl/Gettext and MessageFormat.
What is the equivalent of xgettext.pl, the file extension for the main catalog file `.po`, the __ function?
How does gender work (small example)? How does layering pt_BR on pt_PT work?
The site behind that link gives answers to only 2 out of 6 questions. If your goal was to promote and teach, then you have failed. If your goal was to demoralise the HN readers and grind the conversation to a stop, then you have succeeded.
jp1016•Feb 16, 2026
One practical thing I appreciated about MessageFormat is how it eliminates a bunch of conditional UI logic.
Which seems trivial in English, but gets messy once you support languages with multiple plural categories.
I wasn’t really aware of how nuanced plural rules are until I dug into ICU. The syntax looked intimidating at first, but it actually removes a lot of branching from application code.
I’ve been using an online ICU message editor (https://intlpull.com/tools/icu-message-editor) to experiment with plural/select cases and different locales, which helped me understand edge cases much faster than reading the spec alone.
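To make the "removes branching from application code" point concrete, here is a minimal Python sketch. It is a toy, not ICU and not any real library's API: `english_category`, `pick_variant` and `RESULTS` are hypothetical names invented for this illustration. It only mirrors the selection order ICU plural arguments use: an exact `=N` key first, then the plural category, then `other`.

```python
# Toy illustration only: not ICU, and not any real library's API.

def english_category(n: int) -> str:
    """English cardinal category for integers: 'one' for 1, else 'other'."""
    return "one" if n == 1 else "other"


def pick_variant(variants: dict[str, str], n: int) -> str:
    """Pick a variant: exact '=N' key first, then the plural category,
    then the mandatory 'other' fallback."""
    exact = f"={n}"
    if exact in variants:
        template = variants[exact]
    else:
        template = variants.get(english_category(n), variants["other"])
    # '#' stands for the number, as in ICU plural messages.
    return template.replace("#", str(n))


# The message is plain data that a translator could edit;
# the calling code has no per-count branches of its own.
RESULTS = {"=0": "No results", "one": "# result", "other": "# results"}

for n in (0, 1, 7):
    print(pick_variant(RESULTS, n))
```

For another locale, only the category function and the variant table would change; the call site stays the same.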
Gettext has everything, it just takes knowing five languages to understand what to use for
Sharlin•Feb 16, 2026
Yeah, some sort of pluralization support is pretty much the second most important feature in any message localization tool, right after the ability to substitute externally-defined strings in the first place. Even in a monolingual application, spamming plural formatting logic in application code isn't exactly the best practice.
iririririr•Feb 16, 2026
gettext has everything, plus a huge ecosystem, like tools to coordinate collaboration from thousands of contributors etc.
if alternatives don't start with a very strong case why gettext wasn't a good option, it's already a good indicator of not-invented-here syndrome.
moltonel•Feb 16, 2026
It's not hard to make a case against gettext, despite its maturity and large ecosystem.
IMHO pluralization is a prime example, with an API that only cleanly handles the English case, requires the developer to be aware of translation gotchas, and has honestly confusing documentation and format. Compare that to MessageFormat's pluralization example (https://github.com/unicode-org/message-format-wg/blob/main/s...) which is very easy to understand and fully in the translator's hands.
vsl•Feb 16, 2026
> IMHO pluralization is a prime example, with an API that only cleanly handles the English case
That’s not true at all? Gettext is functionally limited to source code being English (or alike). It handles all translation languages just fine, and competently so.
What it doesn’t have is MessageFormat’s gender selectors (useful) or formatting (arguably not really, strays from translations to locales and is better solvable with placeholders and locale-aware formatting code).
> fully in the translator's hands.
That is a problem that gettext doesn’t suffer from. You can’t reasonably expect translators to write correct DSL expressions.
moltonel•Feb 16, 2026
> Gettext is functionally limited to source code being English (or alike). It handles all translation languages just fine, and competently so.
The *ngettext() family of functions takes two strings (typically singular/plural) and relies on a language-wide expression to choose the variant (possibly more than 2 variants). There's no good reason for taking two strings; this should be handled in the language file, even without a DSL. Ngettext handling a single countable makes some corner cases awkward, like gendering a group with possibly mixed-gender elements. The Plural-Forms expression not being per-message means that, for example, even in English "none/one/many foo" has to be handled in code, and that a language with only a rare 3rd plural has to pay the complexity for all cases.
Arguably, those are all nitpicks, Gettext is adequate for most projects. But quality translations get cumbersome very quickly.
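A small sketch of the two points above, using Python's standard `gettext` module with no catalog loaded (`NullTranslations` applies the English-style "singular only when n == 1" rule). `describe` and `polish_plural_index` are hypothetical helper names; the second one transcribes by hand the Plural-Forms expression commonly given for Polish in gettext documentation, purely to show that it returns one index per language rather than per message.

```python
import gettext

# NullTranslations is the stdlib fallback when no catalog is loaded:
# its ngettext returns the first string only when n == 1.
t = gettext.NullTranslations()


def describe(n: int) -> str:
    # ngettext takes exactly two source strings, so a dedicated
    # "none" wording has to be a branch in application code.
    if n == 0:
        return "no files"
    return t.ngettext("%d file", "%d files", n) % n


def polish_plural_index(n: int) -> int:
    """Hand transcription of the usual Polish Plural-Forms expression:
    plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2)
    The same index is used for every message in the catalog."""
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


for n in (0, 1, 5):
    print(describe(n))
```

A translated catalog would supply one string per index for each plural message, and the `n == 0` branch would still live in the code.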
> You can’t reasonably expect translators to write correct DSL expressions.
This feels demeaning. Translators regularly have to check the source code, and often write templates; they're well able for a DSL like MessageFormat's, especially when it's always the same expressions for their language. It saves a trip to the bugtracker to get developers to massage their code into something translatable. You can't reasonably expect an English-speaking developer armed with ngettext to know (and prepare their code for) the subtleties of Gaelic numerals.
zbraniecki•Feb 16, 2026
No, gettext scales very badly, both vertically (larger systems) and horizontally (locales with rich grammatical forms like declensions etc.)
I checked the spec and don't really get that. Something should specify the formula for choosing the correct form (i.e. 1 for 21 in Slavic languages), and the format isn't any better compared to the gettext of 30 years ago.
gcr•Feb 16, 2026
This confused me too but the formula and rules for variants are specified by the configured language out-of-band, so there is support for this.
Let's take your example. In English, counting files looks like this:
You have {file_count, plural,
=0 {no files}
one {1 file}
other {# files}
}
In Polish, there are several possible variants depending on the count:
Masz {file_count, plural,
one {# plik}
few {# pliki}
other {# pliko'w}
}
The library (and your translators) know that in Polish, the `few` variant kicks in when `i%10 = 2..4 && i%100 != 12..14`, etc. I think the library just knows these rules for each language as part of the standard. Mozilla says that it was an explicit design goal to put "variant selection logic in the hands of localizers rather than developers".
The point is that it's supported, it simplifies developer logic, and your translators know how to work with it.
(Apologies if I got the above translation strings wrong, I don't speak Polish. Just working from the GNU gettext example.)
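As a rough illustration of what "the library just knows these rules" means, here is the quoted Polish `few` condition written out by hand in Python for non-negative integers. This is a sketch under assumptions, not the library's code: `polish_category`, `VARIANTS` and `format_files` are hypothetical names, the `many` label for the remaining integers follows the CLDR plural rules as commonly documented, and the lookup falls back to `other` when a category has no variant of its own, which is how the three-variant message above can still cover counts like 5.

```python
def polish_category(n: int) -> str:
    """Plural category for a non-negative integer (hand-written sketch)."""
    if n == 1:
        return "one"
    # The condition quoted above: i%10 = 2..4 && i%100 != 12..14
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    # Assumption: remaining integers fall in the 'many' category.
    return "many"


# The three variants from the example message; there is no 'many' entry.
VARIANTS = {"one": "# plik", "few": "# pliki", "other": "# plików"}


def format_files(n: int) -> str:
    # A category without its own variant falls back to 'other'.
    template = VARIANTS.get(polish_category(n), VARIANTS["other"])
    return "Masz " + template.replace("#", str(n))


for n in (1, 3, 5, 12, 22):
    print(n, polish_category(n), format_files(n))
```

In a real implementation this function would come from the library's locale data rather than from application or translation files, which is the trade-off discussed in the replies.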
npodbielski•Feb 16, 2026
usually it is ó instead of o' but otherwise very good :)
yorwba•Feb 16, 2026
"the library just knows these rules for each language as part of the standard" sounds great until you try to support a small minority language that the library just doesn't know about and then you're left trying to hack around it by pretending that it's actually a regional variety of another language with similar plural rules.
AFAIK, unlike gettext, MessageFormat doesn't allow you to specify a formula for the plural forms as part of the localization data, so the variant selection logic ended up in the hands of library developers rather than localizers or application developers.
And the standard does get updated occasionally, which can also lead to bugs with localization data written against another version of the standard: https://github.com/cakephp/cakephp/issues/18740
Muromec•Feb 16, 2026
>This confused me too but the formula and rules for variants are specified by the configured language out-of-band, so there is support for this.
Well, making it out-of-band sure is one way to prevent lazy people from doing eval on plural forms from the po file. I hope the library is actually good then.
Seems like to get it right for every use case / language, you would need functions to translate phrases - so switch statements may be a valid solution. The number of text elements needed for pagination, CRUD operations and similar UI elements should be finite :)
iririririr•Feb 16, 2026
that's a lazy feature. dealing with this on the front end is the right thing so you can have rich empty states anyway.
strogonoff•Feb 16, 2026
Does anyone know the ETA of MessageFormat 2.0? I have been aware of the effort since pre-COVID times. I recall that some of the developers behind Mozilla Fluent have been among the people working on MF 2.0, and it’d be great to know whether Fluent and ICU MF are going to be interoperable in the foreseeable future.
Vinnl•Feb 16, 2026
IIRC, the goal was for Fluent to have a converter or something to be able to work with MessageFormat 2.0, but I don't quite remember where I heard that. My approach has just been to stick to Fluent for now.
zbraniecki•Feb 16, 2026
Yep. Mozilla is planning an auto converter from Fluent to MF2.0 once we stabilize it.
revetkn•Feb 16, 2026
My project Lokalized attempts to solve many of these complex plural/gender/ordinal/etc. rules with a tiny expression language:
Are there any formal test suites to check and compare the various localization libraries with each other? There are a lot of languages and language-specific rules and exceptions to consider, after all.
Brosper•Feb 16, 2026
I discovered it working in https://tolgee.io but I am kind of surprised it boomed today :D
What I can say is that it's a well-maintained format but also kinda hard to learn.
alexchamberlain•Feb 16, 2026
Apologies if this is obvious and I missed it. Does this define a way to store the strings in various languages?
Cthulhu_•Feb 16, 2026
I think this is just the format and specification itself, language selection and file storage and the like will depend on an implementing library. The i18next version for example (bizarrely) puts the whole string in a JSON key, but to be honest I think this is a bad example: https://github.com/i18next/i18next-icu?tab=readme-ov-file#mo...
I know these libs are primarily for devs to localize their apps but can they be used also with untrusted inputs, both message strings and vars?
ddevnyc•Feb 16, 2026
One thing I would really appreciate in this repository (and many like it) would be a simple, short, snippet of code that shows a typical use case of whatever the repo is selling me. Life's too short to dig around in the guts of the repository to find stuff like this out, it should be front and center. I want to know about the ergonomics and hackability of what I'm about to delve into.
I mean they have hieroglyphs, some of which have plurals: https://www.unicode.org/charts/nameslist/n_13000.html
There is more example code at https://github.com/unicode-org/message-format-wg/blob/main/d...
However, ideally / in most cases it isn't.
https://projectfluent.org/
I wonder why it hasn't been adopted more widely.
It seems the last edit of the page was in 2019, so I'm not sure how up to date it is.
[1] https://messageformat.unicode.org/
What is a compelling reason to switch?
Lmk if you have further questions!
I used to write switch/if blocks for:
• 0 rows → “No results”
• 1 row → “1 result”
• n rows → “{n} results”
We (authors of Fluent and collaborators on MessageFormat 2.0) wrote this explainer which you may find informative - https://github.com/projectfluent/fluent/wiki/Fluent-vs-gette...
See https://www.unicode.org/cldr/charts/48/supplemental/language...
(Fluent informed much of the design of MessageFormat 2.)
I18n / l10n is full of things like this, important details that couldn’t be more boring or fiddly to implement.
How long till we just have a LLM do it on the fly?
https://lokalized.com
That being said your project looks very cool!
https://github.com/Frizlab/XibLoc/blob/e85a5179bdd93e0174731...