Compact Representations for Arrays in Lua [pdf]

64 points 9 comments 4 days ago
marhee

I wonder whether, in practice, a Lua program that uses large (consecutive) arrays mostly stores values of the same type in them. At the very least it is a common use case: large arrays of only strings, only numbers, etc. Wouldn’t it make sense to (also) optimize for just this case with a flag and a single type tag? It is simple, and wouldn’t it optimize memory use for 98% of use cases?

tedunangst

This seems likely to create some inexplicable performance elbows where you have 1000 strings, but there's one code path that replaces one with a number, and now the whole array needs to be copied. Tracking that down won't be fun.
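
To make the trade-off concrete, here is a minimal sketch in C of the "flag plus single type tag" idea and of the elbow it creates. This is not Lua's implementation; every name in it (`Array`, `array_set`, `T_NUMBER`, and so on) is hypothetical. While the array is homogeneous it stores bare payloads and one shared tag; the first store of a different type forces an O(n) copy into a per-element tagged layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical value model: two types are enough to show the effect. */
typedef enum { T_NUMBER, T_STRING } Tag;
typedef union { double n; const char *s; } Payload;
typedef struct { Payload v; Tag tag; } Tagged;

typedef struct {
    int homogeneous;   /* flag: 1 while every slot shares `tag` */
    Tag tag;           /* the single shared type tag */
    size_t len;
    Payload *packed;   /* untagged storage, used while homogeneous */
    Tagged *mixed;     /* per-element tags, used after the first mismatch */
} Array;

/* Returns 0 on success, -1 if allocation fails. */
int array_init(Array *a, size_t len, Tag tag) {
    assert(a != NULL);
    a->homogeneous = 1;
    a->tag = tag;
    a->len = len;
    a->mixed = NULL;
    a->packed = calloc(len, sizeof(Payload));
    return a->packed ? 0 : -1;
}

/* Returns 0 on success, -1 on a bad index or allocation failure. */
int array_set(Array *a, size_t i, Tag tag, Payload v) {
    assert(a != NULL);
    if (i >= a->len) return -1;
    if (a->homogeneous && tag == a->tag) {
        a->packed[i] = v;              /* fast path: no tag stored */
        return 0;
    }
    if (a->homogeneous) {
        /* The elbow: one mismatched store copies the whole array. */
        Tagged *m = malloc(a->len * sizeof(Tagged));
        if (!m) return -1;
        for (size_t k = 0; k < a->len; k++) {
            m[k].v = a->packed[k];
            m[k].tag = a->tag;
        }
        free(a->packed);
        a->packed = NULL;
        a->mixed = m;
        a->homogeneous = 0;
    }
    a->mixed[i].v = v;
    a->mixed[i].tag = tag;
    return 0;
}

void array_free(Array *a) {
    free(a->packed);
    free(a->mixed);
}
```

With 1000 strings, every `array_set` with `T_STRING` is a plain store, but a single `array_set` with `T_NUMBER` pays for copying all 1000 slots, and the cost shows up far from the code that built the array.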

Jyaif

It makes a lot of sense, but then you have two code paths for tables.

The Lua folks want a simple codebase, so they (knowingly) leave a lot of performance on the table in favor of simplicity.

ufo

This optimization might land in the next Lua release. More specifically, the "Reflected Arrays" version (Figure 6).

https://github.com/lua/lua/blob/f71156744851701b5d5fabdda506...

kzrdude

It was published in September 2024, so it's relatively recent.

Jyaif

Jesus christ, 40% waste in arrays that can be solved by using `__attribute__((packed))`.

It is irresponsible of them not to advertise this as an option in luaconf.h.

sfpotter

Here's the rest of that paragraph for you:

"However, this attribute is a gcc extension not present in ISO C. Moreover, even in gcc it is not guaranteed to work [3]. As portability is a hallmark of Lua, this almost magical solution is a no-go."

ethan_smith

`__attribute__((packed))` wouldn't help here since the issue is about Lua's array/hash hybrid table design and memory allocation strategy, not C struct padding.

lifthrasiir

But it did help, just in a different way, by my reading of the paper [1]. So the OP is asking why this is not even offered as an option in supported environments, and I too think that is a good question to ask.

[1] "Hugo Gualandi reported that just adding the gcc attribute __attribute__((packed)) to the definition of the structure TValue reduces its size from 16 to 9 bytes, without any sensible difference in performance."
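
For anyone who wants to see the effect, here is a small sketch in C. It assumes GCC or Clang on a typical 64-bit target; `PlainValue` and `PackedValue` are hypothetical stand-ins loosely modeled on a value union plus a one-byte tag, not Lua's actual `TValue` definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload: 8 bytes on common 64-bit platforms. */
typedef union { void *p; double n; long long i; } Payload;

/* Default layout: the compiler pads the struct to the payload's alignment. */
typedef struct {
    Payload value;
    unsigned char tag;
} PlainValue;

/* GCC/Clang extension (not ISO C): drop the padding after the tag. */
typedef struct __attribute__((packed)) {
    Payload value;
    unsigned char tag;
} PackedValue;

/* Bytes occupied by an array of n elements of the given size. */
size_t array_bytes(size_t n, size_t elem_size) {
    return n * elem_size;
}

/* Sanity check of the layout assumptions made above. */
void check_layout(void) {
    assert(sizeof(PackedValue) == sizeof(Payload) + 1);
    assert(sizeof(PlainValue) > sizeof(PackedValue));
}
```

On such a target `sizeof(PlainValue)` is typically 16 and `sizeof(PackedValue)` is 9, which matches the figures in the quote; other ABIs may pad differently, which is part of the portability concern.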
