This post’s conclusions are odd. It has a bunch of extensive benchmarks showing that zstd is by far the worst performer across every metric except a modest increase in compression ratio, and then concludes that zstd is the best choice. Unless I’m missing something in the data.
In the first benchmark it gets a ratio of 4 instead of 2.7, fitting 36-40% more data for 75% more CPU. That looks great.
The next two show it fitting 20% more data at 2-3x the CPU, which is a tougher tradeoff but still useful in a lot of situations.
The rest of the post analyzes the CPU cost in more detail, and yes, zstd is worse in every subcategory of that. But the increase in compression ratio is quite valuable. The conclusion says it "provides the highest compression ratio while still maintaining acceptable speeds", and that's correct. If you care about compression ratio, strongly consider zstd.
Not really: it goes through data so fast that disk speed is at worst the same, and the CPU overhead is tiny. In other words, it's not fast while saturating the CPU; it's fast while consuming a couple percent of it.
Technically, sure, you're correct, but the actual overhead of lz4 was more or less at the noise floor of everything else going on on the system, to the point that I think "lz4, without thought or analysis" is the best advice, always.
Unless you have a really specialized use case, the additional compression from other algorithms isn't at all worth the performance penalty, in my opinion.
But for a VPS, where CPU usage is extremely low and RAM is expensive, it might make sense to sacrifice a little performance for more DB cache. Can't say without more context.
It's only an alternative if you have a backing swap device. zram does not have that requirement, so (aside from using no compression at all) it's basically the only solution for some scenarios, e.g. using entire disk(s) for ZFS.
Compression factor tends to stay above 3.0. At very little cost, I more than doubled my effective system memory. Complications may arise if an individual workload uses a significant fraction of system memory at once.
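As a back-of-envelope sketch of where "more than doubled" comes from (the 3.0 ratio is from the comment above; the 16 GB size and the 50/50 split between plain RAM and zram-backed storage are made-up assumptions for illustration):

```python
# Rough effective-memory estimate for zram swap.
# Assumptions (hypothetical, for illustration only):
#   ram_gb        - physical RAM in the machine
#   zram_fraction - fraction of RAM that ends up holding compressed pages
#   ratio         - observed compression factor (uncompressed / compressed)

def effective_memory_gb(ram_gb, zram_fraction, ratio):
    compressed_gb = ram_gb * zram_fraction   # RAM actually spent on zram
    stored_gb = compressed_gb * ratio        # uncompressed data it holds
    return (ram_gb - compressed_gb) + stored_gb

# e.g. 16 GB of RAM, half of it holding compressed pages, at a 3.0 ratio:
print(effective_memory_gb(16, 0.5, 3.0))  # -> 32.0, i.e. doubled
```

With a ratio above 3.0 or a larger fraction of RAM given to zram, the multiplier only goes up from there.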
This seems like a great place to ask: how does one go about optimizing something like zram, which has a tremendous number of parameters [1]?
I had considered some kind of test where each parameter is perturbed a bit in sequence, giving an estimate of a pointwise partial derivative, and then doing an iterative hill climb. That probably won't work well in my case, since the devices I'm optimizing have too much variance to give a clear signal on benchmarks of reasonable duration.
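A minimal sketch of that idea: perturb one parameter at a time, keep the change if the (noise-averaged) score improves, and repeat. Everything here is hypothetical — the parameter names and the `benchmark` function are stand-ins; in practice `benchmark` would reconfigure the zram device and run a real workload:

```python
import random

# Hypothetical stand-in for a real zram benchmark; higher is better.
# The gaussian term simulates the run-to-run variance mentioned above.
def benchmark(params):
    true_score = -((params["size_frac"] - 0.6) ** 2) \
                 - ((params["streams"] - 4) ** 2) * 0.01
    return true_score + random.gauss(0, 0.005)

def averaged(params, repeats=5):
    # Repeat each measurement to push the noise floor down.
    return sum(benchmark(params) for _ in range(repeats)) / repeats

def hill_climb(params, step, iters=50):
    for _ in range(iters):
        for key in params:            # perturb one parameter at a time
            base = averaged(params)
            up = dict(params, **{key: params[key] + step[key]})
            if averaged(up) > base:
                params = up
                continue
            down = dict(params, **{key: params[key] - step[key]})
            if averaged(down) > base:
                params = down
    return params

random.seed(0)
best = hill_climb({"size_frac": 0.25, "streams": 1},
                  {"size_frac": 0.05, "streams": 1})
print(best)  # should drift toward size_frac near 0.6, streams near 4
```

The averaging is the part that matters for the variance problem: each accepted step costs several benchmark runs, which is exactly the "reasonable duration" tension described above.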
A comment here about zram caught my eye a day or two ago, and I've been meaning to look into it. Glad to see this post (and I'm sure many others saw the same comment and share my obsession).
As with all tradeoffs, it depends on your requirements. lz4 is ridiculously fast, so it essentially gets you more RAM for free; zstd is a lot more CPU-intensive but also has a much higher compression ratio. So if your RAM is severely undersized for some of your workloads, and/or you're not especially CPU-bound before disk swap takes you out, then zstd gives you a lot more headroom.
Inside the pods it makes no sense, but I do enable it on some memory-constrained worker nodes. Note that, by default, the kubelet refuses to start if the machine has any swap at all.
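For reference, that check can be relaxed in the kubelet config file — a sketch, not a recommendation; how workloads may actually use swap depends on your Kubernetes version (the `NodeSwap` feature gate and `memorySwap.swapBehavior`), so check the docs for yours:

```yaml
# e.g. /var/lib/kubelet/config.yaml (KubeletConfiguration)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false   # let the kubelet start even though swap is enabled
```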