redis pubsub limits × swap fan-out
spent today iterating on a real cpu problem around redis pubsub.
context: we're data-intensive. ~600 swaps/s across every blockchain combined, each swap carrying 20–30 key/value fields. those swaps need to reach our api pods, but the apis can't compute stats themselves — way too cpu-intensive. so the flow is: publish → intermediate stats service → ram cache on each api pod, and the apis just serve from ram.
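to make the pipeline concrete, here's a minimal dependency-free sketch of the publish step. `PubClient` just mirrors the subset of the ioredis api involved, and the channel name and swap fields are made up for illustration, not our real schema:

```typescript
// PubClient mirrors ioredis's publish signature so the sketch runs without deps
interface PubClient {
  publish(channel: string, message: string): Promise<number>;
}

// a real swap carries 20-30 key/value fields; only two are shown here
interface Swap {
  chain: string;
  token: string;
  [field: string]: string | number;
}

function serializeSwap(swap: Swap): string {
  return JSON.stringify(swap);
}

// the stats service publishes; api pods keep the result in ram and serve from it
async function publishSwap(pub: PubClient, swap: Swap): Promise<number> {
  // redis's PUBLISH returns how many subscribers received the message
  return pub.publish("stats:events", serializeSwap(swap));
}
```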
two problems hit us at once.
first, cpu on the pubsub side. we run 150+ api pods, plus other services that also subscribe. with 200+ subscribers total and around 2k messages/s being published, that's 400k+ message deliveries/s on fan-out alone. some payloads are hundreds of kb, and that's just the main stats channel; other channels run in parallel on the same master. the cpu was pinned.
second, cpu on the api side. pubsub subscribers can't filter what they receive. each pod gets the full payload, runs a JSON.parse on hundreds of kb, and only then figures out whether any of it was even relevant. that turned into a hot path burning cpu on every pod, for data most of them didn't care about.
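the hot path on every pod looked roughly like this (shapes and names are illustrative, not our real code):

```typescript
interface SwapStat {
  token: string;
  [field: string]: unknown;
}

// parse-everything-then-filter: the only option pubsub leaves you
function handleStatsMessage(raw: string, wantedTokens: Set<string>): SwapStat[] {
  // JSON.parse has to run over the ENTIRE payload (hundreds of kb) before
  // the pod can even tell whether anything in it is relevant
  const stats: SwapStat[] = JSON.parse(raw);
  return stats.filter((s) => wantedTokens.has(s.token));
}
```

multiply that parse by 150+ pods per message and it's pure wasted cpu for any pod whose `wantedTokens` misses.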
what we couldn't do, and why:
- couldn't just switch to per-token channels on the existing master. the clean version would be 1 channel per token with apis subscribing only to what they need, but that means 50k+ open channels at the same time, and redis pubsub channel bookkeeping is itself cpu-bound at that scale. the bottleneck just moves.
- couldn't push filtering client-side either — once the master is already fanning out hundreds of kb per message across 200+ subscribers, the damage is already done upstream.
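the back-of-envelope with this post's numbers shows why both dead ends are really the same cost wearing different hats (payload size is the hedged part; the post only says "hundreds of kb" for some messages):

```typescript
// fan-out multiplies work: one publish becomes one delivery per subscriber
const publishesPerSec = 2_000;   // messages published per second
const subscribers = 200;         // 200+ subscribing connections
const deliveriesPerSec = publishesPerSec * subscribers;
console.log(deliveriesPerSec);   // 400000 deliveries/s the master must push

// per-token channels on the SAME master don't remove that, they relocate it:
const tokenChannels = 50_000;    // concurrent channels redis must bookkeep
// every publish now pays channel lookup/dispatch against that table, plus
// subscribe/unsubscribe churn as pods change which tokens they serve
```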
what we ended up shipping:
- sharded the pubsub. a new dedicated instance just for stats. pulling that traffic off the main pubsub cut its cpu load by roughly 50%.
- replica reads for subscribers on the main pubsub. adapted the ioredis client so subscribers connect to replicas instead of all hammering the master. splits the subscriber load cleanly without changing the publish path.
- per-token channels, but on their own separate master. now that the 50k-channels overhead is the only job on that instance, it's manageable. apis subscribe only to the tokens they actually need, and the JSON.parse storm on the api side mostly goes away.
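the api side of the end state can be sketched like this, dependency-free: `SubClient` mirrors the subset of the ioredis subscriber api we use, and the `stats:token:<address>` channel naming is illustrative, not necessarily our real scheme:

```typescript
interface SubClient {
  subscribe(...channels: string[]): Promise<unknown>;
  on(event: "message", handler: (channel: string, message: string) => void): void;
}

const PREFIX = "stats:token:";

// each pod subscribes only to the tokens it actually serves, against the
// dedicated per-token master -- payloads for other tokens never arrive
function subscribePerToken(
  sub: SubClient,
  tokens: string[],
  onStat: (token: string, stat: unknown) => void,
): Promise<unknown> {
  sub.on("message", (channel, message) => {
    // messages are already scoped to one token, so each parse is small
    onStat(channel.slice(PREFIX.length), JSON.parse(message));
  });
  return sub.subscribe(...tokens.map((t) => PREFIX + t));
}
```

the one thing this sketch glosses over: when a pod's token set changes, you have to unsubscribe the stale channels too, or the per-channel bookkeeping on that master creeps back up.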
biggest takeaway: at this throughput there's no "one big pubsub" that works. sharding isn't really about redundancy here, it's about cpu budgeting per role. and any pattern that looks like "just subscribe per x" has a hidden cost inside redis internals that you only find at scale.