Disaggregated Inference, Part 1: When & Where to Route

Hien Luu

The Concurrency Cliff is a Memory Limit

Khawaja Shams

GPUs are the most expensive resource in tech. We’re using them badly.

Stop CDN Leeching with Concurrency Tracking

What Hyperscale Caching Taught Us About GPU Utilization

Khawaja Shams

Tooling is a Scaling Strategy

Understanding the NxM Problem in Distributed Caches

Why Large Cache Systems Need Routing Layers

Why Scaling Looks Different at Uber, Apple, and Mercado Libre

Reduce TTFT by >50% with LMCache + Momento

Khawaja Shams
Daniela Miao

Reduce TTFT by >50% with LMCache + Momento Accelerator

Khawaja Shams

Performance Engineering Lessons from the Unlocked Conference

Mike Callahan

Large Objects Ruin the Party – Valkey 9 Tames Them

Khawaja Shams

The Real Cost of Swapping Infrastructure

Breakthroughs Are Just Boring Improvements That Pile Up

Cache Rebalancing Was Broken. Here’s How Valkey 9.0 Fixed It

The Momento Platform

Khawaja Shams
Daniela Miao

Designing smarter caches with Valkey 9.0’s numbered databases

Cache It – Episode #7 – Valkey 9.0: Databases, Clustering, and Details with Kyle Davis

Khawaja Shams

Valkey 9.0 – The Next Generation of Caching

Khawaja Shams

The 5 Metrics that Predict Cache Outages

Daniela Miao

The Latest Redis Vulnerability Exposes a Bigger Problem

Khawaja Shams