Keeping up with WarpStream

Quite a few “Kafka alternatives” have come onto the market, particularly in the past ~5 years. It makes sense; Kafka has had immense adoption, but it’s fair to say that, ever since Day 1, it’s never been particularly easy. The ecosystem around Kafka is vast, and a huge percentage of it is just tooling that attempts to make Kafka easier to manage (let alone be productive with). That is a pretty obvious opportunity.

To try and make it easier, there are all kinds of managed Kafka (or Kafka-compatible) platforms: Confluent, Aiven, Upstash, DoubleCloud, AWS MSK, Azure Event Hubs.

There are also platforms that serve the same (or a similar) purpose that aren’t based on Kafka: Apache Pulsar, StreamNative, Memphis.dev, JetStream, Fluvio, Redis Queues.

Not too long ago, Redpanda came along and rewrote Kafka from the ground up. Rather than Java, it was C++. Instead of ZooKeeper, it was Raft. Those, along with many more architectural changes, made it a very attractive modern replacement for Apache Kafka. It’s a great example of saying “This was great, but we could do it better today.”

But largely, all of these follow a very traditional platform architecture. They’re built expecting traditional computing primitives: servers with CPUs, memory, and disks, sending data between machines over the network. Duh, right? But a lot of today’s computing is done in the cloud, where you don’t have to build with those same primitives.

This is what I find particularly interesting about WarpStream: it’s being built from the ground up expecting today’s cloud primitives. Forget JBOD, data striping, partitioning, and tiered storage; we’ve already got S3. And so on, challenging each decision on how to architect an event streaming platform for the cloud.

I like that. If there’s anything I can say about myself, it’s that I hate ‘tradition’. Challenging the assumptions of the past is always a good thing; it’s how we make progress. And as a data streaming nerd, I’m excited to see that applied quite radically to an event streaming platform.