The WarpStream MCP Server connects AI assistants like Claude and Cursor directly to your clusters. Query logs, inspect Diagnostics, and investigate ACL events without leaving your IDE. Just ask questions and let the AI handle the rest.
Audit Logs give you a complete, structured record of every authentication action, authorization decision, and platform operation across your WarpStream clusters.
We're excited to launch Lightning Topics and Ripcord Mode. Lightning Topics use delayed sequencing to achieve 33ms median produce latency with S3 Express One Zone — a 70% reduction. Ripcord Mode lets Agents continue processing writes during Control Plane outages.
WarpStream Tableflow is the simplest and most cost-effective way of building data lakes out of Kafka topics. It handles all of the tedious background jobs so that you don’t have to build and manage them yourself. Best of all, it plugs into whatever Kafka-compatible cluster you already have, so there’s no upstream migration required to get up and running. It’s a complete data lake engine for Apache Iceberg. We like to think of it as the “bottom half” of the database, with your query engine of choice forming the top half.
Kafka ACLs (Access Control Lists) are essential for securing clusters, but enabling them in production clusters that already have traffic can be risky – misconfiguration or subtle syntax errors can block traffic and disrupt existing workloads. WarpStream’s ACL Shadowing solves this problem by evaluating ACLs on live traffic without enforcement, surfacing would-be denials through logs and Diagnostics. This enables operators to identify and fix issues safely, reducing surprises and giving developers greater confidence when enabling ACLs.
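To make the mechanism concrete, here is a minimal, hypothetical Go sketch of the shadowing idea (not WarpStream's actual implementation; `aclAllows` and all names are stand-ins): the ACL decision is evaluated as usual, but in shadow mode a denial is only logged, never enforced.

```go
package main

import "log"

// aclAllows is a hypothetical stand-in for a real ACL evaluation engine.
func aclAllows(principal, operation, resource string) bool {
	return principal == "producer-svc" && operation == "WRITE"
}

// authorize evaluates the ACL decision. In shadow mode, a would-be denial
// is surfaced in the logs but the request is still allowed through.
func authorize(principal, operation, resource string, shadowMode bool) bool {
	allowed := aclAllows(principal, operation, resource)
	if !allowed && shadowMode {
		// Surface the would-be denial without enforcing it, so operators
		// can fix the ACL before flipping enforcement on.
		log.Printf("SHADOW DENY: %s attempted %s on %s", principal, operation, resource)
		return true
	}
	return allowed
}

func main() {
	// In shadow mode this logs a would-be denial but still succeeds.
	authorize("analytics-svc", "READ", "topic:orders", true)
}
```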
WarpStream Tableflow's architecture takes inspiration from React's virtual DOM, enabling exactly-once ingestion and millions of Iceberg updates per second while ensuring data lake correctness.
By switching from Kafka to WarpStream for their logging workloads, Robinhood saved 45%. WarpStream auto-scaling keeps clusters right-sized at all times, and features like Agent Groups eliminate noisy-neighbor issues and the need for complex networking setups like PrivateLink and VPC peering.
We originally launched our BYOC Schema Registry product with full support for Avro schemas. We’re excited to share that our Schema Registry now supports Protobuf schemas as well, and remains fully compatible with Confluent’s Schema Registry.
Cursor’s AI-powered IDE fuses creativity and human intelligence. At its data-streaming heart is WarpStream, which makes it possible to train models securely, deliver lightning-fast Tab completions, and scale telemetry with zero ops.
With WarpStream and ClickHouse Cloud, what was once a patchwork of managed services and constant disk resizing has become a cohesive architecture that connects streaming and analytics in a single, resilient pipeline. Asked to sum up the stack in three words, Superwall co-founder and CTO Brian Anglin said, “Durable, scalable, powerful.”
We launched a new product called WarpStream Tableflow that is the easiest, cheapest, and most flexible way to convert Kafka topic data into Iceberg tables with low latency, and keep them compacted.
Goldsky’s mission to stream and index massive volumes of blockchain data quickly ran into the scaling and cost limits of traditional Kafka. With tens of thousands of partitions and petabytes of data, their clusters became expensive, fragile, and hard to operate. By migrating to WarpStream, Goldsky cut costs by over 10x, eliminated performance bottlenecks, and unlocked seamless scaling to 100 PiB and hundreds of thousands of partitions, all without the operational headaches of legacy Kafka deployments.
By combining WarpStream with Tigris, you get a bottomless, durable, and globally aware message queue. Your data is stored efficiently, close to where it’s consumed, without incurring hidden transfer fees or needing to plan for regional bucket placement.
With WarpStream Multi-Region Clusters, we can now ensure that you are also protected from region-wide cloud provider outages or single-region control plane failures.
By switching from open-source Apache Kafka to WarpStream, ShareChat was able to implement zero-ops auto-scaling and saved 60% vs. multi-AZ Kafka. They also shared some best practices for optimizing WarpStream.
Disaster recovery and data sharing between regions are intertwined. We explain how to handle them on Kafka and WarpStream, and discuss RPO=0 Active-Active Multi-Region clusters, a new product that ensures you don't lose a single byte if an entire region goes down.
The WarpStream team receives lots of questions about our architecture, pricing, unique features, and other aspects of WarpStream. We created this page to serve as an up-to-date repository of frequently asked questions.
Excited about the recent 85% drop in S3 Express One Zone (S3EOZ) prices? We've supported S3EOZ since December 2024. This latest pricing update means our latency is 3x better at only a 15% higher TCO.
Distributed systems built on object storage all have one common problem: removing files that have been logically deleted either due to data expiry or compaction. We review the pros and cons of five ways to solve this problem.
Pprof is an amazing tool for debugging memory leaks, but what about when it's not enough? Read about how we used gcore and viewcore to hunt a particularly nasty memory leak in a large distributed system.
Today, we're excited to announce WarpStream Schema Linking, a tool to continuously migrate any Confluent-compatible schema registry into a WarpStream BYOC Schema Registry. WarpStream now has a comprehensive Data Governance suite to handle schema needs.
We’ve released Diagnostics, a new feature for WarpStream clusters! Diagnostics continuously analyzes your clusters to identify potential problems, cost inefficiencies, and opportunities for improvement. It examines the health and cost of your cluster and gives detailed explanations of how to fix and improve them.
"...we needed to ensure WarpStream could also support our scalability needs before settling on it. Given the size of Grafana Cloud Metrics, we knew it needed to handle read and write rates of tens of gigabytes per second."
Character.AI operates at scale, supporting over 20 million monthly active users across its services. Despite being a relatively small company, Character.AI has significant and complex data storage needs.
In this blog post we'll explain how transactions work in Kafka by comparing and contrasting the implementations of transactions in two different Kafka implementations: the official Apache Kafka project, and WarpStream.
In this post, we’ll look at what noisy neighbors are, the current ways to handle them (cluster quotas and mirroring clusters), and how WarpStream’s solution compares in terms of elasticity, operational simplicity, and cost efficiency.
WarpStream BYOC reimplements the Kafka protocol with a stateless, zero-disk cloud-native architecture, replacing Kafka brokers with WarpStream Agents to simplify operations. But data streaming extends beyond Kafka clusters alone.
In this post, I’ll start off with a brief overview of “shared nothing” vs. “shared storage” architectures in general. This discussion will be a bit abstract and high-level, but the goal is to share with you some of the guiding philosophy that ultimately led to WarpStream’s architecture.
Orbit is a tool that creates identical, inexpensive, scalable, and secure continuous replicas of Kafka clusters. It is built into WarpStream and works without any user intervention to create WarpStream replicas of any Apache Kafka-compatible source cluster.
WarpStream now supports AWS Glue Schema Registries, in addition to the Kafka-compatible schema registries. The WarpStream Agent can use schemas stored in the user’s AWS Glue Schema Registries to validate records.
Backpressure is a really simple concept. When the system is nearing overload, it should start “saying no” by slowing down or rejecting requests. Of course, the big question is: How do we know when we should reject a request?
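As a rough illustration of the concept (a sketch, not WarpStream's implementation), here's a tiny Go HTTP server that bounds in-flight requests with a semaphore and starts “saying no” the moment that bound is hit:

```go
package main

import (
	"net/http"
	"time"
)

// inFlight holds one token per request we are willing to process
// concurrently; the capacity (64 here, chosen arbitrarily) is the
// threshold at which the server starts rejecting work.
var inFlight = make(chan struct{}, 64)

func handle(w http.ResponseWriter, r *http.Request) {
	select {
	case inFlight <- struct{}{}: // acquired a slot, proceed
		defer func() { <-inFlight }()
		time.Sleep(10 * time.Millisecond) // stand-in for real work
		w.WriteHeader(http.StatusOK)
	default: // at capacity: reject fast instead of queuing and slowing everyone down
		w.WriteHeader(http.StatusTooManyRequests)
	}
}

func main() {
	http.HandleFunc("/", handle)
	http.ListenAndServe(":8080", nil)
}
```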
Traditional offset-based monitoring can be misleading due to varying message sizes and consumption rates. To address this, you can introduce a time-based metric for a more accurate assessment of consumer group lag.
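A hedged sketch of what such a time-based metric could look like in Go (all types and fields here are illustrative stand-ins, not a real Kafka client API):

```go
package main

import (
	"fmt"
	"time"
)

// PartitionStatus is a hypothetical snapshot of one partition's state.
type PartitionStatus struct {
	CommittedOffset int64     // last offset the consumer group committed
	EndOffset       int64     // current high watermark
	CommittedTime   time.Time // timestamp of the record at CommittedOffset
	EndTime         time.Time // timestamp of the newest record
}

// TimeLag reports how far behind the consumer is in wall-clock terms.
// A caught-up consumer reports zero lag regardless of message sizes or
// consumption rates, which raw offset deltas alone can't express.
func TimeLag(p PartitionStatus) time.Duration {
	if p.CommittedOffset >= p.EndOffset {
		return 0 // fully caught up
	}
	return p.EndTime.Sub(p.CommittedTime)
}

func main() {
	p := PartitionStatus{
		CommittedOffset: 100, EndOffset: 250,
		CommittedTime: time.Now().Add(-45 * time.Second),
		EndTime:       time.Now(),
	}
	fmt.Println("consumer lag:", TimeLag(p)) // roughly 45s
}
```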
How we built support for running WarpStream's control plane and Metadata Store in multiple regions, while still presenting our platform as a single pane of glass.
WarpStream's Zero Disk / Diskless Architecture enables a BYOC deployment model that is secure by default and does not require any external access to the customer's environment.
A follow-up to "Tiered Storage Won't Fix Kafka", this post covers all the different advantages that WarpStream's Zero Disk / Diskless Architecture provides over Apache Kafka.
Pixel Federation is the developer of nearly a dozen highly popular mobile games with players from all over the world. They have millions of monthly active users, and those millions of users generate lots of events. In fact, Pixel Federation uses an event-driven architecture for almost everything: logging, events, billing, tracking game state, etc. Find out how they saved 83% using WarpStream over MSK.
Managed Data Pipelines provide a fully managed SaaS user experience for Bento, without sacrificing any of the cost benefits, data sovereignty, or deployment flexibility of the BYOC deployment model.
Tiered storage is a hot topic in the world of data streaming systems, and for good reason. Cloud disks are (really) expensive, object storage is cheap, and in most cases, live consumers are just reading the most recently written data. Paying for expensive cloud disks to store historical data isn’t cost-effective, so historical data should be moved (tiered) to object storage. On paper, it makes all the sense in the world.
This blog is guest authored by Fahad Shah from RisingWave and cross-posted from RisingWave's blog. It presents the development of a real-time security threat monitoring system that integrates RisingWave, WarpStream, and Grafana. Setting up the entire system is quite straightforward: to monitor each metric, you only need to create a single materialized view in RisingWave and visualize it in Grafana.
We’re excited to announce that WarpStream now natively embeds Bento, a stateless stream processing framework that connects to many data sources and sinks. Bento offers much of the functionality of Kafka Connect, as well as additional lightweight stream processing functions.
Many of today's most highly adopted open source “big data” infrastructure projects – like Cassandra, Kafka, Hadoop, etc. – follow a common story. A large company, startup or otherwise, faces a unique, high scale infrastructure challenge that's poorly supported by existing tools. They create an internal solution for their specific needs, and then later (kindly) open source it for the greater community to use. Now, even smaller startups can benefit from the work and expertise of these seasoned engineering teams. Great, right?
How we leverage Antithesis to deterministically simulate our entire SaaS platform and verify its correctness, all the way from signup to running entire Kafka workloads.
Benchmarking databases – and maintaining fairness and integrity while doing so – is a notoriously difficult task to get right, especially in the data streaming space. Vendors want their systems to produce mouth-watering results, so unnatural configurations divorced from customer realities (AKA “vanity” benchmarks) get tested, and it's ultimately the end user who is left holding the bag when they realize that their actual TCO is a lot higher than they were led to believe.
A huge part of building a drop-in replacement for Apache Kafka® was implementing support for compacted topics. The primary difference between a “regular” topic in Kafka and a “compacted” topic is that Kafka will asynchronously delete records from compacted topics that are not the latest record for a specific key within a given partition.
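As a toy model of that retention rule (real compaction in Kafka is asynchronous and segment-based, with tombstone retention windows, so this is only illustrative), the following Go snippet keeps the latest record per key and drops deleted keys:

```go
package main

import "fmt"

// Record models one entry in a partition's log. A nil Value is a
// "tombstone" that marks the key for deletion.
type Record struct {
	Key   string
	Value *string
}

// compact retains only the most recent record for each key, dropping
// keys whose latest record is a tombstone.
func compact(log []Record) []Record {
	latest := make(map[string]Record)
	for _, r := range log { // later records overwrite earlier ones
		latest[r.Key] = r
	}
	var out []Record
	for _, r := range latest {
		if r.Value != nil { // drop tombstoned keys
			out = append(out, r)
		}
	}
	return out
}

func main() {
	v1, v2 := "v1", "v2"
	log := []Record{{"a", &v1}, {"b", &v1}, {"a", &v2}, {"b", nil}}
	for _, r := range compact(log) {
		fmt.Println(r.Key, "=", *r.Value) // only a = v2 survives
	}
}
```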
Serverless products and usage-based billing models go hand in hand, almost by definition. A product that is truly serverless effectively has to have usage-based pricing; otherwise, it’s not really serverless!
We first introduced WarpStream in our blog post: "Kafka is Dead, Long Live Kafka", but to summarize: WarpStream is a Kafka protocol compatible data streaming system built directly on top of object storage.
If you’re on a Data or Data Platforms team, you’ve probably already seen the productivity boost that comes from pulling business logic out of various ETL pipelines, queries, and scripts and centralizing it in SQL in a clean, version-controlled git repo managed by dbt. The engine that unlocked this approach is the analytical data warehouse: typically Snowflake or BigQuery.
Chances are you probably had a strong reaction to the title of this post. In our experience, Kafka is one of the most polarizing technologies in the data space. Some people hate it, some people swear by it, but almost every technology company uses it.