WarpStream is a diskless, Apache Kafka®-compatible data streaming platform built directly on top of object storage: zero disks, zero inter-AZ costs, zero cross-account IAM access required. Scales infinitely and runs in your VPC.
Agents stream data directly to and from object storage with no buffering on local disks and no data tiering.
Create new “Virtual Clusters” in our control plane instantly, backed by an auto-scaling Agent binary.
Support different environments, teams, or projects without managing any dedicated infrastructure.
WarpStream's BYOC model lets you run a Kafka-compatible, diskless streaming platform directly on your own cloud infrastructure. Enjoy the benefits of self-hosting without the operational overhead.
WarpStream's hybrid (BYOC) deployment model provides the ease of use of a fully cloud-hosted solution with the cost profile, security, and data sovereignty of a self-hosted deployment.
BYOC clusters use your own compute and your own object storage. Your data never leaves your environment.
Run ETL and stream processing from within your WarpStream Agents and cloud account, with no additional infrastructure needed. You own the data pipeline end to end, and raw data never leaves your account.
Store schemas in WarpStream’s Confluent-compatible BYOC Schema Registry, verify data with Schema Validation, and migrate from Confluent-compatible schema registries with Schema Linking.
Orbit allows you to automatically replicate topics (including record offsets), consumer groups, offset gaps, ACLs, and cluster configurations. It works with any source system that is compatible with the Apache Kafka protocol.
The WarpStream Agents have native support for AWS S3, GCP GCS, and Azure Blob Storage, so they can run in all three of the major cloud providers.
The Agents can also leverage any S3-compatible object store, so they can be deployed on-prem using technology like MinIO, or in cloud environments with Cloudflare R2.
You can read more about our object storage support in our documentation.
Our hybrid deployment model means that raw data never leaves your environment.
The only data we receive is the high-level metadata required to operate your control plane, such as topic names and partition counts. You can learn more about our zero-access, secure-by-default model in this blog.
WarpStream supports Schema Validation via external schema registries and services like AWS Glue. It also has its own BYOC-native Schema Registry.
Like all Schema Registries, the WarpStream BYOC Schema Registry ensures data compatibility and compliance by validating schemas during data production and consumption. This helps minimize downstream data issues and enables schemas to evolve without breaking consumers.
In addition, it has unique features that are only possible with WarpStream’s stateless, zero-disk architecture, such as native integration with the WarpStream Agents, schema retrieval directly from object storage with no intermediate disks, easy scaling, and zone-aware routing that avoids interzone networking fees. There is also no waiting for “leaders” to be elected: consensus is handled by WarpStream's metadata store, so any Agent can serve both reads and writes.
Learn more about WarpStream BYOC Schema Registry via our docs and schema-related features on our Data Governance page.
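For a concrete sense of what Confluent compatibility means in practice, here is a minimal sketch using the standard confluent-kafka Python client. The registry URL, credentials, subject name, and schema are placeholders for illustration, not real endpoints.

```python
from confluent_kafka.schema_registry import SchemaRegistryClient, Schema

# Point the standard Confluent Schema Registry client at the registry
# endpoint served by your Agents (placeholder URL and credentials).
client = SchemaRegistryClient({
    "url": "https://schema-registry.example.internal:9094",
    "basic.auth.user.info": "example-user:example-password",  # placeholder
})

click_event_schema = Schema(
    schema_str="""
    {
      "type": "record",
      "name": "ClickEvent",
      "fields": [
        {"name": "user_id", "type": "string"},
        {"name": "ts", "type": "long"}
      ]
    }
    """,
    schema_type="AVRO",
)

# Register the schema under a subject, then check a candidate schema's
# compatibility before rolling out a producer change.
schema_id = client.register_schema("clicks-value", click_event_schema)
is_compatible = client.test_compatibility("clicks-value", click_event_schema)
print(schema_id, is_compatible)
```

Because the registry is served by the same stateless Agents, there is no separate registry cluster to provision or fail over.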
The real world is complex and actual cost savings will vary from workload to workload. However, unlike other data streaming systems, WarpStream was designed from the ground up around cloud economics.
WarpStream's unique architecture, which leverages object storage directly with no local disks or manual replication, avoids almost all interzone networking costs and ensures that data is always stored in the most cost-effective way possible.
You can use our pricing calculator to compare WarpStream to OSS Kafka, MSK, and Kinesis.
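As a rough illustration of why the interzone point matters, the sketch below estimates the transfer fees a traditional three-AZ Kafka deployment pays. The rates and workload numbers are illustrative assumptions, not output from our calculator; adjust them to match your own bill.

```python
# Back-of-the-envelope sketch of interzone transfer fees for a classic
# three-AZ Kafka deployment with replication factor 3. All numbers below
# are illustrative assumptions, not a quote.

CROSS_AZ_RATE = 0.02    # $/GB, assuming ~$0.01 egress + $0.01 ingress
WRITE_GB_PER_DAY = 1_000
CONSUMER_FANOUT = 3     # each byte is read by 3 downstream consumers

# Producer -> leader: with partition leaders spread across 3 AZs,
# roughly 2/3 of produced bytes cross a zone boundary.
produce_cross = WRITE_GB_PER_DAY * (2 / 3)

# Leader -> followers: replication factor 3 with one replica per AZ
# means two extra copies, both crossing zones.
replication_cross = WRITE_GB_PER_DAY * 2

# Leader -> consumers: without follower fetching, roughly 2/3 of
# consumed bytes cross a zone boundary.
consume_cross = WRITE_GB_PER_DAY * CONSUMER_FANOUT * (2 / 3)

daily_cost = (produce_cross + replication_cross + consume_cross) * CROSS_AZ_RATE
print(f"~${daily_cost:,.0f}/day in interzone transfer alone")
```

WarpStream sidesteps these charges by writing directly to object storage, which handles durability itself, and by routing clients to Agents in their own zone.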
Support for the full Kafka protocol is still a work in progress for us, but broadly, Kafka compatibility means that you should be able to take existing software like Spark, Flink, or custom applications that embed a Kafka client, change the Kafka URL to point at WarpStream, and have everything continue working as intended.
More specifically, this means that we support topics, partitions, consumer groups, transactions, etc. We also maintain all the same per-partition ordering guarantees that Kafka does, and even take things a step further by ensuring that producing data to any set of topics/partitions is fully atomic by default.
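In practice, "change the Kafka URL" is usually the only change required. The sketch below uses the standard confluent-kafka Python client; the bootstrap address, topic names, and group ID are placeholders for illustration.

```python
from confluent_kafka import Producer, Consumer

# Placeholder bootstrap address: in a real deployment this points at your
# WarpStream Agents instead of Kafka brokers. Nothing else changes.
BOOTSTRAP = "warpstream-agents.example.internal:9092"

producer = Producer({"bootstrap.servers": BOOTSTRAP})

# Produce to multiple topics; per-partition ordering is preserved exactly
# as it is in Kafka.
producer.produce("clicks", key="user-1", value=b'{"page": "/pricing"}')
producer.produce("page-views", key="user-1", value=b'{"page": "/pricing"}')
producer.flush()

consumer = Consumer({
    "bootstrap.servers": BOOTSTRAP,
    "group.id": "analytics",          # consumer groups work as in Kafka
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["clicks"])

msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    print(msg.topic(), msg.partition(), msg.offset(), msg.value())
consumer.close()
```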