WarpStream is protocol compatible with Apache Kafka®, so you can keep using all your favorite tools and software.
No need to rewrite your application or use a proprietary SDK. Just change the URL in your favorite Kafka client library and start streaming!
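For example, with most Kafka clients the only required change is the bootstrap server address. A minimal sketch of a standard Kafka client properties file, assuming a placeholder WarpStream endpoint (your actual cluster URL will differ):

```properties
# Standard Kafka client configuration. Only the bootstrap address
# changes; the hostname below is a placeholder, not a real endpoint.
bootstrap.servers=your-warpstream-cluster.example.com:9092
client.id=my-existing-app
# Serializers, consumer group settings, etc. stay exactly as they were.
```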
WarpStream's hybrid Bring Your Own Cloud (BYOC) deployment model provides the ease of use of a fully cloud-hosted solution with the cost profile, security, and data sovereignty of a self-hosted deployment.
BYOC clusters run on your own compute and your own object storage. Your data never leaves your environment.
Don't see an answer to your question? Check our docs, or contact us directly.
The real world is complex and actual cost savings will vary from workload to workload. However, unlike other data streaming systems, WarpStream was designed from the ground up around cloud economics.
WarpStream's architecture writes directly to object storage, with no local disks and no manual replication. This avoids almost all inter-zone networking costs and ensures that data is always stored in the most cost-effective way possible.
The WarpStream Agents have built-in native support for AWS S3, GCP GCS, and Azure Blob Storage, so they can run in all three major cloud providers.
The Agents can also leverage any S3-compatible object store, so they can be deployed on-prem using technology like MinIO, or in cloud environments with Cloudflare R2.
You can read more about our object storage support in our documentation.
Support for the full Kafka protocol is still a work in progress, but broadly, protocol compatibility means that you should be able to take existing software like Spark, Flink, or custom applications that embed a Kafka client, change the Kafka URL to point at WarpStream, and have everything continue working as intended.
More specifically, this means that we support topics, partitions, consumer groups, etc. We also maintain all the same per-partition ordering guarantees that Kafka does, and even take things a step further by ensuring that producing data to any set of topics/partitions is fully atomic by default.
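To make the per-partition ordering guarantee concrete: records with the same key are always routed to the same partition, so consumers read them in the order they were produced. The sketch below is a simplified illustration of key-based partitioning; it uses a basic polynomial hash for clarity, not the murmur2 partitioner that real Kafka clients use.

```python
# Simplified illustration of Kafka-style key-based partitioning:
# records sharing a key always map to the same partition, which is
# what makes per-partition ordering possible. (Illustrative hash
# only; real Kafka clients use murmur2 on the key bytes.)

NUM_PARTITIONS = 4

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Deterministically map a record key to a partition."""
    h = 0
    for b in key:
        h = (h * 31 + b) & 0x7FFFFFFF
    return h % num_partitions

# All records keyed "user-42" land in one partition, preserving order.
records = [(b"user-42", "login"), (b"user-7", "view"), (b"user-42", "logout")]
partitions: dict[int, list[str]] = {}
for key, event in records:
    partitions.setdefault(partition_for(key), []).append(event)
```

Because routing depends only on the key, `login` and `logout` for `user-42` are guaranteed to appear in produce order within their partition, even though other keys' records may interleave globally.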
Our hybrid deployment model means that raw data never leaves your environment.
The only data we receive is the high-level metadata required to operate your control plane, such as topic names and partition counts.