How Uber uses Apache Kafka in production

factor-houseMay 30, 202616 min read

Uber’s Kafka deployment is one of the most extensively documented in the industry, with the company publishing more than a dozen engineering blog posts and conference talks describing how its Streaming Platform team has evolved the infrastructure over a decade. By 2021, Uber was processing trillions of messages per day across tens of thousands of topics, with throughput that had grown from roughly one million to twelve million messages per second over five years. Apache Kafka sits at the centre of nearly every real-time system Uber operates, from matching riders with drivers to billing advertisers to detecting fraud.

Company overview

Uber operates a global ride-hailing, food delivery, and freight platform across more than 70 countries. At peak, thousands of trips are being coordinated simultaneously, each generating a continuous stream of GPS updates, payment events, and state changes from driver and rider applications. That volume, combined with strict latency requirements for matching and pricing, pushed Uber toward an event-driven architecture early in its growth.

Kafka was adopted in early 2015, beginning with a small cluster in a single region. Within a year, the platform was auditing approximately one trillion messages per day. By 2021, that figure had grown to trillions, and the team had built a suite of internal tools - uReplicator, Chaperone, uForwarder, and uGroup - on top of it. The timeline below traces the major milestones.

Timeline

Date	Milestone
Early 2015	Kafka adopted; small cluster in one region
November 2015	uReplicator deployed to production
January 2016	Chaperone auditing approximately 1 trillion messages per day
August 2016	uReplicator open-sourced
December 2016	Chaperone blog post published
June 2017	Hadoop Summit: real-time infrastructure scaled to trillions of events per day
February 2018	Dead letter queue / reliable reprocessing blog post published
March 2019	DBEvents CDC framework blog post published
April 2019	Kafka Summit SF: Kafka Cluster Federation and multi-region disaster recovery
January 2020	Kappa architecture blog post published
December 2020	Multi-region Kafka disaster recovery blog post published
August 2021	Consumer Proxy blog published; 200,000 partitions, 12 million messages per second
September 2021	Exactly-once ad event processing blog published
October 2021	uGroup consumer management framework blog published
2022-2023	Kafka tiered storage deployed to production workloads
April 2024	Kafka Summit London: Exactly-Once Stream Processing at Scale at Uber
July 2024	Kafka tiered storage blog published
February 2026	uForwarder open-sourced; 1,000+ consumer services onboarded

Uber’s Kafka use cases

Kafka underpins real-time operations across Uber’s core products and its internal data platform. The use cases below span multiple engineering teams and reflect how the system has expanded from transport to advertising and insurance.

Rider-driver matching and surge pricing

GPS location events from rider and driver applications flow into Kafka, where stream processing jobs analyse supply and demand in near real-time. The same event streams power dynamic pricing, which updates fares every few seconds. UberEats ETAs are also calculated from Kafka-sourced event streams. These pipelines operate with strict latency targets: the platform targets API latency below 5ms and 99.99% availability.

Fraud detection

Streaming analysis of transaction and session events runs continuously to identify fraud patterns. Bot detection and rider session analytics feed into similar pipelines.

Change data capture

The DBEvents framework reads MySQL binary logs via an internal tool called StorageTapper and streams change events into Kafka. Cassandra CDC events follow the same path. Downstream, these events land in Uber’s Hadoop data lake via Apache Hudi, which supports upsert-capable writes for incremental updates.

Ad event processing

Impressions and clicks from UberEats advertising flow through a Kafka pipeline that handles ad pacing, budget management, and customer billing. This pipeline operates with exactly-once guarantees, described in more detail under Special techniques below.

Async microservice queuing

More than 1,000 internal consumer services use Kafka as an asynchronous queue via the uForwarder Consumer Proxy rather than consuming directly from brokers. The proxy handles offset management and delivery, shielding application teams from Kafka consumer group mechanics.

Dead letter queues and retry pipelines

The Driver Injury Protection insurance product deducts per-mile premiums on each trip. When payment processing fails, events route through tiered retry topics with increasing delays before landing in a dead letter queue. Independent consumer groups maintain retry pipelines for each failure tier.

Data archival and real-time analytics

Kafka events sink to Apache Pinot for real-time OLAP queries and to Apache Hive for warehouse analysis. This combination gives Uber both low-latency query access to recent data and long-term historical analysis from the same event stream.

Kappa architecture backfill

Streaming and batch jobs share a single codebase. Rather than replaying historical data into Kafka for backfill runs, Uber models Apache Hive as a streaming source in Spark Structured Streaming, allowing the same job logic to run against historical data without reloading it into the cluster.

Scale and throughput

Metric	Value	Source
Messages per day	Trillions (multiple petabytes)	Uber Engineering Blog, 2020-2021
Messages per second (2021)	12 million	Consumer Proxy blog, August 2021
Messages per second (baseline)	1 million	Consumer Proxy blog (5 years prior)
Topics	Tens of thousands	Hadoop Summit 2017
Topics audited by Chaperone	20,000+	Hadoop Summit 2017
Partitions	200,000	Consumer Proxy blog, August 2021
Consumer services on uForwarder	1,000+	uForwarder blog, February 2026
Clusters	Dozens, across multiple regions	Kafka Cluster Federation talk, 2019
API latency target	Less than 5ms	Hadoop Summit 2017
Availability target	99.99%	Hadoop Summit 2017

‍

Throughput grew twelve-fold over five years. Much of that growth was absorbed by the Consumer Proxy rather than by adding partitions, since the proxy decouples partition count from consumer service concurrency.

Uber’s Kafka architecture

Cluster topology

Uber operates two tiers of Kafka clusters across multiple regions. Regional Clusters receive all producer traffic; producers always publish to their local region. Aggregate Clusters hold cross-region replicas and provide a unified global view of each topic. Replication between tiers is handled by uReplicator, Uber’s in-house replacement for Kafka MirrorMaker.

In addition to the regional/aggregate split, clusters are segmented by use case: separate clusters exist for logging, database changelogs, and high-reliability messaging. This isolation contains blast radius when one cluster has problems and allows retention and replication policies to be tuned per use case.

A Kafka Cluster Federation layer presents multiple physical clusters as a single logical cluster. A Kafka Proxy handles metadata routing, and a central Metadata Service maintains the registry of which topics live on which physical clusters.

Producer architecture

Producers publish through a Local Agent, a persistence layer that buffers messages on the producer side before writing to brokers. This improves durability by ensuring messages survive transient broker unavailability without being dropped at the client. Multi-language producer support covers Java, Go, Python, Node.js, and C++. An internal Kafka REST Proxy provides an HTTP interface for non-JVM producers; through internal optimisation work, its throughput was raised from 7,000 to 45,000 QPS per box.

Consumer architecture

Rather than having each microservice manage its own Kafka consumer group, Uber routes most consumer traffic through uForwarder, a push-based Consumer Proxy. uForwarder fetches messages from Kafka partitions and delivers them to consumer service endpoints via gRPC. This design separates partition count from processing concurrency: a consumer service can scale its processing threads independently of how many Kafka partitions the topic has. Offset commits are managed centrally by the proxy rather than by individual service instances.

uForwarder also handles out-of-order offset commits. Rather than blocking a partition on a slow message, it acknowledges individual messages and commits only contiguous completed ranges. This allows parallel processing within a partition without the risk of losing acknowledgement state.

Stream processing

Apache Flink is the primary stream processing engine for stateful workloads, including fraud detection and ad event aggregation. Apache Samza runs alongside Flink for some pipelines. Apache Spark Structured Streaming is used for the Kappa architecture backfill jobs described above.

Kafka Connect ecosystem

CDC ingestion uses StorageTapper, Uber’s internal MySQL binlog reader, to produce change events into Kafka. Downstream, sinks connect to Apache Hive, HDFS, and Apache Pinot. The Chaperone audit system also consumes every Kafka message as part of the observability pipeline.

Special techniques and engineering innovations

uReplicator: custom cross-cluster replication

Uber replaced Kafka MirrorMaker with uReplicator after MirrorMaker caused weekly production outages. The root cause was consumer group rebalancing: whenever a topic was added or changed, MirrorMaker would pause replication for five to ten minutes while rebalancing completed. uReplicator addresses this with Apache Helix for static partition assignment and a DynamicKafkaConsumer that eliminates rebalance-triggered pauses. New topics can be added to replication at runtime without restarting the cluster. uReplicator also applies header-based filters during replication to prevent cyclic data duplication across federated clusters.

Exactly-once ad event processing

Uber’s advertising billing pipeline uses a combination of Flink transactional producers, Kafka read_committed consumer isolation, two-minute checkpoint intervals, and per-record UUIDs to achieve end-to-end exactly-once delivery. Deduplication at the sink layer uses Apache Pinot’s native upsert capability and Hive keyed on the same UUIDs. This pipeline was built specifically to avoid double-counting billable advertising events.

Kafka tiered storage

Uber implemented Kafka tiered storage using the KIP-405 pluggable storage interface. Recent log segments remain on broker disk for low-latency access. Older segments are offloaded to remote storage (HDFS, S3, GCS, or Azure) transparently, without the consumer needing to know which tier a message is fetched from. This decouples storage retention from broker capacity: longer retention periods no longer require adding brokers or running separate data pipelines to external storage.

Kappa architecture backfill via Hive as a streaming source

When a streaming job needs to backfill historical data, replaying that data into Kafka adds significant cluster load and can disrupt real-time consumers. Uber avoids this by modelling Apache Hive as a streaming source in Spark Structured Streaming. The same job code handles both real-time Kafka consumption and historical Hive reads, and windowing semantics are preserved across the two modes.

Dead letter queue topology

Uber’s dead letter queue implementation uses a multi-tier retry topology: a main consumption topic feeds into retry topics with increasing delay intervals, and messages that exhaust retries land in a dead letter topic. Each tier uses Avro schemas and a leaky bucket pattern for flow control. Multiple independent consumer groups can maintain their own retry pipelines against the same underlying topics.

uGroup: consumer group visibility via __consumer_offsets decoding

Standard Kafka consumer group monitoring relies on active consumers reporting metrics. This misses consumer groups that are stopped or failing silently. uGroup is a streaming job that decodes Kafka’s internal __consumer_offsets topic directly, making all consumer group activity visible regardless of consumer state. It also tracks offset state across regions to support disaster recovery failover.

Operating Kafka at scale

End-to-end auditing with Chaperone

Chaperone is Uber’s audit system for Kafka pipelines. It consumes every message across all topics and uses ten-minute tumbling windows to compute count, p99 latency, and duplication metrics at four points in the pipeline: the proxy client, the proxy server, regional brokers, and aggregate brokers. It audits more than 20,000 topics and uses write-ahead logging and UUIDs to ensure that audit records themselves are written with exactly-once semantics. When a discrepancy appears, operators can pinpoint which pipeline tier introduced data loss or duplication.

Consumer group observability with uGroup

uGroup emits lag metrics and stuck-partition alerts for all consumer groups, including those that are not currently running. During a multi-region failover, uGroup provides the offset state mapping needed for consumers to resume from the correct position in the target region.

Schema governance with Schema-Service and Heatpipe

Uber maintains an in-house schema registry called Schema-Service that enforces backward-compatible Avro schema evolution. The Heatpipe library, used by producers, validates messages against the registered schema at ingestion time. This prevents malformed or schema-incompatible data from entering Kafka pipelines.

Topic ownership enforcement

Uber requires ownership metadata at topic creation. Automated tooling infers ownership where possible. This is part of a broader data culture initiative to ensure every dataset - including Kafka topics - has an identifiable owner who can be contacted during an incident.

Centralised upgrades via Consumer Proxy

Before uForwarder, each of Uber’s 1,000+ consumer services maintained its own Kafka client library, often across multiple languages. Upgrading Kafka client versions required coordinating changes across hundreds of services. With uForwarder, the Kafka consumer implementation is centralised in the proxy. Client upgrades happen in the proxy without requiring changes to the services it serves.

Deployment

Uber runs Kafka as a self-managed deployment across its own infrastructure in multiple geographic regions.

Challenges and how they solved them

Challenge	Solution	Outcome
Kafka MirrorMaker caused weekly outages due to rebalancing delays of five to ten minutes	Built uReplicator with Apache Helix for static partition assignment and a DynamicKafkaConsumer	Rebalancing delays eliminated; dynamic topic whitelisting without cluster restart
Scaling partitions: 1 msg/sec throughput required one partition per consumer thread, which would require millions of partitions across 1,000+ services	Consumer Proxy multiplexes a single partition across many service instances via gRPC	Throughput scaled from 1 million to 12 million messages per second without proportional partition growth
Head-of-line blocking from slow or poisoned messages halted entire partitions	Consumer Proxy detects stuck consumers and routes problem messages to DLQ; uForwarder adds active head-of-line blocking resolution	Individual slow messages no longer stall partition processing
Backfilling streaming jobs required replaying data into Kafka, creating significant cluster load	Kappa architecture models Hive as a streaming source in Spark Structured Streaming	Historical backfill runs without additional Kafka load
Ad billing double-counting risk from at-least-once delivery	Flink transactions, Kafka read_committed, per-record UUIDs, and Pinot upserts	End-to-end exactly-once delivery for billing events
Storage and compute were coupled: longer retention required adding broker capacity	Kafka tiered storage (KIP-405) offloads old segments to S3, GCS, or HDFS	Retention extended without adding brokers
Kafka REST Proxy throughput was insufficient at 7,000 QPS per box	Internal performance optimisations	Throughput raised to 45,000 QPS per box
No visibility into consumer groups that weren’t actively running	uGroup decodes __consumer_offsets directly	Full consumer group visibility including offline consumers
Offset mapping during multi-region failover with out-of-order replication	Active/Passive mode with periodic cross-region offset synchronisation and offset mapping tables	Consumers can resume from the correct position after a regional failover

Full tech stack

Category	Tools	Notes
Message broker	Apache Kafka	Self-managed, multi-region
Schema registry	Schema-Service (internal) + Heatpipe	Avro backward compatibility enforced at ingestion
Stream processing	Apache Flink, Apache Samza, Apache Spark Structured Streaming	Flink primary for stateful workloads
Cross-cluster replication	uReplicator (open-source, Uber-built)	Replaces MirrorMaker; Apache Helix for partition assignment
Consumer proxy	uForwarder (open-source, Uber-built)	Push-based; gRPC delivery; 1,000+ services
Connectors / CDC	StorageTapper (MySQL binlog reader, internal)	CDC to Kafka; downstream to Hive and Pinot
Real-time OLAP	Apache Pinot	Native upsert for exactly-once deduplication
Data lake / warehouse	Apache Hadoop, HDFS, Apache Hive	CDC and event archival; also used as streaming source
Upsert storage	Apache Hudi	Incremental CDC updates on HDFS
Tiered storage backends	HDFS, Amazon S3, Google Cloud Storage, Azure	KIP-405 pluggable interface
Monitoring / auditing	Chaperone (internal), uGroup (internal)	End-to-end audit; consumer group lag and DR offset tracking
Cluster coordination	Apache Helix, Apache ZooKeeper	uReplicator partition assignment; cluster state
Serialisation	Apache Avro	Enforced by Heatpipe + Schema-Service
Transport (Consumer Proxy)	gRPC / Protobuf	uForwarder to consumer service endpoints
HTTP producer interface	Kafka REST Proxy (internal)	Optimised to 45,000 QPS per box
Languages	Java, Go, Python, Node.js, C++	Multi-language producer support

Key contributors

Name	Role	Contribution
Chinmay Soman	Software Engineer, Streaming Platform	Led uReplicator design; authored the uReplicator blog post
Yuanchi Ning, Xiang Fu, Hongliang Xu	Streaming Platform engineers	Co-built uReplicator
Xiaobing Li	Software Engineer, Core Infrastructure	Co-authored Chaperone blog post
Ankur Bansal	Senior Software Engineer, Streaming Team	Co-authored Chaperone; presented at Hadoop Summit 2017
Mingmin Chen	Director of Engineering, SSD Team	Hadoop Summit 2017; co-authored DR blog; uGroup
Yupeng Fu	Principal Software Engineer, SSD/Streaming Team	Disaster recovery blog; uGroup; ad events; Kafka Cluster Federation talk
Xiaoman Dong	Senior Software Engineer, Streaming Data	Kafka Cluster Federation talk; uGroup
Ovais Tariq	Sr. Manager, Core Storage	Led DBEvents CDC framework
Amey Chaugule	Senior Software Engineer, Marketplace Experimentation	Authored Kappa Architecture blog
Qichao Chu, George Teo, Haitao Zhang, Zhifeng Chen	Streaming Data Team	Co-authored Consumer Proxy blog; led uForwarder
Jacob Tsafatinos, Yuriy Bondaruk, Yupeng Fu, James Kwon	Ads Platform / Ads Billing	Co-authored exactly-once ad events blog
Ning Xia	Software Engineer, Payments Team	Authored dead letter queue blog
Abhijeet Kumar, Kamal Chandraprakash, Satish Duggana	Kafka Team	Co-authored Kafka tiered storage blog; Satish Duggana is an Apache Kafka committer and PMC member
Roshan Naik, Si Lao	Uber Engineering	Presented Exactly-Once Stream Processing at Scale at Kafka Summit London 2024

Key takeaways for your own Kafka implementation

Replication tooling matters at scale. MirrorMaker’s consumer group rebalancing caused weekly outages for Uber. If you are replicating across clusters or regions, understand how your replication tool handles topic changes and partition rebalancing before it becomes a production problem.
Partition count and consumer concurrency are separate concerns. Uber’s Consumer Proxy approach shows that you do not have to create more partitions to scale consumer throughput. A proxy or multiplexing layer can decouple the two, which is relevant if you have many small consumers or need to avoid partition-count overhead.
Exactly-once requires a coordinated strategy across producers, consumers, and sinks. Uber’s ad billing pipeline combines Flink transactions, Kafka read_committed isolation, and per-record UUIDs with Pinot upserts. No single piece provides the guarantee on its own. If you need exactly-once for a high-stakes pipeline, plan the deduplication strategy at every tier from the start.
Tiered storage changes the broker-sizing conversation. Decoupling log retention from broker disk means you can extend retention without adding brokers. If your current retention policy is constrained by storage cost, tiered storage is worth evaluating before the next round of broker capacity planning.
Centralising consumer infrastructure simplifies client upgrades. Coordinating a Kafka client upgrade across hundreds of services in multiple languages is operationally expensive. If you are managing many consumer services, a shared consumer proxy or library layer reduces the coordination overhead significantly.

Sources and further reading

#	Source
1	Chinmay Soman et al. - uReplicator: Uber Engineering’s Robust Apache Kafka Replicator - Uber Engineering Blog, 2016-08-04
2	Xiaobing Li, Ankur Bansal - Chaperone: Audit Apache Kafka End-to-End - Uber Engineering Blog, 2016-12-08
3	Ankur Bansal, Mingmin Chen - How Uber Scaled Its Real-Time Infrastructure to Trillion Events Per Day - Hadoop Summit, 2017-06-14
4	Ovais Tariq, Nishith Agarwal - DBEvents: Uber’s Ingestion Framework for Database Events - Uber Engineering Blog, 2019-03-14
5	Yupeng Fu, Xiaoman Dong - Kafka Cluster Federation at Uber - Kafka Summit SF 2019
6	Amey Chaugule - Kappa Architecture: Uber’s Approach to Unifying Streaming and Batch - Uber Engineering Blog, 2020-01-23
7	Yupeng Fu, Mingmin Chen - Disaster Recovery for Multi-Region Kafka at Uber - Uber Engineering Blog, 2020-12-21
8	Qichao Chu, George Teo, Haitao Zhang, Zhifeng Chen - Kafka Async Queuing with Consumer Proxy - Uber Engineering Blog, 2021-08-31
9	Jacob Tsafatinos et al. - Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot - Uber Engineering Blog, 2021-09-23
10	Qichao Chu et al. - Introducing uGroup: Uber’s Consumer Management Framework - Uber Engineering Blog, 2021-10-21
11	Abhijeet Kumar et al. - Kafka Tiered Storage at Uber - Uber Engineering Blog, 2024-07-01
12	Zhifeng Chen, Haifeng Chen - Introducing uForwarder - Uber Engineering Blog, 2026-02-05
13	Roshan Naik, Si Lao - Exactly-Once Stream Processing at Scale in Uber - Kafka Summit London 2024
14	Ning Xia - Reliable Reprocessing: Dead Letter Queues for Apache Kafka - Uber Engineering Blog, 2018-02-16

‍

If you want to explore your own Kafka topics and consumer groups with the kind of visibility Uber has built internally, Kpow offers a free 30-day trial and connects to any Kafka cluster in minutes.