Episode 163:

Simplifying Real-Time Streaming with David Yaffe and Johnny Graettinger of Estuary

November 8, 2023

This week on The Data Stack Show, Eric and Kostas chat with David Yaffe and Johnny Graettinger, Co-Founders of Estuary, a company building the next generation of real-time data integration solutions. During the episode, David and Johnny discuss streaming technology, the challenges of real-time streaming, and the importance of low latency and high-scale data processing in the ad tech industry. They delve into the reasons behind building Estuary, its unique approach to decoupling storage and compute, and its focus on real-time updates and scalability. They also touch on the complexities of state management in streaming applications, data mesh, the impact of streaming on data processing and orchestration, and more.

Notes:

Highlights from this week’s conversation include:

  • Johnny and David’s background in working together (1:56)
  • The background story of Estuary (4:15)
  • The challenges of ad tech and the need for low latency (5:44)
  • Use cases for moving data at scale (10:35)
  • Real-time data replication methods (11:54)
  • Challenges with Kafka and the birth of Gazette (13:54)
  • Comparing Kafka and Gazette (20:22)
  • The importance of existing streaming tools (22:28)
  • Challenges of managing Kafka and the need for a different approach (23:40)
  • The role of compaction in streaming applications (26:54)
  • The challenge of relaxing state management (34:01)
  • Replication and the problem of data synchronization (36:48)
  • Incremental backfills and a risk-free production database (46:03)
  • Estuary as a platform and connectors (47:45)
  • The challenges of real-time streaming (57:56)
  • Orchestration in real-time streaming (1:00:51)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week, we’ll talk to data engineers, analysts, and data scientists about their experience building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.