Episode 3:

Turning All Data at Grofers into Live Event Streams

August 27, 2020

In this week’s episode of The Data Stack Show, Kostas Pardalis connects with Satyam Krishna, a data engineer at Grofers, India’s largest low-price online supermarket. Grofers boasts a network of more than 5,000 partner stores, a user base with three million iOS and Android app downloads, and an efficient supply chain that allows it to deliver more than 25 million products to customers every month.

Notes:

Satyam offers insights into how he helped build the data engineering function at Grofers, how they developed a robust data stack, how they’re turning production databases into live event streams using Change Data Capture, how Grofers’ internal customers consume data, and how the company made adjustments due to the pandemic.

Topics of discussion included:

  • Satyam moving from a developer to a data engineer (2:43)
  • Describing Grofers’ data stack and data lake (6:41)
  • Who is consuming data inside the company and what are some of their common uses specific to Grofers? (12:03)
  • What are the biggest issues day-to-day as a data engineer? (18:21)
  • COVID’s impact on business practices and the data stack (21:28)
  • The big problem of data discoverability and metadata cataloging (27:44)
  • Completely changing architecture to something that can scale up (33:16)

Satyam leads the consumer engineering team at Grofers and has been with the company for six years. He was the third engineer on staff and initially worked as a mobile engineer, but he shifted to data engineering two years ago, which gave him more of a 360-degree view of the company. “I wanted to look at the product from all angles,” he said about the transition. “I had spent a good enough time building that consumer application, but I wanted to see how the users interact with it and what’s the data around it. That always excited me to look at how we are getting the conversions and how the different metrics are getting tracked.”

With the shift from managing a mobile application to data engineering for internal files, Satyam noticed a completely different challenge. “Once you start building internal tools, you’re building for your stakeholders and you get the feedback much faster (than with the typical consumer feedback loop).”