Episode 49:

MLOps – The Finalization of the Data Stack with Ben Rogojan of Facebook

August 18, 2021

Kicking off season three of The Data Stack Show, Kostas and Eric converse with Ben Rogojan, perhaps better known as Seattle Data Guy. In addition to being a data engineer at Facebook, Ben is also a consultant for a number of smaller organizations, and he shares his insights on the unique issues data engineers face at companies of such vastly different scale.

Notes:

  • Ben’s background and his shift to data engineering (2:19)
  • Trends in the data space: finding the most efficient tools, the Snowflake phenomenon, and keeping up with new functionalities (5:33)
  • Key differences in data practices between small companies and Facebook-sized companies (12:38)
  • Having to build tools specifically designed for Facebook because of SaaS product limitations (16:00)
  • Team structure at Facebook (18:17)
  • Developing more robust systems that are resistant to pipeline failure (19:50)
  • Defining data stacks (24:01)
  • A sample data stack for a young company (28:37)
  • Why Redshift and Snowflake have trended in the opposite direction (33:02)
  • BigQuery and Snowflake comparisons (36:06)
  • MLOps and whose responsibility it is (39:12)
  • Feast, Tecton, and feature stores (45:40)
  • Having a good community around an open-source product (49:30)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.