Episode 43:

Doing MLOps on Top of Apache Pulsar and Trino with Joshua Odmark of Pandio

June 23, 2021

This week on The Data Stack Show, Eric and Kostas are joined by Joshua Odmark, the co-founder and CTO of Pandio. Pandio is built on Apache Pulsar and is designed to help companies achieve their AI and ML goals.

Notes:

Highlights from this week’s episode:

  • Joshua started his first company at age 15 and later sold two more startups (2:15)
  • Embracing the open source movement and not reinventing the wheel if you don’t have to (12:15)
  • Pulsar seemed built to address Kafka’s weaknesses (17:23)
  • Using Redis as a coordinator for federated learning and taking advantage of its portability (23:05)
  • The pillars of Pandio and some practical use cases (31:24)
  • Feature stores and model versioning (38:23)
  • Seeing Pulsar as the future because of the ability to run tens of millions of topics (41:04)

The Data Stack Show is a weekly podcast powered by RudderStack. Each week, we’ll talk to data engineers, analysts, and data scientists about their experience building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.