Episode 179:

Time Series Data Management and Data Modeling with Tony Wang of Stanford University

February 28, 2024

This week on The Data Stack Show, Eric and Kostas chat with Tony Wang, Graduate Research Assistant (PhD) at Stanford University. During the episode, Tony discusses his journey from China to studying electrical and hardware engineering at MIT, his transition to data processing systems for his Ph.D., and the academic-industry connection. Tony shares insights on cloud data processing, the limitations of academic hardware projects compared to industry giants like NVIDIA, and the potential for software innovation in academia. He also delves into his current research focus on time series data management, the challenges of integrating different data systems, the goal of improving data processing efficiency, the sales aspect of his research, and more. 

Notes:

Highlights from this week’s conversation include:

  • Tony’s background and research focus (3:35)
  • Challenges in academia and industry (6:15)
  • Ph.D. student’s routine (10:47)
  • Academic paper review process (15:26)
  • Aha moments in research (20:05)
  • Academic lab structure (23:09)
  • The decision to move from hardware to data research (24:43)
  • Research focus on time series data management (27:40)
  • Data modeling in time series and OLAP systems (32:01)
  • Issues and potential solutions for the Parquet format (37:32)
  • Role of external indices in Parquet files (42:19)
  • Tony’s open source project (47:11)
  • Final thoughts and takeaways (49:30)

 

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcription:

Eric Dodds 00:03
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com. We have Tony Wang on The Data Stack Show today. Tony, we have a lot to talk about: academia, the data industry, different kinds of selling, and some cool data stuff in general. But we’ll start where we always do. Give us an overview of your background.

Tony Wang 00:47
I’m a PhD student at Stanford University, one of the few people today still studying data systems and databases. Before that, I was at MIT for four years, studying mostly electrical engineering and hardware engineering. Before that, I came to the US from China when I was 16 and went to a private boarding school up in New Hampshire. I love to ski, and I bike a lot, so California has been pretty good for both. It’s one of the rare areas where you can drive for, like, four or five hours, hopefully less, and ski, and also have decent weather year round when you’re not skiing.

Eric Dodds 01:30
Yeah. And it’s a great place for database research, so you check all of your boxes.

Tony Wang 01:37
Parts of California are great for database research, to be sure, yes.

Kostas Pardalis 01:40
Yeah. And that’s one of the reasons I’m really excited to have Tony here today, Eric. I think it’s the first time that we have someone who is actually pursuing a PhD. We have many people who have successfully done their PhDs and started companies, but someone who’s in the middle of the PhD, I think, is a first, so I’m super excited to talk about what it feels like to do that, to do research, and to learn, of course, what the state of the art is right now, what academia is interested in, and most importantly, what the connection is between that and the industry out there. Because there is a continuum, right? The things that happen at universities in areas like databases have an impact out there on the systems that we build tomorrow. Super excited to chat about that. What about you, Tony? What would you like to talk about today?

Tony Wang 02:41
I can talk about that. I can also talk about the stuff I’m working on, and my thoughts on different data processing systems and what I hope will become more popular in the future, from a technical perspective. Although I know that data products are also driven by other aspects that I have less insight into. Yep, sounds good.

Kostas Pardalis 03:08
So what do you think, Eric? Should we go and do it?

Eric Dodds 03:11
Let’s do it. I can’t wait. All right. Well, this is a really exciting episode for us, because you are in the midst of getting your PhD, and I don’t think we’ve had anyone on the show who is actively in a PhD program. So we want to learn all about it. You’re doing some really interesting research on data systems, so let’s just start there. Can you tell us, what is your main area of study and focus? Because you’re close to the end, right? Are you finalizing your thesis?

Tony Wang 03:50
I would hope so. Yeah. So I mostly work on data processing systems, mostly around cloud data processing: quickly processing data in the data lakes that people use today, like Apache Iceberg or Delta Lake, or even just buckets of Parquet files, which is unfortunately still way too common.

Eric Dodds 04:17
Yeah, we had someone on the show recently where the discussion was, when are we going to move on from Parquet? So how did you decide that you wanted to do a PhD? I mean, data lakes are obviously very popular in industry and very widely used, but it’s not every day that you meet someone who’s actually studying them at a PhD level. So how did you end up going down that track?

Tony Wang 04:46
How did I start the PhD, and why did I end up at Stanford in particular? Back when I was trying to decide what I was going to do after college, I mostly worked on hardware, like Verilog and FPGAs and GPUs, low-level CUDA programming. So I applied to some jobs, including at NVIDIA, and then decided, well, maybe I’ll pursue hardware research at Stanford, where some of the best hardware research is being done. Yeah, I turned down my job offer at NVIDIA. In retrospect, maybe that was a

Eric Dodds 05:24
bad decision, especially if they were offering stock options, you know. But,

Tony Wang 05:30
you know, I got my offer in March 2020, when the stock was at its lowest point from COVID. So I took as much money as I personally had and I bought NVIDIA stock, and then I decided to just go do my PhD program. Yeah, but halfway into the first year of my PhD program, I started asking, why am I in academia doing this stuff? I looked back at the people I had talked to at NVIDIA, and I realized that NVIDIA was just going to dominate the hardware industry, and the cool stuff in hardware is in that industry. I think it’s very hard for people in academia to move the needle on the state of the art in the hardware industry.

Eric Dodds 06:15
Oh, interesting. Can you describe why that is a little bit? Just to make sure I’m following: you knew from your work studying hardware that NVIDIA was going to be the big player in the market, and that absolutely was happening?

Tony Wang 06:30
To give some unfiltered opinions, I won’t name anybody, but I talked to people at NVIDIA, I talked to people at AMD, I talked to people at Intel. The people at NVIDIA were truly excited to be there; there was a level of excitement that I could not discern elsewhere. And NVIDIA is really a software-driven company. It’s very hard for a hardware company to actually get a software-driven culture, because at other companies, maybe the company was started by hardware engineers, the founders are hardware engineers, so those people get more say, and software gets neglected and looked down on as something that’s easy. But at NVIDIA, I think it’s really incredible how Jensen and the leadership team have fostered a culture where five out of six engineers are software engineers, and they’ve built this amazing software stack. And that’s what really dooms academic hardware projects. There are many aspects. One is that your project cannot possibly tape out at a very competitive technology node today, like five or seven nanometers or whatever. So you might have an amazing design that works at, say, 28 nanometers, but you might miss problems that would occur if you were trying to do it at a more competitive technology node. The other aspect is the software: you could build some hardware, but to get people to use it, there’s a long way between Python code and your hardware. Now, of course, there’s definitely value in academic hardware research, in designing new hardware architectures and things like that, which might inspire people in industry to pursue certain architectural decisions. But I was more on the side of trying to do something that can actually be used.
And that, unfortunately, is not where I found I should be focusing my time. Yeah,

Eric Dodds 08:40
interesting. And so it sounds like, and maybe I’m drawing the wrong conclusion here, but if academia is not really driving innovation on the hardware side, it does sound like there is a lot of innovation being driven in academia for data processing systems, for example, and that’s why you pursued that path? Well,

Tony Wang 09:03
it’s funny, because in data processing systems there are also very entrenched players, like Snowflake and Oracle. But the barrier to entry for building a system that’s actually useful is, I think, a bit lower in software. You see academic projects like DuckDB, for example, getting huge traction in the industry, and that really started as an academic project. They didn’t have the resources of, say, AWS Redshift or Snowflake; it was just a couple of guys in the Netherlands. And there’s this project called Polars, which is a Rust-based rewrite of pandas, and that was really started by one guy. So one person, with maybe tens of thousands of hours of coding, could really try to displace one of the most popular data analytics libraries out there, pandas, right? That’s a testament to how a single dedicated developer can really move the needle there, in what people use in the real world. Yeah.

Eric Dodds 10:15
Okay. This is so interesting, because I have a million questions, so I promise I won’t steal the mic for the entire episode. But, Tony, what does a typical week look like for you? And I know that’s a difficult question, because it probably changes. You know, I’ve done a lot of post-secondary study, but a lot of our audience probably hasn’t, and so we don’t really know what it’s like to be a PhD student studying data systems. So can you just give us a glimpse into it?

Tony Wang 10:47
I’m very much on the applied side, and I know that for people on the theoretical side, their days are actually a bit different. I wouldn’t actually say it’s that different from working at a regular job, because you show up and you try to program. Well, it’s maybe a bit easier, because you have fewer meetings, calls, and code reviews. Actually, there are no code reviews; you can write whatever you want.

Eric Dodds 11:22
If you wrote your unit tests.

Tony Wang 11:25
Yeah. I mean, most academic projects only work on the five benchmarks in the paper and nothing else, so you just have to get your code into that state. But if you actually want your code to be used elsewhere, it has to go beyond that, and that’s typically not inside the purview of academia.

Eric Dodds 11:45
Yeah, that makes sense. Now, as you mentioned, you’re on the applied side, but there are also people, your peers, who are working more on the theoretical side, which sounds like a spectrum. But you’re both, you know, studying data systems. Can you describe that spectrum to us? What does it look like to be more on the theoretical side?

Tony Wang 12:04
So on the theoretical side, well, when I say the theoretical side, the people who are actually there might think they’re more on the applied side; it’s all a matter of perspective. There are people at Berkeley, for example, working on distributed programming paradigms, like the Hydro project that tries to kind of revolutionize how you do cloud programming and stuff like that. These kinds of paradigm shifts are theoretical work, I would say. And they would probably spend more time working with programming languages, deciding on language specifications, maybe doing some proofs to make sure that things work. You know, the last time I did a proof was in a class at MIT, so that’s seven years ago. Yeah.

Eric Dodds 13:04
Yeah. Now, one thing that really struck me when we were chatting before we hit record was that I had made this assumption that being in industry, for example trying to run a data infrastructure company, is wildly different from what you do. And your response was, well, not really, I still have to do a lot of sales as a PhD student. Can you explain that concept to us? It was just so interesting to hear you talk about that.

Tony Wang 13:43
A lot of a PhD student’s time is spent writing and reviewing and rebutting papers, trying to change your writing or your pitch. And most people will tell you that writing an academic publication is like telling a story, which is not too different from what a lot of salespeople have told me about sales. You have to say how your system has novelty, how your system is better than all the other systems out there and worthy of publication. And there are people who review your papers and tell you whether they think your system has met those goals.

Eric Dodds 14:27
Yeah, yeah, that’s really interesting. Could you describe who the audience is on the other end? In industry, you’re trying to get someone to buy your technology, and it’s similar, but how is the audience different? What are the different audiences in the PhD world for what you do?

Tony Wang 14:49
Yeah, so it varies by discipline. In the machine learning and systems disciplines, when you submit your paper, it typically goes through a review process where it’s assigned to three or four or five other professors, or even graduate students, who are hopefully versed in the research area the paper is purported to be on, whether or not that turns out to be true. There’s double-blind review, where you don’t know who your reviewers are and the reviewers don’t know who you are. There’s single-blind review, where you don’t know who your reviewers are but the reviewers know exactly who you are. I’m not saying one is better than the other. And then there’s open review, where everybody knows who the counterpart is. So the academic review process is this huge scheme that people have been experimenting with over the years. But recently there have been problems, because in all the disciplines there’s been a huge influx of papers. If you look at the number of papers submitted to these conferences over the past 10 or 20 years, it’s just been growing exponentially, so there’s a huge strain on the review process. As a result, a lot of my peers in machine learning, for example, might just get bad reviewers on their papers. A master’s student could be reviewing a professor’s work and just post reviews that are completely incoherent, even at top conferences like NeurIPS. Now, that is obviously a downside, but nobody has figured out how to do better than this kind of review system, so I guess there are a lot of plus sides to it as well.

Eric Dodds 16:43
Yeah, that makes sense.

Kostas Pardalis 16:46
How much?

Eric Dodds 16:48
How do you know the audience that you need to sell to? How much does that influence where you choose to focus your study? Or do you still feel like you have a lot of freedom to just pursue what you’re, you know, what you’re interested in?

Tony Wang 17:02
Oh, absolutely. So in academia there is a culture of novelty: you’re absolutely trying to do something novel that people have not done before. I think this is good, because maybe that’s the point of academia, but it also limits the kind of work that people can do, right? For example, Polars would not be a good academic project. It would be very hard to publish that anywhere, because it’s not really using novel ideas.

Eric Dodds 17:37
You’re just rewriting pandas in Rust. I mean, it’s obviously awesome, very powerful, but when you put it that way, that’s what it is.

Tony Wang 17:46
That’s exactly what the reviewer is going to say to reject the paper. So it kind of limits the scope of the projects that people like me can do, which can be very limiting. But on the other hand, it does encourage very risky ideas that might not have a good practical implementation at this moment. You know, somebody at Redshift or Snowflake might read the paper and be like, hey, I know exactly how to use this, and it actually leads to significant impact in other places, right? Yeah.

Eric Dodds 18:24
Yeah. Just out of curiosity, I know you’ve written several papers. How long does it take you to write a paper that you feel great about submitting for review?

Tony Wang 18:37
Quite a long time. Writing a paper is a time-consuming process, yeah.

Eric Dodds 18:44
So, like, a month? Or, like, nine months?

Tony Wang 18:49
At least a week of intensive writing. Well, hopefully the work you put into building the benchmarks and your actual system takes more than that. Yeah, writing the paper, I think I spend too little time writing my papers, but people will tell you that you can always spend more time on your papers. And if you think about it, that’s actually a weird perspective, right? Because you’re spending all this time on the presentation when you should maybe be writing more unit tests to make sure your system works beyond the five cases in the benchmarks. It’s all a trade-off, and people can make similar arguments about the proportion of time that goes to sales versus engineering in real organizations. Right. So,

Eric Dodds 19:43
yeah, it makes total sense. I’m interested to know, and I know Kostas has a bunch of questions on the technical side, but as you pursued your research throughout the PhD program, have there been any surprising discoveries that you weren’t expecting?

Kostas Pardalis 19:59
An aha moment that you had during your research.

Eric Dodds 20:06
That’s a much better way to put it. Thank you, Kostas. An aha moment.

Kostas Pardalis 20:09
That’s why I’m here.

Eric Dodds 20:11
I just sent

Kostas Pardalis 20:13
that guy, you know, like the aha moment.

Tony Wang 20:18
Yeah. I mean, that’s also a difference between doing applied research and theoretical research. When you’re doing proofs or whatever, back when I still did that, there are definitely aha moments where you’re like, oh, I could just prove it this way or that way. But in applied research, it’s more a sequence of smaller ones. You can kind of see the project in your head, you can see where it’s going, and you have a pretty good understanding of what’s going to come out at the end, and you’re incrementally improving your intermediate steps. I’ll give you an example. When I was working on full-text indexing for Apache Iceberg, for logs, for Parquet files of logs, I definitely had some ideas at the beginning of how you could use a specific kind of index to speed up substring queries on terabytes of Parquet files, with the index being only 1% or 0.1% of the total file size. But then the index has problems, like maybe slow access time. And then gradually you start to look more and more at your index structure, and it just becomes kind of obvious what you should do once you’ve spent enough time looking at the algorithm. So it’s not really an aha moment, because once you’ve looked long enough at the problem, everything just becomes kind of straightforward. And then it becomes kind of hard to present that in papers, because now it seems straightforward.
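The kind of external substring index Tony describes can be sketched with trigrams. This is a minimal, illustrative Python version, not his actual design: the file names and log lines are invented, and a real index over terabytes of Parquet would use a compact on-disk structure rather than an in-memory dict.

```python
from collections import defaultdict

def trigrams(s):
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

# Hypothetical mapping of Parquet "files" to the log lines they hold.
files = {
    "logs-00.parquet": ["GET /index timeout", "user login ok"],
    "logs-01.parquet": ["disk full on node-7", "user logout"],
}

# Build the index: trigram -> set of files containing that trigram.
index = defaultdict(set)
for fname, lines in files.items():
    for line in lines:
        for g in trigrams(line):
            index[g].add(fname)

def candidate_files(substring):
    """Files that *might* contain the substring: intersect the posting
    lists of all its trigrams. No false negatives; surviving files must
    still be scanned to confirm a real match."""
    grams = trigrams(substring)
    if not grams:                  # query shorter than 3 chars
        return set(files)          # must scan everything
    sets = [index.get(g, set()) for g in grams]
    return set.intersection(*sets)

print(candidate_files("timeout"))  # only logs-00.parquet survives
```

Because the index stores only trigram posting lists rather than the text itself, it can stay a small fraction of the data size; the trade-off is that a query prunes the scan rather than answering it outright.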

Eric Dodds 22:10
Yeah, that solution makes total sense. Yeah,

Tony Wang 22:13
So I think there’s an art to selling papers that I have definitely not yet mastered: how to present things that seem straightforward in retrospect in an exciting fashion that caters to people who have not thought a lot about the problem.

Eric Dodds 22:35
Yeah, that’s super interesting. All right, Kostas, I have to hand the mic over, or I’m just gonna keep asking questions.

Kostas Pardalis 22:42
That’s okay. I think the conversation is super interesting, to be honest. So, Tony, okay, let’s talk a little bit about what you’re doing now, and let’s start with Stanford. You’re part of a lab there, I guess; there is a structure in academia, right? So tell us a little bit more about that. What’s the goal of, let’s say, your team or the lab, and how does it work?

Tony Wang 23:09
Academic labs are run very differently; it really depends on the professor. Some professors are very hands-on, and some professors are very hands-off. I have a very hands-off professor, fortunately, so he gives me great freedom in what I can do in my projects. I know other professors who might even write code for students’ projects, or tell the student exactly what to do in their projects. My professor is not like that, at least. So in my lab, different people might be working on different things that they find interesting, potentially with different industry partners. Some people in my lab are working with NVIDIA; I work with some other industry partners that are trying to use my stuff. So really, it’s driven by what projects you are interested in.

Kostas Pardalis 24:02
Yeah. Is there always a connection with the industry out there?

Tony Wang 24:06
No, you don’t have to. You don’t have to work on something that’s going to be useful to industry right now.

Kostas Pardalis 24:15
So what is the value that the industry brings to you as someone who’s doing academic work? Well,

Tony Wang 24:21
it kind of helps keep you grounded in real problems. You might have this awesome idea for how to do something that people don’t really care about, and then it’s hard to justify going through all the motions of writing the paper if the system I’m going to build is not useful.

Kostas Pardalis 24:43
Yeah, that makes sense. So, okay, tell us a little bit more about what you’re doing now. I mean, you said you were at MIT, you were more into hardware, and somehow you ended up doing research around things like data processing and data storage. You’ll tell us more about that, but first of all, how did you make the decision to move from hardware to, let’s say, what we can do with the hardware we already have?

Tony Wang 25:10
Yeah, so, like I said earlier, I think it’s very hard to move the needle working in hardware, and it’s easier to build real systems that can provide real value to actual people if you’re doing some kind of software research.

Kostas Pardalis 25:30
But, okay, but why data? Like why? Like,

Tony Wang 25:34
it’s fun,

Kostas Pardalis 25:37
Sure, and we can mean many things when we say data. But why what you’re doing now, compared to, I don’t know, training models or doing AI or whatever else?

Tony Wang 25:49
So I used to work on speeding up natural language processing models. In the first year of my PhD program, I took a leave and tried to start a startup, and I talked to hundreds of potential customers, directors of machine learning and data science, pitching that I could make their models five to ten percent faster by speeding up matrix multiplies. I had some code that beat Intel MKL, which is Intel’s library for multiplying matrices, by five to ten percent on some matrix sizes, and I was extremely proud of that as an academic achievement. But then I talked to these guys, and they were like, yeah, you know, the slowest part of our inference is getting the metadata from DynamoDB. That takes 200 milliseconds or something like that, whereas the matrix multiply in TensorFlow takes, what, microseconds, if you’re doing it right. So that was a really eye-opening experience, and it really forced me to talk to potential customers to understand use cases before starting to work on research projects today. That was kind of the starting point of why I wanted to go into data: I thought, maybe I can help people store and process data more efficiently. And I just gradually found data more interesting, and that’s where I spend most of my time today.

Kostas Pardalis 27:25
Yeah, so okay, we found an aha moment. Here, I think, right? I guess.

Tony Wang 27:30
Yes. Okay.

Kostas Pardalis 27:33
So tell us more about what you’re doing today. Right? What’s your focus in your research?

Tony Wang 27:40
Well, I’m mostly focused on time series data management. So, to take a step back: I think for business data and customer data and generic data management, people are moving to Parquet files and Delta Lake and Iceberg, whatever, and they work really well. You’re able to build all kinds of differentiated applications and dashboards on top of the same data layer. In time series data management, that is still not the case. People are using Prometheus with some scaling solution like Loki, or Elasticsearch with, you know, UltraWarm or a cold tier or whatever you call it to spill to S3, and then maybe some other completely different system to manage their traces. I just think that we could probably make Apache Iceberg and Delta Lake work for these time series monitoring use cases, and store metrics and logs at high scale while still being able to do the things that Elasticsearch can do. Now, there are a lot of promising recent projects, like Quickwit, for example, that claim huge performance benefits over Elasticsearch. But the problem is still the nonstandard storage format. I really want to be able to store things like logs and metrics in Parquet files and Apache Iceberg and still be able to power the use cases that people might want from Prometheus and Elasticsearch.

Kostas Pardalis 29:10
Okay, and why is there this divergence between these two data-related, let’s say, problems? Why did we end up having systems that are so different between the two, the Parquet world on one side with the OLAP systems, and then the time series systems like Prometheus and the rest you talked about? Why did we end up in this reality?

Tony Wang 29:42
So I guess because, first of all, the Parquet world cannot efficiently support the use cases of Prometheus and Elasticsearch. For example, if you store all your logs in Parquet files and you try to do a substring query or some kind of text search, there is no other way than to scan all your logs and run a regex in Spark or whatever. And that is horribly inefficient compared to Elasticsearch, where there is an inverted index that can answer that question in milliseconds. Now, for Prometheus, I think it’s more an issue of data modeling. In Prometheus you have the notion of time series, and time series are just tagged. If you try to store those in Parquet files, it’s not clear how you can have the Prometheus data model translate over to the tabular data model of the Parquet world. What would the columns be? What would the columns be clustered by? How do you get the kind of performance Prometheus can have? And of course, we’re just talking about data models; there’s also this big component of real time. Prometheus and Elasticsearch were invented as, and are probably still largely used as, real-time systems where freshly ingested data can be used in real time. How does that translate over to the Parquet world? Maybe you have some ClickHouse instance that is running, and it spills to Iceberg or Delta for longer-term storage, or something like that. But I do believe that there’s got to be a bridge there, so that you should be able to do things like run SQL across all your business data as well as your telemetry data, and join those sources to debug your issues, things like that.

Kostas Pardalis 31:42
Okay, let’s talk a little bit more about the data modeling part. You mentioned how data is modeled in the time series world. Why is it different compared to what you do in an OLAP system with tabular data?

Tony Wang 32:01
Well, think about data modeling. Prometheus really is a system where time series chunks are tagged by string tags, and you think about your data as these chunks, where given a tag you can quickly access a particular chunk. Now, translating that to the tabular world, you could think, maybe I have a couple of columns: one column would be a timestamp, another column would be the tag, and another column would be the value. You could do it like this. But then, what should you sort your table by? If you sort your table by the timestamp, you would have good ingest performance, because new data would just be appends, but quickly retrieving all the data corresponding to a particular tag would be very slow. So maybe you should sort your data by tag instead, right? But then ingest becomes a problem, because your new data gets scattered as super small files over a bunch of different partitions. So what are you going to do? Ultimately, I think the Prometheus data model can be implemented on top of Parquet files, and in fact I’ve done that as part of my research projects and internships. I do believe it’s possible to do this with a particularly good tabular data model and maybe some external indices, too. Yeah. Okay.
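The sort-order trade-off Tony walks through can be sketched with a toy table. The tags and values here are invented for illustration, and in a real lake each layout would be split across many Parquet files rather than one in-memory list:

```python
import bisect

# Toy (timestamp, tag, value) rows, e.g. one per scraped metric sample.
rows = [
    (1, "pod-a", 0.5), (1, "pod-b", 0.7),
    (2, "pod-a", 0.6), (2, "pod-c", 0.9),
    (3, "pod-b", 0.8),
]

# Layout 1: sorted by timestamp. Ingest is cheap (new samples append
# at the end), but fetching one tag's series scans every row.
by_time = sorted(rows)

def series_scan(table, tag):
    return [(ts, v) for ts, t, v in table if t == tag]  # O(n) scan

# Layout 2: sorted by (tag, timestamp). One tag's series is a
# contiguous range found by binary search, but each new batch of
# samples now scatters across the whole sort order on ingest.
by_tag = sorted(rows, key=lambda r: (r[1], r[0]))
tag_keys = [t for _, t, _ in by_tag]

def series_seek(tag):
    lo = bisect.bisect_left(tag_keys, tag)    # O(log n) seek
    hi = bisect.bisect_right(tag_keys, tag)
    return [(ts, v) for ts, t, v in by_tag[lo:hi]]

print(series_scan(by_time, "pod-a"))  # [(1, 0.5), (2, 0.6)]
print(series_seek("pod-a"))           # same answer, without a full scan
```

The same tension shows up at file granularity: a time-sorted lake appends cleanly but forces per-tag reads to touch every file, while a tag-clustered lake makes per-tag reads cheap at the cost of fragmenting each new write.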

Kostas Pardalis 33:35
That's interesting. So how does Prometheus solve that? Is it a storage problem at the end, like how you store the data, or a lack of indexing, let's say, in the OLAP world? Because, okay, in traditional OLAP systems you can think of, let's say, partitioning or bucketing and stuff like that, which is like a lightweight version of an index, maybe, because you consider what the workload looks like and try to change the layout to make it faster. But we don't have indexes like in traditional OLTP systems, right? So from your point of view, what is causing the problem here?

Tony Wang 34:23
So I think Prometheus is an integrated system. You know, it integrates the real-time part, like how it gets the real-time data, separates it somehow into these chunks, and then writes these chunks to backing storage. But in the Parquet world, you've got to start piecing together different systems and things like that. That's the first thing. And second, what you said about indexing is very interesting, because typically these tags are high cardinality. So take databases like M3DB, maybe. There's a great talk by Rob from M3DB that talks about the kind of inverted index, the FSTs, finite state transducers, that they build on the tags to quickly allow retrieval of a particular tag. So this is the problem: if you have a billion Kubernetes pod names that are your tags, how do you actually quickly look up where a particular tag and its corresponding chunks are stored? In integrated systems like M3DB, they can have an inverted index, similar to Elasticsearch, that tells you exactly where all the time chunks for a particular key are stored. Whereas in a Parquet file, if you have a column with a billion potential values, even if they're clustered together, it's pretty hard, with no external indices, to figure out which Parquet file your tags are located in without scanning all the headers and footers of all the Parquet files in your data lake or whatever. And if your data happens to be sorted by time, and this tag is actually scattered across all of the files, you can forget about doing this efficiently. Right. But this is also not the end of the world: you can definitely build things on top of these Parquet files that can perform similar functionality to what an inverted index could do, to speed up this process. Right.
And that's actually a lot along the lines of what I'm doing right now for my research.

Kostas Pardalis 36:38
Okay, so from what I understand, and correct me if I'm wrong here, the solution that you see is not about going and fundamentally changing the format itself, like Parquet, right? It's more about what we can build around Parquet, in terms of metadata and indexes that we can build and implement there, to actually bring Parquet closer to what, let's say, these inverted-index systems like Elasticsearch do, right? Is this correct? Yeah, yeah. That's interesting. So okay, I have a question. There's actually a lot of conversation lately out there about Parquet, let's say, showing its age, right? Parquet was created like 2008, 2009, I don't know exactly when, but it was at least 10 years ago, right? Very different use cases back then. I mean, obviously, the format was designed primarily for traditional OLAP data warehousing use cases, which have very different latency requirements, and even the hardware was so different back then, and all that stuff. So people, especially driven by this conversation and by the needs of ML use cases, have started talking about the need for, let's say, upgrading, updating, or maybe substituting Parquet. And there are companies out there that have built stuff, right? You have Meta, of course, with this Alpha format, I think it's called; then you have the work from Google, if I remember correctly, with the Procella system that came out of YouTube, where there's a lot of stuff about how we can complement or change the way that we store data compared to Parquet. And of course, there are also other systems out there, like LanceDB, for example, right? They have their own format, and they're trying to accommodate more, let's say, the use cases around ML.
So how does the stuff you're working on fit into this? The industry in a way is pushing for new formats. They actually want to go pretty low in the stack, down to the storage layer, and rethink, let's say, the format that we are using there.

Tony Wang 39:18
Let me ask you a very simple question. CSVs are horrible in terms of efficiency. But just yesterday, I downloaded data from a GitHub repo from Alibaba, and the data format was CSV. The Alibaba people know about better data formats, of course. But do they expect their users to? It's a question, right?

Kostas Pardalis 39:45
No, I hear you, I hear you. And actually, to be honest, I find your answer extremely interesting for someone who's coming from a PhD, because your approach is much more pragmatic and product-oriented than research-oriented, right? And 100%, CSV is there, it's not going away anytime soon, right? We will still, like, struggle with it. So I get what you're saying, and I think it makes total sense: it is important for building, let's say, a system that you can take out there in the market and actually deliver value. But how can you defend that in research? How do you publish a paper on that? Because this goes back to the conversation at the beginning about novelty,

Eric Dodds 40:35
right? Well, I

Tony Wang 40:38
I mean, hopefully the debates around my research will be around these external indices, which are definitely not the Parquet format, and how they can speed up these queries on Parquet files. Right, yeah, and Parquet is actually not that bad if you know how to use it properly. Parquet gives you huge flexibility in how you can lay out your data. For example, if you want random access into a column, people think it's impossible, but you can just keep the column plain-encoded, and then you can just randomly access the bytes. You can change the row group sizes to efficiently retrieve smaller chunks of your data. You can change the number of columns you're going to put in the table, along with the row group size, to tune the file size. You can change the encoding of the columns; you can even use custom encoding algorithms to encode your columns before you put them into Parquet. There are so many things that you can do with Parquet that can improve its performance. Now, it is a question of whether these things are supported by higher-level frameworks, like Iceberg or Delta, which have a very opinionated way of how you should be managing these Parquet files for OLAP workloads. Right. So if anything, I think Iceberg and Delta should be more flexible in allowing people to tune their own ways of using their Parquet files, rather than us changing the Parquet file format itself. That's what I think.

Kostas Pardalis 42:13
I love that. I really like that. Okay, so let's look a little bit more at the indexing that you're talking about here, right? Like, the path of your research. So when you're talking about external indices, what do they look like? What is the use case? Because, okay, you can index for many different reasons and with many different algorithms and all that stuff. So what are you trying to do here with these indices?

Tony Wang 42:37
So, you know, Postgres has all these kinds of indices that allow a regular Postgres database to do wonderful things. You can have a JSONB column type and build a GIN, a generic inverted index, on it, and then you can suddenly do things like JSON path matching, keyword search, and all kinds of amazing things. Right. So I look at Parquet the same way: the Parquet is your data, and instead of Postgres pages or whatever, you've got Parquet pages. So you should be able to build these indices, which do not have to look like Parquet files, that you can efficiently access at query time and that tell you what Parquet pages to read to get your data. Right. Yeah. A higher level of that would be what row groups to fetch. For example, if you've got a column in a Parquet file that's composed of log messages, you should be able to build a text index on that column that is a lot smaller than the column itself in terms of storage footprint, that still supports efficient access from S3, and that will quickly tell you which row groups in all your Parquet files contain the keyword you're searching for. Yep. Similarly, you should be able to build an index on a JSON type in your Parquet, like, keep your JSON as a string in Parquet, and build an index on it that, for example, allows you to do Snowflake variant-type querying, yeah, without Snowflake, right? So you should be able to do all these things, but you just can't. So that's what I'm working on.

Kostas Pardalis 44:10
Okay, so let's say the storage there remains intact, like Parquet, and then you bring these new layers on top of it, where you can create these indices and the associated metadata to access the data very efficiently. How do you then connect that with the higher levels of the stack, right, like the query engines? Because, as you said, people are already using stuff, and they shouldn't have to move away from that stuff, and I agree with you. So you might have something like Trino there, or you might have something like Spark, or whatever. How do you expose these indices in a way that can be, let's say, exploited by these query engines, without having to rewrite the query engine?

Tony Wang 45:00
You don't have to rewrite the query engine. Of course, you could rewrite the query engine and integrate these indices, but you don't have to. And this is interesting, because a lot of query engines today look at some metadata before they even do your query. For example, you know, BigQuery will tell you how much it thinks a query is going to cost you before you execute it. So in the same way, the query engine could query this index, rewrite your query in a good way, and dramatically reduce the cost. For example, if you have your text-based inverted index, and you're using Spark or Trino as your query engine, you could query the index first to translate your text query, which would require the query engine to read the entire text column, into a very selective predicate on maybe the timestamp, right? So instead of running a query that's like select star where log like '%ERROR 12345%', you run a query that's like select star where timestamp between x and y, and that's a very small range, and that's provided by the index. Okay, so you keep your query engine, but you're rewriting your query, right? Sure. But

Kostas Pardalis 46:17
someone has to rewrite the query, right.

Tony Wang 46:22
Like, it can be part of a client library for the index. Okay.

Kostas Pardalis 46:26
Okay. Okay. And then someone has to integrate that as part of the optimizer, for example in Trino, to do that, no? So

Tony Wang 46:33
it does not change the Trino optimizer, because it just translates very expensive predicates into very cheap predicates. And the Trino optimizer already knows how to do predicate pushdown and all that stuff with very selective, you know, timestamp-based filtering, right?
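The rewrite Tony describes can be sketched as follows, assuming a hypothetical external index that maps keywords to timestamp ranges (the function, table, and index contents here are all made up for illustration):

```python
def rewrite(keyword: str, index: dict) -> str:
    """Replace an expensive text-search predicate with a cheap timestamp
    range looked up in a (hypothetical) external index, so an unmodified
    engine like Trino or Spark only sees a pushdown-friendly filter."""
    lo, hi = index[keyword]
    return f"SELECT * FROM logs WHERE ts BETWEEN {lo} AND {hi}"


# Instead of: SELECT * FROM logs WHERE msg LIKE '%ERROR 12345%'
toy_index = {"ERROR 12345": (1700000000, 1700000060)}
print(rewrite("ERROR 12345", toy_index))
```

The engine itself is untouched: the client library consults the index and hands the engine a query it already knows how to optimize with predicate pushdown.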

Kostas Pardalis 46:49
Okay. Okay. Sounds good. We are close to the end here. I think we could keep talking for a couple more hours, and I think we should do it in the future; we have a lot to talk about here. But you also have an open source project out there, Quokka, right? Tell us a little bit more about Quokka. What is it, and why would people be interested in it?

Tony Wang 47:15
So I started Quokka trying to bring fault tolerance to a streaming-based query engine like Trino. It's actually a query engine; I wrote the logical and physical plan optimizers in Python. And it is faster than Spark on EMR by like two, three times. But I hit some bottlenecks in trying to actually support SQL. It is very hard today to support SQL in your query engine, and there are a lot of efforts out there to do that. In fact, I think just the other day somebody was proposing a generic pluggable logical plan that is optimized based on DataFusion, which would be good if it works. But yeah, I am trying to, you know, integrate some of my newer research into Quokka, and hopefully Quokka can be, you know, the first query engine that's natively integrated with these indices I'm building on top of Parquet files.

Kostas Pardalis 48:15
Okay, that’s awesome. Eric, I’ll give the microphone back to you. Because we can keep talking forever here. But I think I should give you a little bit more time to ask any questions that you might have.

Eric Dodds 48:27
Yeah, well, I think we’re right at the end. Actually, you know, one thing I’ve been thinking about throughout this whole conversation, Tony, is, what are you interested in doing after you’re done with your PhD? I mean, you’re obviously on the applied side. So have you thought much about that? Yeah,

Tony Wang 48:45
I mean, I might be interested in doing a startup, if I can figure out what to do. It's kind of hard to start a startup these days. It's always been hard, I guess. But yeah, or I might work someplace. There are a lot of very cool companies today working on these, like, observability tools and things like that. So yeah,

Eric Dodds 49:14
very cool. Well, if you end up starting a company or when you go work at a company, we’d love to have you back on. Tell us about finishing the PhD and going into industry. Yeah, I’m

Tony Wang 49:26
mostly focused on trying to graduate right now.

Eric Dodds 49:30
Sounds good. All right. Well, Tony, it's been such a good show. We learned so much. And good luck on selling to your audience here in the final stretch.

Tony Wang 49:39
Yes, yes. Well, if I made the sale, then they've already decided to buy, right? That's it. I'm sure that you understand how difficult it is.

Eric Dodds 49:53
Yeah, closing the deal. Awesome. Well, best of luck and keep us posted.

Tony Wang 49:59
All right. Thank you very much for your time.

Eric Dodds 50:03
We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe to your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.