The PRQL: How Does Composability in Data Infrastructure Differ at Different Levels of Abstraction? Featuring Pedro Pedreira of Meta

November 27, 2023

In this bonus episode, Eric and Kostas preview their upcoming conversation with Pedro Pedreira of Meta.

Notes:

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Transcription:

Eric Dodds 00:05
Welcome to The Data Stack Show prequel. This is a short bonus episode where we preview the upcoming show. You’ll get to meet our guests and hear about the topics we’re going to cover. If they’re interesting to you, you can catch the full-length show when it drops on Wednesday. Kostas, this week’s conversation is with Pedro from Meta, and wow, what a lot to talk about. Velox, of course, is a huge topic, along with his work on it and the usage of it inside of Meta, which is fascinating. It’s an execution engine that does a ton of stuff inside of Meta. Pedro’s really an expert in so many different things: databases, the architecture of data infrastructure, so much to talk about. The thing that I’m really interested in is this concept of composable. It has been a marketing term in the data space as it relates to, let’s say, higher-level vendors that you would purchase to handle data in an ingestion pipeline or an egress pipeline, right? But Pedro has a really interesting perspective on this concept of composable at a lower level of data infrastructure, and the execution engine is really a foundation for that. And so I think this can actually help us cut through some of the marketing noise that the vendors are creating with higher-level tooling and help us understand, at the infrastructure level, what does composable mean? So I think that would be an awesome subject to cover.

Kostas Pardalis 01:55
I think you put it in the right way, because I think the difference here, when we’re talking about composability, is what level of abstraction we’re talking about. The vendors you’re talking about, I think they are talking more about composability of, let’s say, features in a way, or of the functionality that the user wants. Composability, when it comes to what we were discussing with Pedro, is a little bit more fundamental and has to do more with how software systems are architected. And I’m sure people that listen to us know the value of being able to build a software system that has some kind of separation of concerns between its modules: you have much more agility and flexibility in building and updating, you can have people that are dedicated to different areas, and in general you have something that scales much more easily in terms of building, right? We’re not talking about processing scalability. Now, traditionally this wasn’t really the case with database systems, though, right? Database systems were kind of big monoliths in a way, and for some very good reasons that have a lot to do with how hard it is to build such a system. So the composability that we’ll be discussing with Pedro is more about that: how there are some new architectures and some new components coming out that actually allow a developer to pick different libraries and build databases, right? Or data processing systems, let’s say, in general.
And that’s a very important thing in the industry, because traditionally, building database systems has been extremely hard, exactly because there was almost zero concept of reusability of libraries or software or whatever. They pretty much had to do everything from scratch, and that made the whole process of building these systems really hard. And also from a vendor point of view, right? So we are entering a new kind of era when it comes to these systems, where with technologies like Arrow, for example, and Velox, we start seeing, let’s say, some fundamental components that you find in every system out there provided as a library, in a way that you can take and integrate and build your own system. Right? So these are the things that we are getting to talk about when we’re talking about composability with him. But there is much, much more, actually. We’re going to talk a lot about some very basic and important concepts when it comes to data processing. And by the way, Pedro has been working for 14 years at Meta in data infrastructure, so he has seen a lot. In these past 10 years, so many things have changed and were built, so we are going to talk a lot about the evolution: things that 10 years ago were innovative and today need to be rethought. And that’s a hint, because he’s going to announce some very interesting things also about some changes and updates to some very important systems out there. So Velox is a very interesting project, but it also involves some amazingly experienced people. We started with Pedro today. Everyone should listen to Pedro, enjoy, and take a glimpse of the future, of what is coming. And hopefully we’re going to have him back, and also more people related to these technologies, to talk more about this stuff in the future.

Eric Dodds 05:45
All right, that’s a wrap for the prequel. The full-length episode will drop Wednesday morning. Subscribe now so you don’t miss it.