In this bonus episode, Eric and Kostas preview their upcoming conversation with Alex Merced of Dremio.
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.co
Eric Dodds 00:05
Welcome to The Data Stack Show prequel, where we replay a snippet from the show we just recorded. Kostas, are you ready to give people a sneak peek? Let’s do it. Well, the thing that struck me was the emphasis on openness, which I guess makes sense for a tool like Dremio, you know, where they need to enable multiple technologies. But a lot of times, you’ll hear technology companies be a lot more opinionated, you know, like, "We are doubling down on this file format because of these really strong convictions." And it was just really interesting to hear Alex say, you know, it probably works best with Parquet, but you should try to query a bunch of other stuff with it, and that’ll work. That may not be, you know, the most ideal experience, but I appreciated that openness, right? And it seems like that’s sort of a core value of the platform, at least as we heard from Alex. And so I thought that was really neat. And honestly, I think it is probably pretty wise of them, even though, you know, obviously, I think a lot of their customers are well served by the Parquet format. But the fact that they seem to be building towards openness, I think, is probably pretty wise for them as a company as well.
Kostas Pardalis 01:23
Yeah, 100%. I mean, I don’t think that you can be in, let’s say, the space of the lakehouse or the data lake without being open, and I think that’s the whole point. That’s how the data lake started as a concept, compared to a data warehouse, where you have the opposite: an architecture that is closed, where you have a central authority that optimizes every decision and has total control over that. And, okay, the data lake is the opposite of that. It’s like, okay, here are all the tools; figure out how to put them together and optimize them for your own use case. Right. So obviously, there are pros and cons there. Yeah. I have to say, though, that openness is a little bit, I think, easier in this industry, primarily because the things that you have to support are not that many. Right, like, if you compare the number of front-end frameworks that we have to how many formats we have for columnar data, you can’t really compare that, right? And there is a reason behind that: it’s a different type of problem. And because there is a more limited, let’s say, set of solutions, it’s also something that’s easier to achieve and maintain. Yeah, but this doesn’t mean that it will cut it, right? If you are going to productize it, that’s one thing, and then there are other things, like the product. So yeah, it’s very interesting. I really want to keep what Alex said about the catalogs and the importance of catalogs, that this year is going to be an important one. He shared a lot about that. And yeah, hopefully we’ll have him again in a couple of months and see how things are progressing, not just for Dremio but for the whole industry in general. We will have him back on.
Eric Dodds 03:37
Thank you again for joining The Data Stack Show. Subscribe if you haven’t, tell a friend, and we will catch you on the next one.