The PRQL: Kaskada Serving as a Recommendation Engine with Davor Bonaci of DataStax

May 29, 2023

In this bonus episode, Eric and Kostas preview their upcoming conversation with Davor Bonaci of DataStax.

Notes:

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.co

Transcription:

Eric Dodds 00:05
Welcome to The Data Stack Show prequel, where we replay a snippet from the show we just recorded. Kostas, are you ready to give people a sneak peek? Let’s do it. A fascinating conversation with Davor of Kaskada, which was acquired by DataStax. It’s a really interesting story about what they envision in terms of Kaskada being integrated into DataStax, which, you know, sort of operates a lot of stuff on top of Cassandra. So lots of cool stuff there, I think, for the future. But Kaskada is also open source. And it does a lot of interesting things in terms of making it easier to not only discover interesting potential features and datasets, but also deliver those and serve those, which is really interesting. One of the things that I thought was fascinating about this conversation was the decision to essentially create a new language as part of the system, because the system in and of itself is capable of doing some really interesting, cool things. But they chose to write a language that, and this is probably a really bad way to describe it, is almost a mix between SQL and Python, right? It’s declarative, but it has the flavor of Python, which I thought was fascinating. And so it really does seem like they’re kind of meeting in the middle of these two worlds, the operational side and more of the statistical side. So, I don’t know, that was a fascinating approach. I’m certainly going to be thinking about that one. What stuck out to you?

Kostas Pardalis 01:55
Yeah, 100 percent. I think there are two things I’d keep from this conversation. One has to do with building the technology itself: how big of a problem it is, and why it’s not something that can be solved by just stitching together technologies. You really need to start thinking in first principles and build a new system, in a way, right? That’s one thing. But that’s, let’s say, the bread and butter of innovation in technology, right? What I felt was extremely interesting is how important the user experience also is, and that’s the connection with what you were saying about the language. The reason they ended up building a new language is that they were trying to figure out the right way for their users, in this case ML engineers, to interact and work with the data. And somehow that guardrailed them into figuring out what’s the signal out of all the noise out there, right? And exactly as you said, they had to find the good things from all the different paradigms that are out there and put them together in a way that feels native to their user, which is the ML engineer, right? And the ML engineer, yeah, lives in Python land. They use Python; you cannot change that. All the libraries are in Python. No matter how they work with the data, when they have to do some processing with the data, Python will be needed. So it is important to build the right experiences there. And we see that the need for these experiences also drives new innovation, like building new languages on top of the processing systems that we have.
So that’s something that I think we will see more and more of in the infrastructure space as we try to democratize access to all these technologies, which is probably something that will get accelerated even further because of all the recent developments with AI, ML, and all that stuff. So yeah, I’m looking forward to talking with him again and seeing what comes out from putting Cassandra and Kaskada together.

Eric Dodds 04:39
Absolutely. Well, another good one in the books. Thanks for listening to The Data Stack Show, as always, and we will catch you on the next one.