The PRQL: Are Marketers the Worst Data Quality Offenders?

October 7, 2022

In this bonus episode, Eric and Kostas preview their upcoming conversation with Gleb Mezhanskiy of Datafold.

Notes:

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.co

Transcription:

Eric Dodds 00:05
Welcome to The Data Stack Show prequel, where we talk about the show that we just recorded to give you a teaser. Kostas, this was a super interesting one, talking about data quality. I guess the first question I have is actually related to an accusation that you made, which is that marketers are generally one of the root causes of data quality issues in the data stack. What say you? Yeah, dude, like,

Kostas Pardalis 00:28
just think about all the tools and products that have been built out there. It’s not like that was quite like the needle. Marketers, like, let’s say, is everything like,

00:40
you can never satisfy us. And yeah, and you cannot get satisfied. So I just speak the truth, you know? No, it’s

Eric Dodds 00:51
so true. It is so true. I do think Gleb made a really good point, though, in that he was really worried about data being messy, right? You know, especially things like event data, which marketing relies on super heavily. And one of the concepts that he talked about, I think, was really helpful. For any one of our listeners who’s interested in data quality, you’re going to want to catch the show to hear Gleb talk about classifying data quality issues in terms of, like, “we messed something up” or “they messed something up,” which I thought was super interesting. “They messed something up” are things out of your control, right? So you have, like, a scheduling challenge in your orchestration tool, or something along those lines, right, which is sort of a “they” problem. And I think Gleb did a really good job of articulating why a lot of the issues are actually “we” problems, like we made some sort of mistake, you know, whether that was not doing a good enough job on QA or building an extensible enough system or whatever. And that really informs, I think, the way that he has built Datafold as a tool. So that really stuck out to me, but what do you think?

Kostas Pardalis 02:03
Yeah, absolutely. I think, like, it’s a really interesting framework to use in order to capture the complexity of such a complex problem like data quality, because you need, like, some kind of model and framework to tackle the problem of data quality. Okay, maybe it’s a little bit dangerous in some environments; the “we” and “them” situation might get older, wiser. But it definitely worked well for him, who’s building a product and a company. And he needs to create, let’s say, at the end, a product experience that is going to be consumed by very different people, from data engineers to people who are consuming the data at the end. And probably they’re not technical at all, right? But they are part of the problem and part of the solution to data quality. So yeah, it makes a lot of sense. I don’t know, I really enjoyed the conversation. I think, for me, it also gives more joy, the fact that they have a very developer-focused approach to the products that they’re building. So that’s something that I find very interesting. Yep. I mean, at the end, yeah, like, data quality is very equal to face. We have had quite a few vendors; probably we brought most of them here.

Eric Dodds 03:34
Yeah. A lot of the big ones. Yeah.

Kostas Pardalis 03:38
And I think we should do another iteration in a couple of months. Like, bring all of them again, and with each one of them.

Eric Dodds 03:48
Yeah, absolutely. Well, you’re definitely going to want to catch this one. It’s super interesting if you’re working on anything related to data quality. If you don’t like it, don’t worry, that’s a “we” problem, not a “they” problem. And of course, we will catch you on the next show. Thanks for joining us, and subscribe if you haven’t.