This week on The Data Stack Show, Eric and John welcome back Lewis Dawson from Momentum Consulting for the third part of this conversation on data attribution. In this episode, the group talks about the intricacies of machine learning (ML) and its business applications, particularly focusing on customer lifetime value (CLV) prediction. They discuss the challenges of implementing ML models, emphasizing the necessity of having the right data and team. The conversation highlights the importance of simplicity in modeling and understanding the limitations of data-driven approaches. They also explore the concept of diminishing returns in marketing channels and the potential of AI in optimizing advertising strategies, while acknowledging the complexities involved.
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
John Wessel 00:03
Welcome to The Data Stack Show. The Data Stack Show is a podcast where we talk about the technical, business and human challenges involved in data work.
Eric Dodds 00:13
Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Okay, let’s talk about machine learning. You know, it should probably lead into AI. And I want to frame this conversation... well, actually, let’s talk about machine learning. And then I do want to ask about, you know, when should you start to apply some of these more advanced techniques? But how does machine learning enter into the picture here?
Lew Dawson 00:50
Yeah, and I definitely have a little bit less experience in this one, but you can use machine learning algorithms to, like, dynamically weight, or change weights, or choose an optimal weight, depending on what you train it on. That’s what I have seen done. I haven’t done that a lot, though, so I’ll fully admit, like, if you have additional insights, fire away. Yeah, I haven’t
John Wessel 01:15
done it. I haven’t done it a lot either, I think. But one thought, actually, back to the customer lifetime value thing: I have a friend that’s really deep on that particular topic of, hey, let’s take this data, predict customer lifetime value, and then, you know, essentially go back and think about, like, which conversions are worth more to us than others based on those attributes. Yep. So that’s less about modeling, I guess, but it can impact the modeling. Like, okay, well, it’s fascinating, really. Like, okay, look, this pattern, yep, results in higher customer lifetime value, therefore we want to model based on that pattern. I think that’s a phenomenal application of it, yep, but I haven’t seen it in the wild much. Yeah, I’ve
Lew Dawson 01:58
never seen that one, like, yeah, I know the theory on it. Like, I tried to play around with it once, but it seemed so complicated and so advanced. Yeah, getting back to the point, it’s like, right, it’s a cool theory from my perspective. But I’ve never, like,
Eric Dodds 02:12
right, yeah, right, yeah. I’ll tell you my experience on this. I mean, you know that we’ve done internal testing around, you know, running Markov chain analysis and other things like that, just because, you know, I’m the type of person who generates synthetic
John Wessel 02:27
events, yeah. But Markov
Lew Dawson 02:29
chains, I love it. Markov chains are fun, cool.
Eric Dodds 02:32
But generally, when a business is really using machine learning to, like you said, sort of dynamically assign weights and other things like that, they’re generally spending a lot of money, to the point where, if they can squeeze more optimization out of it using a very advanced machine learning technique, it’s worth it, because of the amount of money that they’re spending and because they have exhausted, you know, most of the easy optimization opportunities. But the flip side of that, as I look back on the entire conversation, like several hours of conversation we’ve had up to this point, is that it is a significant investment to get to the point where you have, like, enough of the right data and a good enough understanding of your business. And the other thing I would add is another big one: you need some level of stability in the business model that you want to optimize for that to be worth it, right? And so that generally is happening at a scale where there’s a lot of money, it’s a fairly large business, and a lot of times, what’s happening in those situations is an agency, or, you know, software, will have a proprietary model, or they’ll do media mix modeling or whatever, you know, and so they are essentially purchasing that capability from a vendor, right? Because a lot of times, agencies are managing a lot of the different campaigns and other things at that point. So that was a very long way of saying, I agree. I haven’t actually seen a ton of it in the wild, hand-rolled. It certainly exists, yeah. But the conditions for that to be worth it, I think, are, you know, a smaller it
John Wessel 04:29
is a weird situation where the conditions to be worth it are... One thing you need is, like, you need the amount of money being spent through advertising to be a certain amount. You need, like you said, the data. But you also need the people to stay long enough and to have the right skills in the right spots. Yeah, and like, as you go up in business, as teams are more and more segregated, it’s potentially less likely that you would have all the right skills and the right people in the right spot to do this internally. Therefore, like, I agree. I think the most likely scenario is to get to a budget that’s high enough with some kind of, like, agency group that can do this. And then you’re providing data, yes, for it to be done, essentially, yeah.
Lew Dawson 05:09
And then, I mean, on top of that, you’d also have to, number one, prove that it’s that much more valuable, in my opinion, to go and do it. And then number two, like, how would you realistically prove that? And then number three, is it like picking up pennies in front of a steamroller? And number four, lastly, would you be biasing, would you be biased, you know? So would you basically be overfitting your model at that point, right? Like, is it even worth it? So, like, I think it’d be a big challenge to write a generic enough model that’s accurate for an agency across a wide array of data sets, and then also accurately predict and attribute for, like, new data, yes, right, without becoming not good enough or too overfit. Yeah. So, yep, on this one, in short, I’m not trying to knock anyone out there. I’m sure people actually have done it. Super clearly, I can only think of one company I’ve ever been associated with where it might have been beneficial, and even then, I probably would have been like, I don’t think it’s probably worth it. And number two, like, I just don’t see it as being worth the amount you’d spend, kind of like John was alluding to. Like, you’d spend a lot of time and resources writing something that I would argue is probably not beneficial in most cases. But nope, I’ve also not seen everything, so
Eric Dodds 06:38
yeah, and we kind of answered, I think, within that, like, when do you use advanced techniques, right? And I think you just, you know, comparing the amount of effort and scoping it well, going back to your point, we have the question of people and what you’re actually trying to measure. You know, I advocate for using the simplest technique possible to answer the question, right? Because you can always add complexity. But it’s really about answering a question to help, you know, to help the business. And the faster you can do that, the better.
John Wessel 07:13
Well, you can always add complexity more easily than you can reduce complexity. Yes,
Eric Dodds 07:17
yeah, it’s like distilling complex
Lew Dawson 07:19
problems down? Yeah, it’s almost always more complex. And the other thing that comes to mind just now, and then we can move on super quick, is, like, you think the infighting over linear is bad? What if it’s a black box? How bad do you think the infighting would be with a black box? We don’t really know. The algorithm just told us, like, you get more credit. Great points. Yeah, well,
Eric Dodds 07:42
In fact, that was very, yeah, that’s a very astute observation, and in many cases, I think that is a political motivation for having a third party handle it, right? Yeah, sure, because you don’t want to be holding the technical bag, you know, when that fight breaks out. Yeah.
John Wessel 08:02
And like, and, I mean, this stuff is hard. Like, it is very likely that at some point there will be some error in the model, right? And,
Eric Dodds 08:10
yeah, yeah. But, I mean, we did talk about how things like television or, you know, podcasts are very hard to track. I mean, you know, who goes to the URL? I mean, some people do, certainly, but it’s there. I mean, again, because of the type of people we are, we’ve tried to, like, do, you know, whatever. We don’t do a ton of that, but it’s an interesting analytical problem. And so, right, you think through it. And there are some cases there where, okay, you know, if we have enough data, it would be interesting to run a machine learning model to get a directional sense of whether we believe that this very difficult-to-measure channel is having enough of an impact for us to justify the budget. You know, there are a number of conditions there where, you know, again, not trying to apply weights to everything, but, you know, where it can be a really helpful tool to answer questions. Well,
John Wessel 08:58
that’s where I think, that’s where it’s not even necessarily ML. It might just be statistics. Like, essentially, if I want to predict an outcome... I haven’t heard predictive analytics in a while. I think it’s, like, out of vogue, but that’s essentially what you’re trying to do, right? I want to predict a conversion or predict a dollar amount. So what are the inputs into that, and then, like, how are they weighted to predict that? Like, to know that at a high level, or even at some granular levels, is interesting. But that doesn’t mean you have to deploy a super robust model like that for every single user. You know what I mean? You can learn that generally and then come up with a multi-touch attribution model from it, or just not use multi-touch attribution and then just have an idea of like, oh, look, this channel seems to matter more than this other one. Oh, you
Eric Dodds 09:44
know what? Lew, I just remembered something. I’m so glad that you mentioned that, because it reminded me of something I wanted to ask you about that we have not talked about before. But one really good use for a model like that, that I’ve seen employed in the past, is understanding when you reach the point of diminishing returns in a particular channel, right? So we’ve been talking about, okay, you know, complex multi-campaign, you know, multi-touch, etc., right? But a lot of times, you know, as you start to work through this, you’ll realize, okay, we have a channel that is productive for us according to some model, okay, let’s say last touch or something like that, right? And so, of course, one of the entire reasons that a business does this is so that they can spend more money on the things that are working and stop spending money on the things that aren’t working, right? But one of the interesting things is, when you uncover an opportunity to spend more money on something that’s working, which, this is actually a separate subject that relates to, you know, actually the study of economics, but there is a physical cap to the inventory available for whatever targeting parameters you have and whatever product you’re offering, right? And so one of the challenges you have is, okay, I can increase my spend in this channel to produce more conversions, but that’s not linear. It’s not infinitely linear, right? At some point, you know, you either reach the physical limitations of inventory or, for whatever reason, you know, the effectiveness of your spend falls off. What do you think about modeling that in a particular channel? Because that’s really important as well, right? It’s not just like, we’ll just, you know, keep doubling the budget and keep getting more conversions, because it doesn’t
John Wessel 11:31
work that way, right? I
Lew Dawson 11:33
I mean, that’s a fantastic question and observation. I will keep this broad for now, because this can actually be a whole topic in itself, but if you think about it at a high level, this is effectively an optimization problem. And if you go a slight click down, this is effectively like a gambling or an investment problem, right? So, not advocating gambling or investing here, but if you think about it from a mathematical point of view, this is a very common problem in gambling and investing. So I have a basket of goods that I’m interested in, and I have a finite amount of resources. What’s the optimal allocation of my money to those goods to get the highest return, in most cases, on that allocation, right? So, like, the Kelly criterion, for example, is a fairly common one. And if that was being solved, which, a slight sidebar, like, I haven’t seen many customers get to this point where they’re even trying to optimize to this degree, because most customers I’ve seen don’t even quite get this far, but at that point, it would start to get into some complex mathematical models, like the Kelly criterion, where you figure out, okay, here’s my window of opportunity. Here are all the things I could invest in, and how much money I have. Like, let’s look at my returns over time, and, you know, start doing predictive modeling, like John was saying, and figure out, like, what’s my optimal allocation for a given scenario at a given point. At a high level, that’s how I would approach that. Yep, yep.
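Editor's sketch: the allocation problem described in this exchange can be illustrated with a small example. This is not the Kelly criterion and not anything the speakers built; it is a simple greedy allocation that hands each budget increment to the channel with the best marginal gain. The channel names, the logarithmic response curve, and every parameter value below are assumptions made up for illustration.

```python
import math

# Hypothetical response curves: conversions = scale * log(1 + spend / half_sat).
# A log curve is one simple way to model diminishing returns; the channel
# names and parameters are invented for this sketch.
CHANNELS = {
    "search": {"scale": 120.0, "half_sat": 2000.0},
    "social": {"scale": 80.0, "half_sat": 1000.0},
    "display": {"scale": 30.0, "half_sat": 500.0},
}


def conversions(channel, spend):
    """Expected conversions for a channel at a given spend (assumed curve)."""
    params = CHANNELS[channel]
    return params["scale"] * math.log1p(spend / params["half_sat"])


def allocate(budget, step=100.0):
    """Greedily assign each budget increment to the best marginal channel."""
    spend = {name: 0.0 for name in CHANNELS}
    remaining = budget
    while remaining >= step:
        best = max(
            CHANNELS,
            key=lambda c: conversions(c, spend[c] + step) - conversions(c, spend[c]),
        )
        spend[best] += step
        remaining -= step
    return spend


if __name__ == "__main__":
    for name, amount in allocate(9000.0).items():
        print(f"{name}: spend={amount:.0f}, conversions={conversions(name, amount):.1f}")
```

Because each assumed curve flattens as spend grows, the greedy loop stops feeding a channel once another channel's next increment is worth more, which is the "it's not infinitely linear" point in code form. In practice the curves would have to be estimated from real spend and conversion data.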
Eric Dodds 13:17
Man, I feel like we could just keep recording such an interesting subject,
Lew Dawson 13:22
Maybe, maybe we should keep a list of future ones. Yes,
Eric Dodds 13:26
yes, yes. Brooks is furiously taking notes.
John Wessel 13:30
But I think just one other quick thing on this topic is, I would guess you were talking about saturation, you know, channel saturation, right? Then there’s the other question of, like, how much do I invest? The other question is, like, if I changed something, did I alter my saturation ability? So, say, I launched a new product. Like, is that channel I thought was saturated not saturated anymore? Or if I made some other major change, yep. So it just gets more and more complex.
Eric Dodds 13:55
Yeah, totally. Well, I mean, I guess the last point on that, and this is going to be like an 18-part series. But the other thing, which actually, we can just jump right into it after I make this last point, because I want to talk about measurement and reporting and where we start there. But the other thing that we haven’t talked about is the creative aspect of all of this, right? I mean, to some extent you have to get the right combination, not to some extent, you do have to get the right combination of targeting, of creative, of, you know, matching the product to the, you know, to the audience, in order to have the opportunity to start to saturate, which in and of itself is very difficult and can often take a lot of experimentation, which requires a lot of measurement, and so that, you know, that aspect, in and of itself, is very difficult, right?
John Wessel 14:43
Well, I was actually viewing that like there’s a practical saturation and a theoretical saturation. So let’s say you have bad creative, you get a practical saturation really fast, right, right, right? Whereas, if you fix the creative, then your theoretical is way higher. Yes, yes, yeah. Just as an example, personally,
Eric Dodds 14:58
practical and theoretical saturation. Look, I love it. Any comments on that before we talk about measurement and reporting? Yeah,
Lew Dawson 15:12
definitely. That’s why it’s so important, once you get to a certain point, to do that ad-level reporting, because that’s ultimately why you have ad sets and ads, right? You have a theory. So your theory is, if I run this campaign, I will convert these users, or, to use the more official terminology, I’m going to convert this audience. Now, within your ad sets, you have subsets of your audience, so segments, and you’re going to then, within your ad sets, run different ads to different segments. So pick an arbitrary example, like, let’s just use age. That’s an easy one to visualize. Your audience is filled with people at different ages. You’re going to segment them by age, and then, in your ads, you’re going to run different content for those different segments, those different age ranges, in hopes that you’re going to connect with and relate to those different segments, and they’ll be more likely to buy, right? So 100%, you’re spot on. We didn’t even talk about that, but that is a whole nother angle, both of your strategy on how you drive campaigns, but also of how you need to measure your performance, which you kind of are alluding to too: are you tracking the content that you ran with each ad, such that you can then correlate that content, which includes both visual and copy, back at some point? Like, if you’re complex enough, you can track all that back and include that, like John was saying earlier, in your predictive model, or at least in your performance reporting, so you can figure out, like, what content is resonating best with which audiences. Yep, so yes, that’s yet another whole area.
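Editor's sketch: the idea of tracking which content ran against which segment can be shown with a tiny ad-level rollup. The rows, segment labels, and creative IDs below are hypothetical; the point is only that conversion rate per (segment, creative) pair can reveal a pattern that a single campaign-level number would hide.

```python
from collections import defaultdict

# Hypothetical ad-level rows: (age_segment, creative_id, impressions, conversions).
ROWS = [
    ("18-24", "video_a", 1000, 30),
    ("18-24", "static_b", 1000, 10),
    ("45-54", "video_a", 1000, 8),
    ("45-54", "static_b", 1000, 25),
]


def conversion_rates(rows):
    """Return conversion rate keyed by (segment, creative)."""
    totals = defaultdict(lambda: [0, 0])  # [impressions, conversions]
    for segment, creative, impressions, convs in rows:
        totals[(segment, creative)][0] += impressions
        totals[(segment, creative)][1] += convs
    return {key: c / i for key, (i, c) in totals.items() if i > 0}


def best_creative_per_segment(rows):
    """Pick the highest-converting creative for each segment."""
    best = {}
    for (segment, creative), rate in conversion_rates(rows).items():
        if segment not in best or rate > best[segment][1]:
            best[segment] = (creative, rate)
    return best


if __name__ == "__main__":
    for segment, (creative, rate) in best_creative_per_segment(ROWS).items():
        print(f"{segment}: {creative} ({rate:.1%})")
```

With these made-up numbers the two creatives look nearly identical when summed across segments, yet each one clearly wins in a different age range. That only shows up if the creative identifier is captured alongside the ad data in the first place.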
Eric Dodds 17:04
yeah. Well, I mean, that’s one of the big reasons we wanted to talk to you about this, because you’re one of those rare people who, I mean, you just gave an unbelievably succinct explanation of the job of a performance marketer, as in, you know, their practical day to day: I am targeting these different audiences. I’m representing those in, you know, some different segments or ad sets, depending on the ad platform. I’m running different content or creative against those. Which I think is so helpful for our listeners to understand, especially those who are on the other side, trying to report on that. I would also say, just as a practical, you know, having been on, actually, all three of us have kind of been on both sides of this, interestingly enough, which is probably why this is such a wonderful conversation. One of the best things you can do being on the data side to build a good relationship, and I think help solve some of the challenges you see on the people side, I know this sounds really specific, but go build a relationship with your marketing team and understand their naming taxonomy for these things. How do you name audiences? How do you name campaigns? How do you do all of that? Because I guarantee you, like, you can provide helpful insight to them, to say, hey, if we restructure some of this, it’s going to be easier for you to understand as it scales. It’s going to be easier for us to report on. I mean, even just establishing that shared language and the taxonomy and hierarchy does a world of good, and at most companies that’s a mess, not because anyone’s trying to be messy, but because marketers are moving fast, doing exactly what you talked about, right? I need to generate a bunch of these ads, and my KPI is, right, figuring out this optimization problem as quickly as I can, and that creates a lot of data, you know, detritus. I’m
John Wessel 18:50
gonna put emojis in my campaign titles every day, the person, yeah, yes.
Lew Dawson 18:59
I’m worried, like,
Eric Dodds 19:03
oh my gosh, yeah. But
Lew Dawson 19:04
like, a great example of, like, some have spaces, some dashes, underscores, like, that’s common, right? Like, the taxonomies are all over the place. Taking it even one step further, and I think that’s phenomenal insight, Eric, it’s also being able to track what was served with a particular campaign and ad set, as I alluded to. Like, as you get more advanced, that actually becomes very important. And the earlier, as you pointed out, you can establish that taxonomy, even if you’re not calculating it early on, like, if you don’t care, that’s okay. But if you track it early on, when you get more advanced later, you can actually go back and look at it over time, and you could derive more value from that if you have, like, one, two, three, four years back, even if you don’t look at it right away. Yep. So totally, I think it’s a fantastic insight, like establishing that relationship and really working together to make your data cleaner and easier to dissect and use, so it benefits you more. Yeah, totally.
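Editor's sketch: one way a data team might make a shared naming taxonomy checkable is a small parser plus an audit of names that do not conform. The convention used here (`channel_audience_creative_YYYYMM`, lowercase, underscores only) is purely hypothetical; the real one would be whatever the marketing and data teams agree on.

```python
import re

# Hypothetical convention: channel_audience_creative_YYYYMM, lowercase,
# underscores as the only separator. This pattern is an assumption for
# illustration, not a standard.
PATTERN = re.compile(
    r"^(?P<channel>[a-z]+)_(?P<audience>[a-z0-9]+)_"
    r"(?P<creative>[a-z0-9]+)_(?P<month>\d{6})$"
)


def parse_campaign_name(name):
    """Return the name's parts as a dict, or None if it breaks the convention."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


def audit(names):
    """List the campaign names that do not follow the convention."""
    return [name for name in names if parse_campaign_name(name) is None]


if __name__ == "__main__":
    names = ["search_age2534_videoa_202401", "Spring Sale - FB!!"]
    print(parse_campaign_name(names[0]))
    print("needs cleanup:", audit(names))
```

Running an audit like this on a schedule gives both teams a short list to fix, and the parsed fields (channel, audience, creative, month) become columns that reporting can group by later, even if nobody looks at them right away.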
Eric Dodds 20:04
And even asking, what are we trying to understand here? Because if my goal is to test a bunch of different creative, because I know this is a good audience and I’m just trying to find the right creative, that is a phenomenally different reporting problem than, we’re testing a new channel, and we just want to see if we can get conversions or not, right, right? And boy, can you save yourself a lot of pain knowing that ahead of time, right? Well, and
John Wessel 20:31
I think it makes a difference too, like what you’re saying, when working with marketing. If you’re at a small or medium-sized company, you end up with sparse data problems. So let’s say you’re like, I don’t need that now. And then you’re like, oh, we want to look back. Like, if you can just get on the same page up front, totally, then you don’t have to wait three months or however long, you know, just for the data to come in. Yes,
Lew Dawson 20:52
and to have the historical data too. Yeah, right, right. Totally.
Eric Dodds 20:57
Okay, well, that relationship between data and marketing ultimately materializes, almost always, in the form of, like, some report, right? Or some metrics that are produced. Okay, so we’ve talked about different attribution models, but Lew and John, because you’ve both built this reporting, and John, we’ll start with you this time: where do you start when you think about, you know, if you were going to start to build some? And of course, this is based on, you know, understanding the metrics that the business wants. But where are you starting out from a reporting standpoint? And maybe even just go to the level of, you know, when you did this at your last business, what was the first dashboard of charts that you produced?
John Wessel 21:42
Yeah. So I think I had a really unique experience where I started out just being on the technical data side, and then ended up having both teams, like, reporting to me. So I had data and marketing, yeah. So I’m trying to think back to before that, like, when marketing was, you know, under a different leader. So I think one of the first things we did was, I mean, probably Google Analytics, years ago, and, you know, return on ad spend is, you know, the metric that comes to mind that everybody was looking at.
Eric Dodds 22:17
And breaking that down by channel? So like, a chart that has return on ad spend for Google and Facebook and
John Wessel 22:25
Bing was a pretty good converter for us for that, yeah. So yeah, by channel. A lot of our data was, like, Google Shopping and Bing Shopping were two of our top performers. So sometimes it would be subset by that. So it’s like, inside of Google, like, what’s our ROAS for Google Shopping or Bing Shopping specifically, yeah, like, versus others, yeah. And as far as the optimization, like, for the longest time, and I guess some of it was probably just fortunate circumstances in our industry, it wasn’t like a super sophisticated industry in the mid-market, but we had competitors like Home Depot, you know, above us. So we had this interesting balance of, like, well, sometimes we overlap and compete with Home Depot, and sometimes we overlap and compete with, like, more of a mom-and-pop type place. So a lot of it was just like, okay, how much money can we spend and keep this, like, you know, return on ad spend at whatever our goal was, at a five, yeah, six. Yep, yeah, that’s interesting. It was very high level. And, like, a lot of the time, we didn’t go super deep, because, like, you put money in and it would work, and you’re like, okay, it worked, like, put in more money. But when it didn’t work, that’s when you started to dig, you know, dig into, yeah, you know, optimization.
Eric Dodds 23:48
One thing I love about that example is which, again, so many topics we could talk about, but in areas where you were competing with Home Depot, that’s way more expensive because they have very deep pockets, right? And end
John Wessel 23:59
of quarter Home Depot, like, No, right? That’s not it. Literally, you you can lose trying to hit a number, yeah, if
Eric Dodds 24:04
someone clicks an ad, you know, you don’t lose money depending on your margins. But, all right. Lew, same, same question over to you, yeah,
Lew Dawson 24:10
oh, man. So, I think the first thing that came to mind is tracking the common metrics, which I’ll unpack in just a second, at all levels, or nearly all levels. And obviously this depends, again, on the complexity of your business, but you certainly want to know how you’re performing at a channel level. But you also want to know, like, how your campaigns are performing over time, yeah, how your ad sets, but also ads, are performing over time, right? So you want to compare, like, the conversion of your ads against each other over time, to know, are those resonating, right? Like, you want to understand. If you just look at a campaign level or channel level, it might be masked that, like, one of your campaigns is performing really well, or one of your ads is performing really well, and the other ones are just tanking. Right? Yep, right. So you definitely want to, when possible, and I’ll start broad first, then I’ll go more narrow, but you want to ideally track them across various altitudes. And yeah, last thing on this topic, like, you do also want to provide a high-level view, like a summary for executives, because they need to understand as well, yeah, like, what’s the performance week over week, month over month? Week over week is maybe not for executive levels, okay, but they want to understand, like, how am I performing to plan, how am I performing over time? Like, does it make sense to continue to spend the amount of money we’re spending on acquisition, etc., right? So, summary, all the way down to ad level. Now, specifics on that. I mean, one of the drivers of a business is how much revenue, or more specifically, like, how much profit you’re generating, right? So you obviously want to look at how much profit am I generating specifically, right? And so return on ad spend is one way to look at that, for sure. 
It’s not the only way. But you absolutely want to look at something related to, like, how much profit am I actually generating, right? So there’s the somewhat naive, like, okay, I’m spending x on ads, and then I’m getting back y in profit. Or actually, revenue is more common. That doesn’t always necessarily represent the full picture, though, if you think about it, because revenue doesn’t actually equate to how much money you made, right? Because you have COGS, right, that cost of goods sold, yep. So, right, for more advanced businesses, ideally, you actually want to factor that in, like, at the end of the day, after, you know, your cost of goods, how much am I actually making on ads, right? Another angle to look at for this, in a lot of cases, is, like, what’s the quality of traffic that I’m acquiring? And you look at that from the angle of the customer, and it’s, what is the customer lifetime value I’m acquiring by, like, channel, campaign, etc., right? So is a particular ad set, campaign, channel, etc., acquiring quality customers or terrible customers, right? And you can define, like, what’s a quality customer versus a terrible customer by lifetime value. So you can look, like, customers who are spending a lot of money, that’s probably a good quality customer. And then you can associate that with, like, your acquisition campaigns, and you can factor that into your measurement of, okay, like, this channel is making money, but, like, we’re acquiring terrible customers who actually are providing not much value, or negative value, to this company over time, like, if, you know, you’re selling loss leaders, or you’re having to give too many discounts. So those are a couple ways I would think about measuring this at a high level. Yep. Did you want to go more specific? No, that was
John Wessel 28:20
you reminded me of what we did. So we started with ROAS, but we quickly, and this is funny, we had some very basic, like, true ROI that included cost of goods sold calculations, and when I first was working with the group, like, we dug into it, and we realized that somebody, at some point, had hard-coded a single percentage to calculate all this. So they’re just like, oh, well, like, in general, our, you know, gross margin is 30% or 25%, so it was just hard-coded, and this was for, like, I don’t know, 30,000 items. That’s amazing. So that, yeah, so that was why we updated that to be, you know, as best we could, per item, and that made a big difference.
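Editor's sketch: the difference between ROAS, a flat hard-coded margin, and per-item cost of goods sold can be shown with a few lines of arithmetic. The SKUs, revenue, COGS, and ad spend figures below are hypothetical; only the 30% flat margin echoes the number mentioned in the conversation.

```python
# Hypothetical order lines attributed to a single campaign.
ORDER_LINES = [
    {"sku": "door-hinge", "revenue": 100.0, "cogs": 40.0},    # high-margin item
    {"sku": "power-drill", "revenue": 300.0, "cogs": 270.0},  # low-margin item
]
AD_SPEND = 100.0
FLAT_MARGIN = 0.30  # one hard-coded gross margin applied to every item


def roas(lines, spend):
    """Return on ad spend: attributed revenue divided by ad spend."""
    return sum(line["revenue"] for line in lines) / spend


def flat_margin_return(lines, spend, margin):
    """Gross profit per ad dollar, assuming one margin for all items."""
    return sum(line["revenue"] for line in lines) * margin / spend


def per_item_return(lines, spend):
    """Gross profit per ad dollar, using each item's own COGS."""
    return sum(line["revenue"] - line["cogs"] for line in lines) / spend


if __name__ == "__main__":
    print("ROAS:", roas(ORDER_LINES, AD_SPEND))
    print("Flat-margin return:", flat_margin_return(ORDER_LINES, AD_SPEND, FLAT_MARGIN))
    print("Per-item return:", per_item_return(ORDER_LINES, AD_SPEND))
```

In this made-up case the campaign looks healthy on ROAS and still profitable under the flat margin, but the per-item calculation shows gross profit below ad spend because most of the revenue came from a low-margin item. That gap is the kind of thing a single hard-coded percentage across thousands of items can hide.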
Lew Dawson 29:02
I would hope so. Yeah,
Eric Dodds 29:03
yeah, across 30,000 items, right? Insane. Okay, well, we would be remiss not to discuss AI. Oh, yeah. So Lew, tell us your view on, like, how is AI impacting this whole situation? I mean, one of the things that comes to mind immediately is that, especially as it relates to, you know, someone thinking about creating reports around this data, AI is making it way, way faster to turn creative assets around and run experiments extremely rapidly, because it’s essentially removing the need for, you know, turnaround time across teams, etc. I mean, there are entire platforms that literally do this, right? I mean, you can essentially describe creative, and it will generate all these variants for you, and some of them, I think, even, like, run the campaign and give you, like, early touch results and stuff, which is crazy. So that’s one way, but what are other ways?
Lew Dawson 30:02
Oh, man, you touched on the one I immediately had coming to mind. So yeah, definitely. I think there’s two aspects of what you said. There’s copy generation, there’s also, like, visual generation. But then taking that one step further, it’s, in some cases, like, using RAG or LLMs to actually input a lot of your customer data, so like a customer feature table, and ask, how can I best slice up my audiences to then send them to, like, individual campaigns or specific platforms, like a retention platform or an acquisition platform? And then how, based upon what we’ve seen in sales, can I optimally choose copy and/or visuals for either that audience and/or segments within that audience, with the highest probability to convert them, right? So you can use AI for that kind of stuff. That’s another thing that comes to mind.
John Wessel 31:00
I think here's one I used this weekend, and I hadn't thought about this. I was working on a project at home, and I used ChatGPT to look for, like, a certain style of door that I wanted. And it produced results, you know, like Google would have, inside of it, and I clicked on one of them. And now I'm thinking, like, how was that attributed? I wonder, you know, how does it get attributed or not?
Lew Dawson 31:28
Yeah, I wondered that too. And actually, I played around with that a few times. Usually when you click on a link in ChatGPT, like, if you ask it for a source, it depends, but I think it's utm_source, and it's ChatGPT. Oh, okay.
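As a rough illustration of how that kind of tagging could be picked up on the receiving side, here is a small Python sketch that reads `utm_source` from a landing-page URL using the standard library. The example URL, the exact parameter value, and the `AI_SOURCE_HINTS` list are assumptions for illustration; Lew only says he thinks the source is set this way, so check your own analytics for the values that actually appear.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical substrings to look for in utm_source. These are assumptions,
# not a confirmed list of values that any assistant actually sends.
AI_SOURCE_HINTS = ("chatgpt", "perplexity")


def utm_source(url):
    """Return the lowercased utm_source value from a URL, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("utm_source")
    return values[0].lower() if values else None


def is_ai_referral(url):
    """True if utm_source contains one of the assumed AI assistant hints."""
    source = utm_source(url)
    return source is not None and any(hint in source for hint in AI_SOURCE_HINTS)


# Made-up landing URL for illustration.
landing_url = "https://example.com/doors?utm_source=chatgpt.com&utm_medium=referral"
print(utm_source(landing_url))
print(is_ai_referral(landing_url))
```

Links that arrive without any UTM parameters return `None` here, so they would still need to be attributed some other way, for example by referrer.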
Eric Dodds 31:44
yeah, yeah, go
Lew Dawson 31:46
hook on. One, yeah, I’ll have to do that for Yeah, interesting. UTM,
Eric Dodds 31:50
Yeah, okay. I mean, I'm sure Perplexity and the others, yeah, they have theirs, right? I mean, that's self-serving. Well, I say self-serving, but it's wise of them to do that, right? Then people can naturally see, as they pick up their analytics, like, oh, I'm actually getting a lot of, you know, results from Perplexity, or ChatGPT, or whatever, yeah.
John Wessel 32:09
Well, and then the other thing on the AI side of this that I'm really interested in is, you know, on the customer data side, right, like, how can we score? I mean, there's, like, customer lifetime value stuff, or lead scoring, or, you know, things like that. It seems like, as we get more robust data and better AI models, that'll be more easily possible, because it's possible now with ML, it's just a pain, right?
Eric Dodds 32:33
Yeah, I think that, I mean, actually, we haven't talked a ton about this, but I'm interested in it because we've talked around this. But to that point, I agree, like, in theory, right? LLMs excel at sort of, you know, next best action, right, or completion, right. And so, you know, theoretically, if you give it a bunch of inputs and a bunch of context, and even, like, example sequences, right, it can process that and then make a good recommendation. Okay. The big challenge with that is actually just the practical nature of executing the movement of a user into different campaigns and/or the sending of different messages through different channels based on what that next best action is, right? And so very practically, at least from what I've seen, you have marketing platforms that are providing this functionality within their ecosystem, which can be really helpful but creates local optimization, because the customer journey spans a lot of different channels, right? And we know that from a data and reporting standpoint, because everything we just talked about is getting all of this different data from all these different channels and then trying to understand, you know, where conversion is happening and what's contributing to that. But think about actual execution of that, right? Let's just say we run an analysis on our attribution data, and we realize, you know what, a multi-touch pattern that tends to work really well is, we'll just use the example you gave, Lew, where someone finds us on search, and then, you know, they convert on paid social on Facebook or whatever that is, right? And let's say, you know, it's way more complex than that. But there's this promise of, like, okay, well, I can feed an LLM this data.
It could actually pick up the signal, and then, at a very large scale, automate the, you know, categorization of users into, like, okay, now let's stop showing you search ads, and let's start showing you, you know, paid social ads, which today is, you know, manual across platforms or whatever, right? But there's not, like, an orchestration data layer connected to all these APIs that actually allows you to automate the execution of that, which is really hard. And in fact, to some extent, that's because the vendors themselves don't want that. I mean, you know, they want to optimize it in their own systems. They're not going to open their APIs, because at that point they're just, you know, a last mile delivery service. And, you know, you're sort of, you know, an API endpoint. Anyways, that's
John Wessel 35:20
your third product idea, but, well, I actually have, I
Eric Dodds 35:23
I know that's funny, but that's why I wanted to ask you, Lew, because I know you've studied this problem, and it's so interesting, because I think a lot of the componentry is there, yeah, but the APIs aren't yet. Platforms, sometimes ad platforms, or marketing at a large company, you know? You think about, okay, we have a marketing tool that's, like, sending emails and stuff, and that's usually multiple tools, right? And so the API support and even the system design, those systems don't conceive of, you know, and they don't want you to have, a user journey that crosses their tool and multiple others, right? And so anyways, I am very long winded on this, but Lew, I've actually spent a lot of time last week thinking about that after a conversation with a customer, just learning about their issues. But what's your take on that? Because you've conceived of some pretty, pretty wild, you know, systems along those lines as well.
Lew Dawson 36:20
Yeah, yeah, it's a great observation. There are things you can do to optimize that to a degree, using AI and LLMs. So Google Ads, for example, does expose things like turning ads on, turning ads off, changing the budget, etc. So from that perspective, and again, I have some background in this, if you think about it as, like, an investment problem a little bit, where you're getting signal from an AI algorithm, whether it's ML, an LLM, etc., and you're analyzing the data that comes in on clickstream, or whatever, theoretically you could stitch those two together. So you could stitch my modeling, either in real time or in frequent batches, together with, okay, I'm effectively going to make reallocations to my ad portfolio. I change the budget for different ads depending on how they're performing right now and what the LLM or the ML model is saying in terms of what I should be allocating my spending to, right? So you definitely could do that. That's definitely another use for ML and AI. I think that's extremely complex. And personally, I've never seen this, like, super successfully done, because I would equate this to, it's like day trading a little bit with an LLM, which I have a lot of experience in, and that is tough. Like, that is really tough, because that's a really tough problem to solve and do well, right? There are very few companies that do that well. And if you think about it, the market is so optimized, and it's somewhat unpredictable. It's very challenging to gather all the pieces of that together, like all the data points. I wouldn't recommend it, but that is another application, as you kind of highlighted.
And then I think the other challenge, in closing, that you pointed out, is there are certain portions of some of these ad platforms that they don't open up, or there are multiple platform integrations that you have to fuse together in order to really successfully be able to do this in real time, or in near real time in an automated fashion. So, like, if your content comes out of a CMS, how do you get your content out of your CMS? How do you associate your CMS content with a particular ad? What if you need to change it in real time? How do you change it in real time, then re-upload it? And, yeah, right, like you highlighted, the challenges quickly become plentiful, shall we say. Yeah. Well, the good news is that it's probably going to take most people enough time getting to basic linear attribution and multi-touch attribution reporting that by the time you nail that down, someone will have built the, you know, the API data layer driven by an LLM, right, that you can just plug all of your wonderful attribution data into, and then you'll be off to the races. Yeah, perfect. Easy. All right, you said it, right? Like, let's just go do it.
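The "ad portfolio" reallocation Lew describes above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a recommended strategy and not any platform's API: the function name, the `max_shift` cap, the channel names, and the predicted ROAS numbers are all hypothetical, and pushing the resulting budgets to an ad platform is deliberately left out.

```python
def reallocate_budgets(budgets, predicted_roas, max_shift=0.2):
    """Shift a fixed total budget toward ads with higher predicted ROAS.

    budgets:        current spend per ad, e.g. {"search": 100.0}
    predicted_roas: model output per ad (assumed to come from an ML model)
    max_shift:      hypothetical cap on how far one ad's budget may move
                    in a single step, as a fraction of its current budget
    """
    total = sum(budgets.values())
    scores = {ad: max(predicted_roas.get(ad, 0.0), 0.0) for ad in budgets}
    score_sum = sum(scores.values())
    if score_sum == 0:
        return dict(budgets)  # no usable signal, leave budgets unchanged

    # Target: split the total in proportion to predicted ROAS.
    targets = {ad: total * scores[ad] / score_sum for ad in budgets}

    # Limit each move so one noisy prediction cannot swing spend too far.
    capped = {}
    for ad, current in budgets.items():
        low, high = current * (1 - max_shift), current * (1 + max_shift)
        capped[ad] = min(max(targets[ad], low), high)

    # Rescale so total spend is unchanged. After rescaling, an individual
    # budget may end up slightly outside its cap.
    scale = total / sum(capped.values())
    return {ad: value * scale for ad, value in capped.items()}


# Illustrative numbers only.
current = {"search": 100.0, "paid_social": 100.0}
signal = {"search": 3.0, "paid_social": 1.0}
proposed = reallocate_budgets(current, signal)
print(proposed)
```

The cap on each step is one simple way to reflect the caution in the conversation: the signal is noisy, so the sketch moves spend gradually rather than chasing every prediction.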
Eric Dodds 39:26
Yeah, perfect, right? Yeah, easy, a couple sprints. Yeah, exactly, just a couple sprints. Well, Lew, this has been absolutely amazing. One of my favorite things is that we've uncovered a number of other topics, you know, identity resolution, and, you know, the feature table, to dig into, among other things. And so we'd love to have you back on the show for another marathon. This has been great. I've learned a ton. I think our listeners have learned a ton. John, your experience and insight has been really helpful. So this is awesome. Let's pick another multi-hour topic and dig into it.
Lew Dawson 40:00
sounds great. Yeah. John, Eric, thank you both for letting me come on. I really appreciate it. It’s fun to chat with you both.
John Wessel 40:06
Yeah, thanks, Lew.
Eric Dodds 40:08
The Data Stack Show is brought to you by RudderStack, the warehouse native customer data platform. RudderStack is purpose built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.