Episode 201:

AI Real-Talk: Uncovering the Good, Bad, and Ugly Through Prototyping with Eric, John, and Matt

August 7, 2024

This week on The Data Stack Show, it’s a RudderStack round table as Eric and John are joined by Matthew Kelliher-Gibson. Together they delve into the current state and future potential of large language models (LLMs) and artificial intelligence (AI). The group discusses the hype versus reality of LLMs, their practical applications, and the challenges of implementing them in production environments. Drawing parallels to the early days of the iPhone, they explore the transformative potential of LLMs while acknowledging their limitations. The episode provides insights into AI’s evolving landscape, emphasizing the importance of understanding token limits, practical use cases, the integration of LLMs with customer data, and more. 

Notes:

Highlights from this week’s conversation include:

  • Current State of LLMs (1:12)
  • Historical Analogy to the iPhone (3:32)
  • Limitations of Early iPhones (5:02)
  • Comparing LLMs to Historical Technologies (6:08)
  • Skepticism About LLM Capabilities (9:11)
  • Broad Nature of AI Innovations (10:12)
  • User Input Challenges (14:32)
  • Transcription and Unstructured Data (16:19)
  • Single Player vs. Multiplayer Experiences with LLMs (18:50)
  • Revenue Insights from ChatGPT (20:27)
  • Contextual Use of LLMs in Development (23:43)
  • Implications of Human Involvement (26:15)
  • The Role of Human Feedback (29:19)
  • Customer Data Management and LLMs (31:25)
  • Streamlining Data Engineering Processes (34:24)
  • Prototyping Content Recommendations (37:42)
  • Summarizing Content for LLMs (39:51)
  • Challenges with Output Quality (41:18)
  • Data Formatting for Marketing Use (43:20)
  • Efficient Workflow Integration (46:20)
  • Exploring New Prototyping Techniques (50:56)
  • Distance Metrics for Improved Relevance (53:00)
  • Improving Search Techniques (56:46)
  • Utilizing LLMs in Customer Data (59:15)
  • Challenges in Customer Data Processing (1:01:10)
  • Final thoughts and takeaways (1:02:12)


The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcription:

Eric Dodds 00:06
Welcome to The Data Stack Show.

John Wessel 00:07
The Data Stack Show is a podcast where we talk about the technical business and human challenges involved in data work.

Eric Dodds 00:14
Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Welcome back to The Data Stack Show, we have a special episode for you today with just The Data Stack Show team. We actually invited Matthew Kelliher-Gibson, also known by his stage name, the Cynical Data Guy, on to the show, in large part because he and I've been working on a ton of LLM projects internally at RudderStack. And, of course, we want to make the world a better place by putting another podcast out into the world that discusses AI and LLMs. So we're doing our part, we're doing our part on that. But lots of interesting things to discuss. Matt, what do you think? We have some topics to discuss. But what's on your mind?

Matthew Kelliher-Gibson 01:11
I think just talking about kind of where we are with LLMs in general. And you know, there's a lot of takeaways about the hype of it, but also kind of what's the reality? And where is it still? What's difficult? What are the actual use cases for it? Yep.

Eric Dodds 01:24
John?

John Wessel 01:25
Yeah, I think in addition to it, we talked about this on a previous episode of what I like to call single player mode versus multiplayer mode, and it'd be interesting to explore again, what are we seeing in single player mode, which is like, oh, look at this cool thing I can do? Yep. And then like, what is it truly to have something in production in an app with hundreds or thousands of people using it simultaneously? It's just, they're not the same problem, right?

Eric Dodds 01:49
Yep. Yep. I love it. All right. Well, let's dig in. All right, this is great. We can talk about AI with basically no agenda, you know, which is really what the people want.

John Wessel 02:03
They may train on this one day. So think about that. Whoa, whoa.

Matthew Kelliher-Gibson 02:09
So I'm just gonna say: cowboys are bananas.

Eric Dodds 02:13
Yeah, exactly. Right. Sure. Perfect.

John Wessel 02:15
Appreciate that.

Eric Dodds 02:16
Let's see, sorry, we have some specific stuff to talk about. Matt, I really want to dig in with you. You've been building a ton of prototypes internally here at RudderStack. So I want to dig into those and talk about some practical, practical stuff. But I realized that, you know, we've gotten a lot of perspectives from different guests on the show around AI. But I don't think that we've discussed it as a podcast team. And so that's part of the reason I wanted to do this episode, just for us to have a discussion about it. Because I really value both of your opinions. We've talked a lot about AI. And so we're going to expose some of that conversation, and hopefully the audience will enjoy following along. Let's start out by talking about where we're at with AI. Leading up to the show this week, I was trying to think about an analogy for where to place this in terms of historical reference, right, like, okay, we've kind of seen something similar before when this happened, or this happened. And there's been a ton of ink spilled on the internet about this, and probably podcasts. But the one that really came to mind, in terms of my own personal experience, was the iPhone. And one thing in particular really stuck out to me. So I thought, okay, let's think about the iPhone, rewind to 2007. It came out in 2007, and I think I got one that year, I want to say, or at a minimum, the place where I worked, which was a marketing agency, had multiple iPhones around, right. And so I had this high exposure very early on. And do you remember, it's hard to remember this, but the app icons had a very physical design built in, right? So there was woodgrain as a shelf for your apps. And you know, all this really interesting design. It was very different

John Wessel 04:16
than your BlackBerry apps.

Eric Dodds 04:20
Okay, so you rewind back then. And I think if you would have told me then that multiple people in your life will run their entire business from this single device, that would have been hard to believe. And one of the reasons, okay, I'm gonna ask a question for you. There is one specific feature in the early versions of iOS that wasn't there. And because of that, for a ton of people, it made it really hard to believe that someone could run an entire business off of an iPhone. Do you remember what that was? It was copy and paste. Do you remember that? And it was really painful, right? Like, you just thought, man, any sort of complex cross-app workflow is going to be super difficult because of that. That's not a perfect analogy. But as I thought back to that experience, how visceral it was to be, you know, interacting with a piece of technology where you could both sort of sense the immense change in the way that you were going to interact with stuff, but then also these severe limitations that made it hard to imagine how ubiquitous it would be, it feels very similar to LLMs at this point, right? Because it's like, holy cow, these things are capable of doing some completely wild things. But it's like that copy-paste functionality: there are some limitations that feel similar. So, John, thoughts. Is that accurate? Am I off? Do you agree?

John Wessel 06:07
I think the interesting thing will be, because it's easy to look back, and, I mean, you take a lot of things for granted. So I'm interested in what happened with iPhones to not face limitations, like with 4G and 5G. Like, we were able to keep up the speed, because, you know, with your original iPhone, if we'd stayed on 3G, you would not have had the same adoption, right? Yep. 4G and 5G. So with LLMs, what are those, like, companion things? What would be your, like, 3G, 4G, 5G iterations that also have to happen for it to hit that same trajectory? Yep. And then even, like, those kind of stellar apps, whether it's, like, app categories, like whatever the first big game was that blew open gaming, or the first, like, YouTube-type thing that was a perfect fit. Like, I don't even think I have a good idea for what those things are yet. Yeah,

Eric Dodds 07:10
for sure. That was, yeah, I mean, yeah. I'm sure there was one way earlier, but the Pokémon Go phenomenon made the whole thing so visible all around you, because all of a sudden there's these people walking around holding their phones up, right, and you're like, whoa, this is a totally different experience than what was possible before. On a number of levels. Yeah,

Matthew Kelliher-Gibson 07:34
I mean, I think we're still kind of trying to figure out a lot of that, especially on the use case side, where, I mean, I think the fact that it was so successful on the consumer side with the chat interface has really locked a lot of people into just thinking about it in that chat interface. Yep. Right. And I think about it as being very visible, like, right in the customer space. And I think that's pushed a lot of this towards that idea of, you know, the whole joke, like, AI-powered: it's got to be out there, it's got to be front and center. Which then makes me wonder a little bit, like, are these use cases real? Or is this like saying Bluetooth-enabled? Yeah, it's Bluetooth-enabled. Okay, but I don't need Bluetooth in my microwave. Yeah. That is actually

John Wessel 08:27
another technology that's a perfect, like, antithesis to the iPhone. Bluetooth: we're on version five point something now, I think. What happened to Bluetooth? Like, who has a seamless headphone experience here? Like, every day? It's always perfect. Yeah. So it is interesting to, like, follow that curve.

Matthew Kelliher-Gibson 08:47
Like, the range has gotten better on the Bluetooth. Well, yeah, with one of the more recent versions it's gotten really good. But you're still dealing with the idea that it'll still drop several times, won't connect, you know, those types of things. And so with a lot of the generative AI stuff, I'm still kind of, you know, we're still in the state of it can do everything, or it'll change everything. I mean, even, you know, I saw something today where someone made the statement, well, since AI makes software development so much cheaper and faster. And I'm like, whoa, do we have any proof of that? Like, I hear that stated a lot, but I don't know that anyone has actually seen results of that. Yep. So I think there are definitely gonna be use cases that are going to be big and powerful and impactful, but we're still kind of in this fog of war almost. Yeah.

Eric Dodds 09:42
Yeah.

John Wessel 09:42
I think the other bizarre thing here, too, conceptually, is AI is so broad, and even if we're just talking LLMs, it's just so broad, and I can't think of another, like, innovation. Cloud computing is the closest I can come up with that is just so broad as far as the possible things. Whereas, again, like, back to the iPhone: that had use cases to begin with. You can text and call your friends, you can browse the web, you've got the GPS use case. Like, just really clear, defined use cases. And LLMs did not come out of the gate with

Matthew Kelliher-Gibson 10:19
that. And that had a roadmap for years before. People knew that, like, GPS was gonna get cheap enough that you could put it into a phone. Yep, right. Yep. Same when cameras got small enough. Yeah. So those had very clear roadmaps, where the whole industry could see where it was going, and then it finally kind of culminated there. We don't really have that. There isn't really a very good, like, industry roadmap of where this is going, like

John Wessel 10:43
20 or 30 years' worth of, like, the first GPS one, the first, like, portable digital camera. Like, it combined so many different things. Yeah, that great moment. Where it feels like LLMs are different is because it's broader and, like, it's net new. Like there's not, you know, anything. I mean, there's obviously a lot of work before it, but sure, it's different. For the common person, it kind of feels new, right?

Eric Dodds 11:12
Do you think so? Yeah, I think about Alexa, and I haven't looked at the recent data on Alexa. But I remember that at one point, Amazon had an unreal number of people working on Alexa, I think it was 10,000 people working on Alexa. And one of the interesting revelations at that time was that people did two or three things with Alexa. Right? Yeah, they'd get the weather. Sports scores. Yeah, yeah. Turn the lights on and off. Maybe it's a smart timer to a lot of people, right. Yeah, that's probably no, totally. And so, to some extent, and you think about it, I go back to the iPhone. And the other reason that it stuck out to me was that for a lot of people, it was a cooler way to watch a YouTube video or, like, make calls or whatever, right? I mean, of course, like, you know, the video calling very early on was pretty wild from the phone. But that changed a lot over time. Right? Right. Where it's hard, you think of a technology or interact with a technology in a one-dimensional way, because you're so used to doing things other ways, or it's so new, right? I mean, Google search is actually similar in that, you know, when Google search first rolled out, it wasn't intuitive that you would just Google everything, right? Even if someone says you can search, you know, most of the internet through this portal, you're like, oh, that's amazing. And then you're, like, you know, if you've never seen it before, like,

John Wessel 12:54
there's a sense of what's on the internet. And I think that's a major problem with AI. Most people do not have an intuitive sense of what it could answer. Right, exactly.

Eric Dodds 13:03
Yeah. Okay, so we've been mildly, you know, we've been mildly skeptical in terms of some of the issues here. But these things are incredibly powerful, and I think will change a lot of things fundamentally. So what's the bullish outlook? Right? If we remove all of those, right, if we think about things like Google search, things like the iPhone, that drastically expanded the scope of what people used them for over time, what types of things do you see for LLMs? One other thought, back to your iPhone analogy: do you remember the big thing that people were walking around saying, like, I'm not gonna get an iPhone? Do you remember what it was?

John Wessel 13:48
I could never give up this thing. I could never give it up. I could never give up using this function on my phone, and the iPhone doesn't have it. Oh,

Eric Dodds 13:55
hold on. The keyboard, like the number pads

John Wessel 13:59
are physical things. So it's like the BlackBerry physical keyboards. I remember lots of conversations with people: I cannot give up the physical keyboard. So in a lot of ways, Apple knew that, they must have obviously known that, but they bet against it, because in their studies on how people use their phones, the time spent scrolling videos and things like that is way more than typing. Yeah. So it'll be interesting with LLMs, which are primarily typing. We're back to that, like, yeah, that's fascinating. From a user input standpoint, what's going to happen there? Yep. Yep. But the bullish outlook would be, like, when we get into multimodal or more interfaces, like the audio is pretty cool. Pretty awesome. Yeah, more video things where, like, you're using your camera, and there's a few that are out there. Like, I feel like that multimodal will be part of it.

Matthew Kelliher-Gibson 14:54
I think, like, for me personally, I already start using it in a lot of situations, you know, as a search engine, right, where it's like, I just need, I don't want to scroll through a bunch of websites, I don't need a bunch of bad SEO content that I need to go through. I just need to know, like, how do I do this? Or what's the recipe for this? Or give me a summary of whatever. Yep. I think, you know, there are some drawbacks around really getting the full picture and things like that. But for a lot of general cases, it's super quick, very powerful, and gives you the ability to probably put that in more places. And, you know, just search in general. I think about other things it could be used for. I mean, we already see it being used in a lot of customer service type areas. Yeah. And I think that's probably another one where it'll continue, even, like, replacing some of the, like, press one, press two, whatever. Yeah. Terrible, which is like, please tell me what it is you want. Yep. And you've never screamed at it? Yeah. Talk to a representative.

Eric Dodds 15:56
Right. So you just keep pressing zero until, yeah, exactly, we're gonna connect you to the representative. Yeah. So

Matthew Kelliher-Gibson 16:01
I think those are some of the ones that will be really good. But I also think, kind of what you were saying, I mean, there's a lot of stuff where, like, the transcription ability and things like that, where I don't think the capabilities are fully being utilized or understood. Yeah.

Eric Dodds 16:18
The unstructured data thing, I think, is really phenomenal. That's, yeah, it is, you know, the ability to translate across languages. I mean, it's going to be more amazing than a lot of things we've seen previously that do, you know, translation-type tasks or whatever. But then also, I mean, I go back, I think I've mentioned this before, but just processing unstructured data, right? Like looking through a 1400-page PDF manual for something, or the instructions for any sort of product that you buy, and summarizing all that. I mean, the time savings, I think, is going to be incredible across a number of different areas.

John Wessel 17:03
I really like, yeah, the translation is a good one. I'd use the word contextualization. Because now, like, all of these things that historically you've had generically, like, all instructions are generic for just about anything, right? Actually, some companies are getting better at this, like having a little variable where it puts your account key in there, instead of, like, brackets around "your account key here", things like that, that are actually helpful. But with an LLM, if you're generating that on the fly, you can make some amazingly good instructions to make, let's just say, software setup much easier, because it's fully contextualized for your account, for your region. I mean, I configure software all the time. So you kind of go look up, like, your tenant, what region is it in, what's my account. Yeah, so I mean, that's a fairly simple example. But I think things like that, with contextualization, will be a lot

Matthew Kelliher-Gibson 18:00
better. Yeah. And I know people who use it for things like, you know, government grants and stuff, like being able to read the grant proposal. Right, yeah, totally, those types of things where, you know, maybe it's not the final draft, but it's getting you 80% of the way there. And now you're just reading through it and double-checking, or even doing the check on the back end of it: have we completed it, and are we fully within the instructions for what we should be doing?

Eric Dodds 18:28
Yes, yep. Okay, I want to move into single player versus multiplayer, and then get even more specific, Matt, by talking through your experience doing a bunch of prototyping. Because a lot of the stuff that we've talked about, even the examples that we've used, and to your point, it's hard to think of things, right, which is often true of new technology. It's hard to look back in history and find the exact analogy, right? But a lot of the things we've talked about are very individual, right? An example: the iPhone is a device that you have, right? You can connect with other people through it, right. But Alexa,

John Wessel 19:06
With the iPhone, we completely skipped over social networking. Yeah, no big deal. That was a huge thing. Like, GPS very much has, you know, outside components to your positioning, like there's a lot of external network effects stuff that I don't see, you know, LLMs have, like, right now. Yeah, typically such

Eric Dodds 19:24
a good point. Yeah. When we go to multiplayer, though, and I guess the point I was making was around utility, right. A lot of the things we've talked about are personal utility. So even though there are network effects related to GPS and maps, you're still consuming that as an end consumer. Right. Right. And I think one of the interesting things, and I'd love to hear more of what you're thinking, John, is how do you think about this sort of single player to multiplayer in the context of harnessing this power as a business and distributing it? Because I think one of the other things that makes this conversation so tricky is the revenue from ChatGPT, like, just subscription revenue is in the billions, right? And so you look at that, and it's kind of mind-boggling, right? But then when you really dig in and talk to companies who are trying to actually productize this, it's Alexa, right? There are a couple of things that are working really well and that are really powerful, but there are not hundreds of them. Right? And

Matthew Kelliher-Gibson 20:27
they're not necessarily, like, I mean, you know, like with Alexa, it's like, oh, it does timers really well with your voice. Well, that's not really going to pay for itself. Yeah. That cost element is a big, like, elephant in the room for a lot of these cases.

Eric Dodds 20:45
Okay, so John, that was the longest lead-in, what a horrible rambling lead-in, for me to say, like, tell us about single player versus multiplayer. Yeah, well,

John Wessel 20:55
there's actually two aspects of it. One I didn't consider: the first one was what we were just talking about, actually, like, I use this LLM with you, multiplayer, which is not what I was referring to, but interesting to think about. Like, you know, like a multiplayer game, or like a video call, where there's two requests, or multiple of us. I don't know of any multiplayer in that sense. Oh, thanks. Yeah. But I was more thinking about multiplayer as: I use it on my laptop, and then it goes into production. So we'll stick with that; the other one's really interesting too, so maybe Matt will have some takes on it. But on the productionization, I mean, we had an awesome conversation with Barry from Hex on this a couple episodes ago. But I think it's a fundamental problem of, like, if you're a developer, or have worked with developers, this, like, it worked for me, like, on my laptop, I got it to work. And the production application process has gotten far easier in the last 15 years. There's a lot of tooling around it, there's a lot of best practices around it. Most of that's missing right now for LLMs. And there is the, like, attended versus unattended problem. When I'm here using ChatGPT to look up, like, a restaurant to go to tonight, it's an attended thing. And I'm constantly saying, no, I want it closer to my home. No, I don't like Chinese food, or I don't like Italian food. And then you can kind of get into it. It's cool. It works. But if you're doing more of an unattended thing in production, where, like, you want a one-shot answer, you know, X to Y, or whatever you're trying to do, it's different. Yep. So people have to think differently, because they expect deterministic code in production, and they have very high expectations of it.
And if it's not in that, like, iterative interaction phase, it just feels weird. Like, we're just not used to it. In software, we're not used to the software being, like, a little unsure of itself. Yeah, but like kind of
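The attended versus unattended distinction John draws usually shows up in code as validation plus bounded retries: when no human is there to say "no, try again," the pipeline has to do the rejecting. A minimal sketch of that idea, where `call_model` is a hypothetical stand-in for any LLM client and the JSON contract is an illustrative assumption, not any particular product's API:

```python
import json

def one_shot(call_model, prompt, required_keys, max_retries=3):
    """Unattended LLM call: parse the reply, validate it, retry, then fail loudly.

    call_model is a hypothetical stand-in for any LLM client (str -> str).
    """
    last_error = None
    for _ in range(max_retries):
        raw = call_model(prompt)
        try:
            reply = json.loads(raw)  # the contract: output must be machine-parseable JSON
        except json.JSONDecodeError as err:
            last_error = err
            continue  # malformed reply: try again instead of passing it downstream
        if all(key in reply for key in required_keys):
            return reply  # validated reply, safe for deterministic code to consume
        last_error = KeyError(f"missing keys in reply: {reply}")
    raise RuntimeError(f"no valid reply after {max_retries} tries: {last_error}")

# Stubbed "model" that returns garbage once, then a valid reply,
# mimicking the flakiness an unattended pipeline has to absorb.
replies = iter(["sure! here you go", '{"restaurant": "Luigi\'s", "cuisine": "Italian"}'])
result = one_shot(lambda p: next(replies), "Pick a restaurant near me.", ["restaurant"])
```

The point of the sketch is that the retry loop replaces the human saying "no, closer to my home": the deterministic code around the model defines what an acceptable answer looks like and refuses everything else.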

Eric Dodds 22:56
going to the attended versus unattended, I think, is really interesting. And one of the things in the conversation with Barry that I appreciated so much was that, he didn't use those terms, but he described their implementation as being attended. And so for the listeners who didn't catch that episode, number one, unbelievable episode, Barry is an incredible guy, so go back and listen. But for those of you unfamiliar with Barry McCardle, or Hex: Hex is a collaborative data platform. So analytics, data science, you know, workflows, reporting. And they have a feature called Magic where, you know, you can essentially ask it to do things or ask questions in plain English, and then it will generate code and run that code and even give you multiple cells of results. And it works really well. But it's not perfect. And Barry was like, this isn't perfect, but you have to think about the context of where our user is, right? It is someone who writes code for a living, or as a major part of their job. Right. So SQL, Python. Do they support multiple languages? I think they might, yeah. But he said, you know, our end users aren't perfect, right? And so it maps to their experience, and it is an attended problem. John, I think that's a great way to describe it, right? If they write code and execute it, they're going to analyze it and evaluate it for its accuracy and all that. Yeah. And so it's an attended problem.

Matthew Kelliher-Gibson 24:33
It's also one where it's generating code, but then you're running the code over and over again in production; you're not asking it to write code every time you do it. Correct. And that is one thing that, in a lot of the stuff that I've done at RudderStack, you run into. You know, there are a couple things that I feel are a blessing and a curse in the current implementation. One of them is the fact that when you try to work on something at a larger scale with it, anyone can go, well, I just went to ChatGPT and it did this for me. Why can't you make it do that? Yeah, like, that's a classic. It can do it once. But can you make it do that consistently, a million times? Yeah. It's really hard to do. Yeah.

John Wessel 25:17
With that attended implementation, like, one of the interesting things to think about, and this wouldn't just be data, but, like, legal, a lot of applications, I think it can be really wonderful to end a workflow with an expert. Like, you can imagine being a lawyer and reviewing all the results, and, like, you've got these nice summaries of these things. Same with data: you already know, like, Python and SQL really well, yeah, but, like, you know, you're saving keystrokes, like, you're moving faster, like, this makes a ton of sense. Yeah. And you're an expert, and you have so many reps, you can look at it and be like, yeah, that's a problem. Yeah. And you can run it and get an error from a deterministic system and fix the error. Like, all that works. Really, I don't think there's any problem with it. Yeah, yeah. Well,

Eric Dodds 25:59
There's a company, is it Harvey? I'm gonna look this up. It's LLMs for legal. And they just raised a bunch of money, which makes a ton of sense. Yeah. Series C announcement.

John Wessel 26:14
But the trouble here is, like, legal, financial, tax. Yep. Like Intuit. The desire: oh, we gotta get these in the hands of the people, this is going to be so great. Right. Yeah. Which at some point is probably true. But especially there, and then think medical, right, like, even more scary. I just don't think we're at a point where that works. Right, yeah,

Matthew Kelliher-Gibson 26:38
I think also, it's a situation where we're saying, well, if you have an expert, this is really useful. But there's a temptation to either not give it to an expert, or for the expert to get kind of complacent or possibly lazy, like we've seen with people filing law briefs that were completely made up. And that becomes a question of, will it hold up? Or will the quality go down because people stop learning or trying to keep up with it? You know, if you're already an expert, it's great. It's kind of like the Copilot stuff, right? If you already know how to write code really well, it'll help you a lot. Yep. If you're just starting, it'll help you. But if you want to get really good at it, you need to also code without having a copilot there to help you. Right? Or you will become completely dependent on that copilot, and it'll stunt your ability in the long run. Yeah, it

Eric Dodds 27:37
is interesting. Yeah, I pulled up Harvey, which seems like a really neat company. Yeah. But the Harvey platform, just reading from their website: the Harvey platform provides a suite of products tailored to lawyers and law firms across all practice areas and workflows. What's interesting is that it is designed with humans in the loop. And I think that's where a lot of the aspirational thinking comes in, around what types of things will be able to happen at scale without a human in the loop. Right? Because almost all of the use cases that we just talked about are attended, in an expert context, right? Where someone is knowledgeable enough to know how to use the output, even if it's not 100% accurate. Right. And that seems to be a big challenge. Now, I will say, I think on the customer support side of things, that's gotten really good. I mean, there are a lot of things you can do without a human in the loop. Yeah, you know, so that's at least one area where it has proven to create, like, really non-trivial gains in terms of

John Wessel 28:48
I think with the human in the loop, too, like, it doesn't have to be at a certain spot in the loop, right? Like, the human in the loop could be all the way at the end, and determine, like, hey, that result is not right. And there could be some kind of, like, Python or SQL or something being generated in the middle. And that can still work. It's just, you know, if you're in the loop and you're the expert in Python and SQL, it's easier to tweak, whereas if you're all the way downstream looking at end results, like, maybe that's not right, you can give feedback to the LLM and still probably eventually get what you want. It's just a little harder, because you're further downstream.

Eric Dodds 29:27
Yep. Matt, you’ve been doing a ton of prototyping at RudderStack. And I’ve been, I want to say helping, but maybe I’ve been slowing you down. You’re smiling, no comment. I wish the listeners could see Matt’s face. Yeah, I appreciate that. I appreciate you withholding information. Yes. So one of the things I’d like to talk about first, which I’m personally very excited about just going through these workflows with you, is the infrastructure side of things. I’ll give the listeners a little bit of context here. RudderStack deals with customer data, right? So a lot of the use cases that we help our customers with happen on some sort of entity level: modeling users, modeling accounts, modeling households, these entities that sort of represent a business’s relationship with their customer. And the RudderStack product that helps with that modeling is called Profiles. It helps solve identity resolution, and then makes it really easy to compute features over that identity graph at an entity level. And the output of that generally is a set of tables, or sometimes a single table, that our customers would call their customer 360 table: one row per user entity, whatever, and then everything they know about them. And so you’ve actually been using some early versions of LLM infrastructure that we built into that pipeline. And what’s been interesting to see for me is that you’re actually spending most of your time working on the LLMs, not doing the data engineering side. So can you speak to that a little bit? Because the input and output side of that has been fascinating.

Matthew Kelliher-Gibson 31:25
Yeah, well, so I’ve got to give credit to the engineering team here, because they’ve done a great job of just building that right into the kind of Profiles flow, so that you’re not having to deal with any of that stuff there. And then also, you know, we were building this off of Snowflake’s Cortex and all that type of stuff, which makes it a lot simpler to actually, you know, have a model you’re going to be interacting with, and how it’s going to be sent, and all that. A lot of that you don’t have to deal with. So instead, what you’re able to do is basically take that customer 360 and then be able to say, okay, I’m going to ask you to do whatever task it is. Here are the variables, or the features as we call them, from the 360, and you can inject them directly into the prompt. And you can also go get other static data that you want to use. So you inject at a user level, and you can also get static data from anywhere in your warehouse with just a SQL query, because you can just use a SQL model, right? And that becomes the thing, so that when you’ve got a constant, you know, like a list that you wanted it to pull from or that you wanted it to guess from, you can pull that in. It’s very simple to do within the configuration of how Profiles works. So then what you’re really focused on is just the prompt engineering at that point. Yep. How do I get it to do the thing I want it to do? How do I get it to put it in the right format? Not give me an explanation afterwards, not tell me what a good idea this is, or whatever it’s going to stick in. Right? Yeah. So you’re really just focused on that part there. Which is one of those things where, when I first was getting in and doing it, you’re getting frustrated with the prompt engineering, and then I think you asked me a question,
and I was like, oh, no, yeah, they did a great job, that all worked seamlessly. It’s getting the stupid thing to give me what I wanted that was the fight. Well, okay,

Eric Dodds 33:22
but then talk about the output as well, because that was something, I mean, we can talk about the specific use case, but you and I just wired up a pipeline to put a bunch of this output, at a user level, into one of the marketing team’s tools at RudderStack. Right?

Matthew Kelliher-Gibson 33:35
Yeah, right. So what it ends up doing is it creates its own table from this, that gives you the universal ID that Profiles creates, and the results that you’re looking for

Eric Dodds 33:48
in the end. So you basically have an entity ID, and then the LLM response at a user level. Yeah, yeah. Super interesting. So it streamlined a significant amount of the data engineering piece, because that’s a brutal thing with customer data, right? You know, like you said, making it work in GPT, just sort of uploading the data one time, seems magical, but it’s actually very difficult to do that at scale across, you know, tens, hundreds, well, like thousands of users.

Matthew Kelliher-Gibson 34:24
For comparison’s sake, I have also downloaded, you know, open source models from, like, Hugging Face. And it is a fight to get those to work, just the Python code. I’m dealing with one now that’s like, I’ve spent a day or two and I could not get it working. You’re fighting this problem: you don’t have this package, it’s not working the right way, or you don’t have the right access token, whatever it is. Okay, why is this giving me an error? Versus: look, it’s a YAML file. I wrote down the stuff, I ran it, it ran.
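To make that contrast concrete, here is a purely hypothetical sketch of what a declarative, config-driven LLM feature could look like. This is illustrative of the "it’s a YAML file, I ran it" experience Matt describes, and is not actual RudderStack Profiles syntax; every key name here is made up.

```yaml
# Hypothetical config sketch (NOT real Profiles syntax): declare the model,
# the prompt, and the entity features to inject, then run the pipeline.
models:
  - name: content_recommendation
    model_type: llm_feature          # made-up type name
    llm_backend: snowflake_cortex    # assumed: a managed model behind an API
    prompt: >
      Recommend three articles for a user whose last five pages were:
      {{ user.last_five_pages }}
    inputs:
      - user/last_five_pages         # feature computed in the customer 360
    output_format: json
```

The point is not the specific keys, but that the model invocation, credentials, and data plumbing are someone else’s problem, leaving only the prompt to iterate on.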

Eric Dodds 34:57
Yeah, cuz you’re just interacting with the API. It’s baked in, right?

Matthew Kelliher-Gibson 35:00
It’s all baked in there. Yeah,

Eric Dodds 35:02
yeah. Okay, let’s get really specific and talk about one of the use cases that we selfishly built for my former marketing self. And you can kind of get a sense of why Matt did not answer that question about me helping, you know, on these LLM projects. But one of the interesting things that we talked about a lot on the marketing side was that we had a very deep library of really good content. And we know from just our basic analytics that if people are exposed to a certain amount of that in a certain time period, they’re much more likely to convert, to try our product or request a demo or whatever, right? Which makes all the sense in the world, because they’re being educated about all these things that are really important about our product. But content discovery is very hard, right? Especially on a B2B marketing website: how do you expose that, et cetera. And it also is highly contextual, right? We have a very deep integration library, we have different warehouses that we support, all of this stuff. People generally want information that’s related to the domain of their stack, the things that they can do, and the features that are highly relevant to this use case, this set of particular technologies. So if you think about trying to expose content, that’s actually a pretty difficult problem to solve. Right? Not unsolvable. This has been solved in the machine learning space for a long time. But, you know, the team is me and Matt and an engineer, right?

Matthew Kelliher-Gibson 36:42
And you also usually need a lot of training data for those. It’s not a simple, we’re just going to wire it up and it’ll go, type deal. Right? So

Eric Dodds 36:52
Matt and I thought, okay, well, one of the difficult things, actually, is getting recent browsing behavior at a user level and sort of packaging that up. But we realized, with Profiles, we already have that baked in as a feature, right? I don’t know, what was it, the last 10 pageviews, or over some time period or something? What did we actually end up using?

Matthew Kelliher-Gibson 37:14
It was the last five pages you visited. Okay, last five,

Eric Dodds 37:17
five pages you’ve visited. See how helpful I am on AI projects? You should see your face. And so this is actually something that was really interesting to me. So walk us through prototype one, and then prototype two, because we used two different architectures. We ran all of this on Snowflake’s Cortex, which was pretty cool. But prototype one and prototype two.

Matthew Kelliher-Gibson 37:42
So prototype one was we were going to use an LLM to pick the content based off your last five Yes,

Eric Dodds 37:48
pages, also known internally as the YOLO prompt engineering methodology.

Matthew Kelliher-Gibson 37:55
Yeah, we’ll get to that. So we started this, basically, like, I started in ChatGPT just trying to see, okay, could we get it to output something? Okay, yes, we can get it to output. Yep, I can get it to output in, like, a JSON format, and where I wanted it. Cool. We then got all of our web content, and we actually put it into a table in Snowflake to make it easier to handle. But of course, you’re dealing with context windows, which is one of the things you’ve got to pay attention to. And

Eric Dodds 38:22
I’m fairly certain most of our listeners are familiar with this, but for those that aren’t, could you explain the context window? This is just a really good sort of single player to multiplayer thing: you hit a context window if you go the YOLO prompt route.

Matthew Kelliher-Gibson 38:40
Yeah. So basically, any prompt you write for an LLM, you know, those words are broken up into tokens. And you only have so many tokens that you can put into the prompt before it gets too large and the model itself cannot hold that much, kind of in its memory, so to speak. Different models have different context windows. For example, a lot of the ones that are available in Cortex have about 8,000 tokens. We were using one that had 32,000 tokens, because we needed every one of those 32,000 tokens. There’s another one that has like 100,000. We didn’t go that far, because all these things cost differently, and most of it is based on your compute time. So you want to try to keep that in mind as you’re doing it. I did get a talking-to at one point when they’re like, why is our Snowflake bill spiking this week?
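The budget math behind this, combined with the rough tokens-per-word heuristic Matt mentions a bit later, can be sketched in a few lines. The 1.4 ratio is a rough rule of thumb, not a real tokenizer, and the reserve number is an assumption; use the model’s actual tokenizer for anything load-bearing.

```python
# Rough sketch: will a prompt fit a model's context window?
TOKENS_PER_WORD = 1.4  # rough heuristic for English text, not a real tokenizer

def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on whitespace word count."""
    return int(len(text.split()) * TOKENS_PER_WORD)

def fits_context(prompt: str, context_window: int, reserve_for_output: int = 500) -> bool:
    """True if the prompt likely fits, leaving room for the model's reply."""
    return estimate_tokens(prompt) + reserve_for_output <= context_window

prompt = "word " * 6000  # ~6,000 words, roughly 8,400 estimated tokens
print(fits_context(prompt, 8_000))   # blows past an 8k window
print(fits_context(prompt, 32_000))  # fits comfortably in a 32k window
```

This is also where the cost pressure shows up: bigger windows mean more compute per call, which is how a prototype quietly spikes a warehouse bill.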

Eric Dodds 39:29
Remember that?

Matthew Kelliher-Gibson 39:31
Oops. I actually was helpful in resolving that. So anyways, we kind of did a little bit of a proof of it. But once we got the web content down, I mean, it was never going to fit into a 32,000-token window, right? Because if you go from words to tokens, I actually asked ChatGPT, it’s like 1.4 tokens per word is a rough estimate you can use. So what I ended up doing, actually, with one of the LLMs, was asking it to make summaries of all of our content. So I made another table. I told it to create three keywords, and I had to give it some guardrails, like don’t use RudderStack as one of the keywords, because it really wants to do that. And then to summarize each piece down to, like, under 35 words to get it down there. At that point we still had too much, so we just basically narrowed it to only blog posts. Yep. Yeah, to kind of subset it. So it was like a curated list of the best performing stuff, let’s say. So we started with that part there. And then the hard part became writing a prompt that basically says, we have this prospective customer, and we were putting in things like, they work in this industry, at a company that has this many employees, and maybe their job title. And then we said, here are the last, up to five, pages, because for some of them it wasn’t a full five, you know, and produce a recommendation. Getting to that was a little bit of work, but okay, we can do it. The hard part came once we started running it, because of the nature of it, it’s a little hard to do a one-off test; you have to kind of run the whole pipeline to do it. Yep. And then we had to go to a subset of users, yeah, so we had a subset of them.
And it’s still kind of one of those: you submit the job, you wait for it to finish, you go look at what it is. And that’s where we ran into a lot of our stuff, which was that it loved to pick really generic content. Yeah. Which was super annoying. So that was one of the other reasons why we went with just blog posts, because when you would give it the whole website context, it would be like, oh, the integrations page. Yeah, that’s not super helpful. And I was changing the instructions on it, saying, make them specific. I tried doing stuff around, like, think of three topics this person would like, and now match it to this list that I’m giving you, where it has keywords and a title and a summary. And it gives, like, the RudderStack homepage. Yeah. Okay. So that was probably the biggest fight within it, trying to get it to do that, and trying to get it to do it in a way that a marketer could look at it and say, yeah, those make sense. Yep. Because I remember I showed you the first couple of runs, where I’m like, I don’t think this is working well. And you’re like, yeah, I would not use these. Yep. So that was the big headache I was going through. I tried different models, and we worked with the prompt a bit. And it’s tough, because we were maxing out that context window, and all these LLMs get a little fuzzy in the middle there as you max out the context. We got to a point where I felt it was getting better, and then I was looking through it and realizing it was making up titles and URLs. Yes. Yeah. You give it a list, and it was still making them up. So we ended up putting an ID number on there, and I’m like, alright, at the very least, you’re going to return me this ID number, and I’m going to use the master list to make it match up. And so that ended up being the solution that we had for it. I mean, the other problem we had was, I needed it, I wanted it, to be in a JSON format. Yes.
And it’s one of those things we talked about: it’s really simple when you can look at it once. But when you run the same thing 900 times, there is a percentage of them where they go, here is your JSON, formatted. Yeah, no. Or they put it at the end, and they’re like, here’s why we recommended these. Yep. No, I said, don’t do any of that. Why are you doing it? Right? Yes. Right. So we got to the point where we’re like, okay, this is good enough, we’ll use this. The project ran fine. What I ended up having to do was have a post-processing step. Would you do it as a post hook? I ended up just doing it as a view within Snowflake that cleaned it up, matched it back to the original data, and then reformatted it the way I wanted it.
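The cleanup-and-match-back step Matt describes can be sketched roughly like this: pull the first JSON object out of a chatty response, then resolve the returned IDs against the master content list so made-up titles and URLs can never leak through. The field names, IDs, and master list here are all hypothetical.

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Pull the first JSON object out of a chatty LLM response."""
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in response")
    return json.loads(match.group(0))

# Hypothetical master list keyed by ID; the model only returns IDs.
MASTER = {
    17: {"title": "Warehouse-first CDP basics", "url": "/blog/warehouse-first"},
    42: {"title": "Android SDK deep dive", "url": "/blog/android-sdk"},
}

def resolve_recommendations(raw: str) -> list[dict]:
    """Map model-returned IDs back to real content; unknown IDs are dropped."""
    ids = extract_json(raw)["recommended_ids"]
    return [MASTER[i] for i in ids if i in MASTER]

reply = 'Here is your JSON, formatted:\n{"recommended_ids": [42, 17, 99]}\nWe chose these because...'
print(resolve_recommendations(reply))
```

Because titles and URLs come from the master table rather than the model, a hallucinated ID (99 above) simply drops out instead of producing a broken link.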

Eric Dodds 44:07
yep. Now, okay, so that was a fascinating experience for a number of reasons. But aside from the prompt craziness, once we, and by we, I mean you, yeah, that’s implied, it’s the royal we, yes, got it to a place, it was pretty amazing to just hook a RudderStack Reverse ETL pipe up to the cleanup view that you had. And the reason we wanted JSON, actually, which is just bonus points for any listeners who have to deliver data down to stakeholders: the reason our marketing team wanted it in JSON was because they were going to use liquid tags in their email tool. For anyone unfamiliar with liquid tags in an ESP: if you have JSON objects, you can use Liquid in the code editor of an email template and pull in content dynamically from a user record, right? So, like, if you wanted to say, hey, first name, right, you have the Liquid syntax for that. And that’s pretty common among ESPs. Right? And so we wanted this at a user level in JSON, so that our marketing team could use Liquid to just set up a template and read straight from the content recommendation. Right? And, I mean, how long did it take us to actually wire that up and get it in the email tool? 30 minutes or something? Yeah.

Matthew Kelliher-Gibson 45:44
And it wouldn’t have taken that long, except I had to restart my browser halfway through and redo part.
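The shape of the hand-off they describe: a per-user JSON trait lands in the ESP via Reverse ETL, and a Liquid template iterates over it. The trait structure below is hypothetical, and the renderer is a toy stand-in just to show the data flow, not real Liquid.

```python
# Hypothetical per-user trait, synced to the email tool via Reverse ETL.
user_trait = {
    "recommendations": [
        {"title": "Android SDK deep dive", "url": "/blog/android-sdk"},
        {"title": "Identity resolution 101", "url": "/blog/identity-resolution"},
    ]
}

# In the ESP, a Liquid template would read the trait, roughly like:
#   {% for rec in recommendations %}
#     <a href="{{ rec.url }}">{{ rec.title }}</a>
#   {% endfor %}
# Toy stand-in renderer to show what that expands to (not real Liquid):
def render(recs: list[dict]) -> str:
    return "".join(f'<a href="{r["url"]}">{r["title"]}</a>' for r in recs)

print(render(user_trait["recommendations"]))
```

The marketing team only ever touches the template; the pipeline keeps the JSON trait fresh per user.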

Eric Dodds 45:51
Yeah, so from a workflow standpoint, I guess one of my takeaways from that, and I want to get to prototype number two, because that’s even more interesting, especially with the stuff that Cortex did, it was pretty nifty. But from a workflow standpoint, going from, we need the last five pageviews for all of these different users, to, we’re generating input, we’re interacting with an LLM running a prompt or a series of prompts, we have an output table, we wire that up, and we actually deliver this output to user records in an ESP, was pretty dang smooth overall. Which is impressive. Well,

John Wessel 46:31
and just from listening to the conversation, it’s always interesting to me, hearing other people’s recounting of projects. Because what I hear a lot of the time is, well, we spent the most time on the part that wasn’t the hard part. But in this conversation, I think that was not true. I think you actually spent the most amount of time on the part that was the hard part. What I mean by that is, usually it’s like, we spent so much time trying to get access to this thing, we spent so much time trying to set up a pipeline that I thought would be easy to set up, but this and this. So yeah, there’s actually a positive

Matthew Kelliher-Gibson 47:04
The API just works the way I want it to work. Isn’t that super interesting? Yeah.

John Wessel 47:09
Right. So I think it’s a real positive thing. It’s like, okay, dealing with LLMs was annoying in a lot of these ways, but at least your focused effort was on the piece that was going to be difficult. Yeah.

Matthew Kelliher-Gibson 47:22
And even the beginning of it, because we were just using RudderStack web data, that came together very quickly. Right. So that all came together: the identity stitching, all of that just seamlessly happened for us. Yeah. So it made it very easy to focus on just the kind of novel thing you’re doing, man,

John Wessel 47:44
Man, we’re not even paying you to make that comment about RudderStack. You didn’t hear about any of this, yeah. And wow, I’m just thinking about my experience with projects, like, Dan and I spent most of the time on permissions for this, or a bunch of time on that API that broke and created a support ticket. Which is funny, right? Because, I mean, I’ve been prototyping some things, and even when you think to yourself, oh, I can just grab that really quick, there is a lot of wisdom in being like, alright, has this already been abstracted by somebody else? Yeah. Like, let me just do that. And then, you know, later on, we can go back and address it, and maybe we’ll write that ourselves. Yeah, it’s a good thing to use those abstractions when you have them. 100%. Yeah. And honestly, start there. And then if you have a need, because, say, maybe it’s a costly abstraction, look at it backwards, like, alright, we’ll work out the cost by replacing this abstraction, you know, if there’s some kind of ETL tool or something like that. Well, yes. It works so much better that way. Totally. Yeah,

Eric Dodds 48:48
I totally agree. I’ll buy you lunch at that point.

John Wessel 48:50
All right, thanks.

Eric Dodds 48:51
Thank you. No, in general, I really agree. I think that, you know, and this is far from statistically significant, just talking with friends and peers, one of the most difficult parts about this is getting your data to a point where you can fight with the LLM to get what you want. It is really time consuming. And I think a ton of companies see the potential to do interesting things, but it’s just so much work to get to a point where you have baseline input that allows you to focus your time, to your point, which I think is a great point, on the hard stuff. Yeah, it’s

John Wessel 49:33
a nice luxury to have. And it’s one of those things where we just don’t know, like, if GPT-4.5 or 5 or whatever comes out and all of a sudden it’s like, wow, look what they just did, then you’re set up for it. We don’t know what we don’t know about the speed, right? That’s, yeah,

Eric Dodds 49:51
well, and I mean, to Barry’s point, you know, let’s just quote Barry a bunch this episode. He was like, the thing about new models is some stuff gets better and some stuff gets worse. So yeah, that’s the other challenge, right? If you do have a benchmark, you have to actually go back and redo it.

John Wessel 50:08
There’s a reason for that, right? Because they’re optimizing for two things. Obviously, they’re pushing forward with making it better. But they’re also optimizing for cost, because they have a lot of cost pressure. Yeah. So if you wonder, why does it feel like it’s doing both? I think the cost pressure makes it maybe worse in some areas sometimes. And then obviously you have your move-forward, future pressure that everybody has. Totally,

Eric Dodds 50:27
totally. Yeah. And also the yeah, whatever. That’s a separate subject.

Matthew Kelliher-Gibson 50:31
It’s just hard to evaluate across everything. Very hard to evaluate.

Eric Dodds 50:35
And then it does make me think about the billions of dollars of, you know, subscription revenue just for ChatGPT. Right. Yeah. Right.

Matthew Kelliher-Gibson 50:44
But they have billions of dollars in development costs every year, too. It’s true

John Wessel 50:48
in compute. Yeah,

Eric Dodds 50:49
It’s pretty wild. Yeah. Okay. Second version of the prototype. So

Matthew Kelliher-Gibson 50:56
the second version of the prototype, which came about partially because we were talking with Snowflake, and one of their solutions engineers recommended that we try a more RAG approach. Which, at first, I was slightly hesitant about, because, you know, that’s still pretty new, and I didn’t want to have to manually set it up, sure,

Eric Dodds 51:13
and we kind of talked about that. But also, we’re in the prototyping phase, right? Which is why the YOLO prompt method is really helpful for getting an end-to-end use case running.

Matthew Kelliher-Gibson 51:20
Yeah. I would also say, if you can avoid YOLO prompting, like, think about it a little bit before.

John Wessel 51:26
Can I get an operational definition for YOLO prompting, please, Brooks?

Eric Dodds 51:31
Can you put that in the show notes?

Matthew Kelliher-Gibson 51:34
Anyways, so I went in and started looking at that. And the nice thing is that for RAG, which is retrieval-augmented generation, you basically have to take whatever specific knowledge you want it to be drawing from and put it into a vector space, you know, an embedding, an artificial multi-dimensional space. The nice thing is that Snowflake, through Cortex, has functions that will do that and put it into a stable embedding space for you. Because if you’ve ever done anything with any kind of text embedding in the last 10 years, a lot of them were very much, the embeddings changed as you added more to them. Yep. So, you know, it was one of those, oh, we’ve got this stable state, but if we add in these 100 more rows, it’s going to change everything. Yep. You don’t have to worry about that with these. So that made it very simple. Because then all I had to do was add a column to our table that had all of our content with the summaries and keywords, and I just embedded the title, keywords, and summary. Boom, that’s done. Then I went back to do a little bit of cleanup on how we had done our browsing history. Yep. Just because, if it was a sub page, it would say the name of the page, and then we’d have, like,

Eric Dodds 52:55
a pipe, and RudderStack.

Matthew Kelliher-Gibson 52:56
Right, because it is pulling from the page title, right? Yeah. So you’d get, you know, five results, and they’d all say RudderStack. And when you’re going to be doing it in an embedded space, that’s not helpful. Yep. Yep. So I cleaned that up. And then all I did was try a couple different distance metrics, and I ended up just using a straight Euclidean distance measure, which is one of the functions that Snowflake offers you out of the box. And just saying, measure the distance in this embedding space, give me the three closest. That ended up being better in the end; it got us more specific results, because it was looking for things that were similar to the actual web pages you were going to.
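Stripped of the warehouse specifics, the retrieval step Matt describes is: embed the user’s recent browsing, then take the three content items whose embeddings are closest by Euclidean distance. A minimal pure-Python sketch, with made-up 3-dimensional vectors standing in for real embeddings (which have hundreds of dimensions):

```python
import math

def euclidean(a: list[float], b: list[float]) -> float:
    """Straight Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical tiny content embeddings, keyed by a made-up content slug.
content = {
    "android-sdk-guide":  [0.9, 0.1, 0.0],
    "kotlin-sdk-guide":   [0.7, 0.3, 0.1],
    "warehouse-costs":    [0.1, 0.9, 0.3],
    "integrations-index": [0.4, 0.4, 0.4],
}

def top_k(query: list[float], k: int = 3) -> list[str]:
    """The k content items closest to the query embedding."""
    return sorted(content, key=lambda name: euclidean(query, content[name]))[:k]

user_history_embedding = [0.85, 0.15, 0.05]  # e.g. the embedded last-five-pages text
print(top_k(user_history_embedding))
```

Note there is no LLM call here at all; nearest-neighbor lookup over embeddings is the whole recommendation, which is the point made just below.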

John Wessel 53:40
and we’re talking like, text similarity, like, here’s this text, right? Yeah, common characters or whatever? Yeah. Well, more or less? Okay,

Matthew Kelliher-Gibson 53:51
We’ll go with that. And that was really interesting, particularly because you don’t actually need the LLM at the end of that. Yes, for this particular use case. Because usually what you do in RAG is you take whatever those top chunks are that you pull out, and you put them into the prompt as context, kind of similar to what we were doing by putting in all of our blog content summaries, except in this scenario you would limit it to just the ones that were relevant. So you’re not blowing up the context window, and you’re feeding it, you know, a curated list. And the hope is that it’s not going to hallucinate when you do that, because you’re basically giving it the information that it’s pulling from, and it kind of summarizes it. Well, in our case, with this specific use case, that would have made no sense, right. We

Eric Dodds 54:44
were just trying to serve content. Yeah, totally. The results are pretty awesome, though, because when we were talking about the use case before, we were saying, okay, this is pretty specific to someone’s particular interests: you know, what warehouse are they running, which integrations, all that sort of stuff, right? And so if you take their browsing history, generally in the last five pageviews they’ve looked at something that’s pretty relevant. And so when you use a distance function to pull in related stuff, it tends to be really relevant. Right. And again, we’re talking about a curated subset of blog posts, right? So it’s like, okay, they probably wouldn’t have found this content on their own, but it’s highly relevant. So it worked really well. And it’s the type of thing where a lot of this marketing content has a long tail to it. It’s a lot of very specific stuff that’s hard to get in front of the right person. Yeah. But if you’ve gone to our Android SDK page, right, it will pull three articles that talk specifically about using Android. Yep. And so a very long tail piece of content, where it would be very hard to figure out who the right user is to target, you can pull it right in front of them. Yep. Yeah, so it’s pretty cool. I mean, of course, for this particular use case we didn’t need to do any generation, any sort of generative stuff, but it wouldn’t be difficult to add that. So it was pretty awesome to see. That was actually phenomenally fast. And

Matthew Kelliher-Gibson 56:27
I will say that I do think that using the vector databases and the embeddings is an area where we’ll see more stuff come out, without necessarily the generative side on top of it. Or, as we were talking about earlier, John called it the LLM icing on top.

Eric Dodds 56:46
I agree. I mean, that is kind of an interesting positive consequence of this, right? Even if you think about things like vector-based search, for example, as opposed to just standard index search, you don’t necessarily have to have a generative step at the end of that for it to be way, way better than standard index search. Yeah,

Matthew Kelliher-Gibson 57:09
and for a lot of people, when we talk about gen AI, those two components kind of get slammed together. Yep. There’s all of this text, or chunks, or tokens, or whatever, that’s in this vector database that’s holding its position; that’s where you get the semantic knowledge from. Yeah. Yep. And then there’s this deep learning net on top of it that’s basically the thing that’s generating. That’s where that magic, so to speak, happens, that gets you the generation. But in a lot of use cases, you don’t necessarily need that. Yep. And, you know, just another plug for Snowflake and Profiles there: you can do text embeddings directly within a Profiles project. Yeah, yeah. Oh, yeah. That’s an interesting feature. You could just make it as a feature, because it’s just a function call, right? Once you’ve got it all aggregated, you can do that and experiment with it. That’s

Eric Dodds 58:03
super interesting. Yeah. So you essentially, because entities in Profiles are agnostic, you could actually create, like, an entity for the embeddings. You don’t even

Matthew Kelliher-Gibson 58:14
need to create an entity for the embeddings. Like, you know, you could take our existing Profiles project. We have a feature that is your browsing history. Okay, so you make another feature, and you feed that in there. And now, within either your customer 360 or another table associated with it, I have the text embeddings of what your browsing history was, and now I can use it wherever I want. Yep. Cool.

John Wessel 58:40
So you talked about your context window limitations. Yeah. Have you thought of other ideas for the customer 360 space where maybe the result is Boolean? Like, some kind of feature that’s a one or zero? Like, you know, is-new-customer would be an example, obviously that’s an easy one. But the thought of, like, is the customer currently angry, one or zero? Do some sentiment analysis? Or do you think there are applications like that, where the context window maybe is less of a thing? Well, it would be for a

Matthew Kelliher-Gibson 59:15
long blog post. So I think what you’re talking about, just using an LLM in your 360 build, yeah, it’s a different one, because what really blew up our context window was the static blog content, like 3,000, 5,000 tokens each. But what I think can be super useful within that is to be able to do those things, because you can bring in all that information. You could hook up to your 360, like, support tickets, and then have the LLM summarize: how did this customer enter, how did they exit, those types of things. Were they happy, sad, sure, angry, confused,

John Wessel 59:54
whatever. Like, most recent interactions, insert an emotion, right? Like, yeah, I

Eric Dodds 59:59
didn’t talk about how even that is, like, part of the input, right?

Matthew Kelliher-Gibson 1:00:02
Or even as an output. Something like, you know, their last interaction with customer success: did it get resolved, or is it still hanging out there? Right, right. So you have a lot of those types of things you can do. I mean, we haven’t done this yet, but I’ve told Eric, I’ve told you this one: I still think there’s things like looking up contracts, yep, as an input, right? So you have an ID and you have a contract, and then being able to extract information out of that at a customer level, so that
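The ticket-summarization idea above could be sketched roughly like this: ask the model for a small, fixed schema so the answer drops straight into a customer-360 column as a Boolean or categorical feature. The `call_llm` function is a hypothetical stand-in, stubbed with a canned reply so the sketch runs; in practice it would hit whatever model provider you use.

```python
import json

def call_llm(prompt):
    """Hypothetical LLM call, stubbed with a canned reply so this
    sketch runs end to end. A real version would call a model API."""
    return '{"sentiment": "angry", "resolved": false}'

def ticket_features(tickets):
    """Summarize a customer's support tickets into a tiny fixed schema
    (sentiment plus a resolved flag) suitable for a 360 feature row."""
    prompt = (
        "Read these support tickets and answer ONLY with JSON like "
        '{"sentiment": "happy|sad|angry|confused", "resolved": true|false}.\n\n'
        + "\n---\n".join(tickets)
    )
    reply = json.loads(call_llm(prompt))
    return {
        "last_sentiment": reply["sentiment"],
        "is_angry": 1 if reply["sentiment"] == "angry" else 0,
        "ticket_resolved": 1 if reply["resolved"] else 0,
    }

features = ticket_features(["Still broken after 3 emails!", "No reply yet."])
print(features)
```

Because the inputs are a handful of short tickets rather than long documents, the context-window concern raised above is much less of an issue here.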

Eric Dodds 1:00:30
you can add that, like we were talking about inputs earlier, from an LLM standpoint. But this is one of those areas, Matt, you and I have talked about this a ton, right? Where this is arguably one of the best possible use cases for an LLM as part of customer data and the pipeline, but it is not sexy: in your customer 360 pipeline, running an LLM to generate a semantic feature over unstructured data and automatically appending that to the user, like each row for each user, each entity, is so, so annoying to do if you don’t have a streamlined pipeline, right? I mean, really think about a use case where a customer has thousands of enterprise contracts, and each one of them is customized. Sure, you start somewhere, but there’s all these allowances, and there’s, like, users and all sorts of customization. And imagine analyzing each of those and extracting, because you know roughly what things you made allowances on, but nobody remembers what you did, especially if you have, like, a million products, you know?

John Wessel 1:01:37
That would be great. But you could run it through and say, I know roughly, like, sometimes we made allowances on user count, sometimes we made allowances on whatever. And you would like to have that fixed: there’s only three, four, five things that are our levers. But then to have it run through each of the contracts and pull out, like, oh yeah, allowance here: they have a max of 100 users instead of the default 75. And then, like, tag that in Salesforce. Yeah, I mean, that’d be a superpower. Yeah. Yeah,
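A sketch of that “few known levers” extraction might look like the following. All of it is illustrative: the default values, the `call_llm` stub (canned reply so the sketch runs), and the lever names are invented for the example, not anyone’s actual contract schema.

```python
import json

# The handful of negotiable levers and their standard values
# (illustrative numbers, not real defaults).
DEFAULTS = {"max_users": 75, "storage_gb": 100, "support_tier": "standard"}

def call_llm(prompt):
    """Hypothetical extraction call, stubbed with a canned reply."""
    return '{"max_users": 100, "storage_gb": 100, "support_tier": "standard"}'

def contract_allowances(contract_text):
    """Extract the known levers from a contract and keep only the ones
    that deviate from the defaults -- those are the allowances nobody
    remembers granting."""
    prompt = (
        "From this contract, extract JSON with exactly the keys "
        f"{sorted(DEFAULTS)} and nothing else:\n\n{contract_text}"
    )
    extracted = json.loads(call_llm(prompt))
    return {k: v for k, v in extracted.items() if v != DEFAULTS.get(k)}

print(contract_allowances("...Customer may provision up to 100 users..."))
# {'max_users': 100}
```

The deviations could then be written back to a CRM record (the “tag that in Salesforce” step), turning thousands of bespoke PDFs into a few structured columns per customer.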

Matthew Kelliher-Gibson 1:02:03
That’s where I think a lot of it is, especially for customer data: it’s that unstructured-to-structured. Yeah, very good. That’s really good. Yeah.

Eric Dodds 1:02:11
Yeah. It’ll be fascinating. All right. Brooks is giving us the signal. He’s back from leave, obviously, because we’re getting the signal that it’s time to wind it down. Boy, the quality of the show has already gone back up, and not just because we had the cynical data guy, you know, two times. I know, not back to back, but two times in a month, and you were positive today. Yeah.

Matthew Kelliher-Gibson 1:02:36
I know. It was weird. Yeah,

Eric Dodds 1:02:37
no. Yeah. Well, let’s get you back into your groove. Yeah, I was gonna say, we’ll get your alter ego back. All right. Thanks for joining us. Subscribe if you haven’t. We plan to have more fun shows like this where we get really practical with Matt and other guests. We’ll catch you on the flip side. The Data Stack Show is brought to you by RudderStack, the warehouse-native customer data platform purpose-built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.