Episode 222:

The Future of Data Modeling: Breaking Free from Tables with Best-Selling Author, Joe Reis of Ternary Data

December 31, 2024

This week on The Data Stack Show, John is joined by the Cynical Data Guy (Matt Kelliher-Gibson) as the pair welcomes back Joe Reis to the show. Joe discusses his new book on data modeling and shares insights on the evolution of data modeling practices, emphasizing the need to understand various data types and use cases. The conversation explores challenges in data education, highlighting the limitations of traditional academic programs and the importance of foundational principles over specific tools. The group also delves into hiring practices, stressing the value of character and continuous learning. The episode concludes with reflections on the impact of AI, the future of data education, and more.

Notes:

Highlights from this week’s conversation include:

  • Joe’s Recent Projects and Work (0:55)
  • Joe’s New Book and Inspiration for Writing It (4:39)
  • Challenges in Data Education (7:00)
  • Internal Training Programs (10:02)
  • Creative Problem Solving (17:46)
  • Evaluating Candidates’ Skills (21:18)
  • Market Value and Career Growth (24:03)
  • AI’s Impact on Hiring (27:47)
  • Content Production and Quality (31:56)
  • The Evolution of AI and Data (34:00)
  • Challenges of Automation (36:12)
  • Convergence of Data Fields (40:26)
  • Shortcomings of Relational Models (42:09)
  • Inefficiencies of Poor Data Modeling (47:10)
  • Discussion on Resource Constraints (51:50)
  • The Role of Language Models (53:13)
  • AI in Migration Projects (57:00)
  • Joe’s Teaser for a New Project (59:05)  
  • Final Thoughts and Closing Remarks (1:00:07)

 

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcription:

John Wessel  00:28

All right, welcome back to The Data Stack Show. We’re here today with Joe Reis, a second-time guest. Joe, welcome to the show. So bye, guys doing good. Also, Eric is out today, and we’ve got the Cynical Data Guy, Matt, here as co-host,

Matthew Kelliher-Gibson  00:44

just sliding on over from the couch.

John Wessel  00:48

Glad to have you here. So Joe, yeah, catch us up a little bit on what you’ve been up to the last few months, since we last spoke,

Joe Reis  00:54

not traveling as much, which is good. Yeah. So I’ve been going non-stop globe trotting, which happens in the spring and fall. So I’m just back home in Salt Lake City, working on some projects right now, and that’s about it. It’s just been nice to just, I mean, definitely thankful to travel a lot and see some cool places and meet awesome people, but it’s good to be back for a bit. No,

Matthew Kelliher-Gibson  01:17

Yeah, it sounds really nice. Okay, Joe, we spent a few minutes chatting before the show. I’m excited to dig into a little bit about the book you’re writing, and just maybe get into some cynical takes on what you’re seeing out there in the data world.

Joe Reis  01:34

Yes, well, yeah, I don’t think it’s any secret I’m working on a new book right now. It’s on data modeling, and I can get into why that is, but what the book is about is it’s an end-to-end treatment of data modeling across different use cases, whether we’re talking applications, analytics, machine learning, across different modalities of data, whether we’re talking structured data, semi-structured, unstructured. The goal of the book is to really equip practitioners with an understanding of data modeling end to end. And so I think it’s what I consider to be sort of the next phase of where data modeling is going. It’s not just about tables anymore. It’s much more than that. We’re working with different types of data across many different use cases. And so the goal of this book is obviously to equip practitioners with, you know, a body of knowledge of the existing techniques, as well as hopefully introducing some new ones as well. The working title is Mixed Model Arts, which is sort of a play on words of mixed martial arts, so you can kind of understand where the thread is coming from. And I think the inspiration comes back. I grew up in the 80s, you know. I grew up watching really trashy TV like Kung Fu Theater and wrestling and, yeah, all this stuff, and boxing and, you know. And I think back in the day combat sports, fighting was very one-dimensional. You could be a boxer or a pro wrestler in your speedo, or a kung fu master in the mountains in China or something. But there was always this notion that, you know, the questions are always like, who would win a fight? Like, would Bruce Lee beat Mike Tyson, right? You know, in a boxing match, or, you know, or under some set of rules. But UFC, you know, they came around the early 90s. There are obviously other things before, like Vale Tudo in Brazil, which is early mixed martial arts. But UFC, I think, was the mainstream. They lit off the notion of being a one-dimensional martial artist.
Fast forward to today, and you couldn’t tell me that, you know, the best boxer in the world, if that person gets into the ring in UFC, they would do very well, or anyone from only a one-dimensional sport. So think about that. But if you take sort of the parallels to this with data, we’ve been stuck in the past. We’ve been stuck with these notions that, you know, there’s this one true way to model data. You know, there’s one technique to rule them all. You know, I think we’ve been, like I said, stuck in a table-centric view of the world and sort of, you know, it’s almost akin to thinking the universe revolves around the Sun, right? You know, the world’s moved on. You know, we have endless amounts of different ways of storing and querying data. We have different ways of moving data. Streaming is becoming increasingly popular, and has been for a long time. Machine learning is everywhere, and now it’s AI and so, you know. But I feel like hopefully the world of data modeling and data in general starts catching up to where we are. So that’s part of the effort of the book.

 

John Wessel  04:18

Yeah, awesome. All right, Joe, so I got to ask, before we dig in a little bit more on the book, what inspired you, like, at what point did you decide, like, I want to write, I want to write books? Because you were, if I remember, a data scientist a while ago, then you kind of evolved from there. But at what point were books part of the equation for you?

Joe Reis  04:38

I mean, I’ve always wanted to write a book. I’ve been writing, you know, blogs forever, but I think during COVID, everyone had their little project. You know, learn to brew beer, bake bread, or become a gardener, something, knitting, or

John Wessel  04:53

whatever you want to do. Yeah, right? Yeah, sourdough. That’s another sour

Joe Reis  04:57

That’s a big one. Everyone’s way into sourdough. But, you know, I think Matt Housley, my co-author and business partner, and I decided to write a book on data engineering. And why would we do that? I think that when we looked at a lot of the books out there, we realized there wasn’t a book that really gave data engineering a fundamental treatment from first principles. Typically, books approached data engineering, which hadn’t really been defined, from the perspective of teaching it by using tools, you know, and platforms. And so we felt like, take a step back, maybe try and give a sense of order, and, you know, a framework to think about the field of data engineering, which I don’t think had existed before. And so, you know, I think the book is just, it’s a primer, right? If you’re gonna do data engineering, we wanted to write a book where, if you read it, you could at least be equipped, I think, with a pretty basic knowledge of what it would be like to operate as a data engineer. But we wrote it in a way that was agnostic of technology or tools. Because my personal opinion is, I don’t believe that in this day and age, you need to write books on technologies or tools. I think perhaps use them as examples, but at the rate things change, those are better off being courses. I just think that with books, it’s fundamentally different. It’s a difficult exercise to write a book, especially a book that’s actually simple, right? I think one of the criticisms of Fundamentals of Data Engineering is like, well, I knew all this stuff before. I was like, well, that’s great. Good for you, but did you write it all down too? That’s the other question. That’s the other part. And I could guarantee, unless these people aren’t so, you know, and I think, but to write something simple, to bring simplicity to complexity, I think is very difficult to do. Yeah, I think that was a challenge. But yeah, anyway, that’s why we wrote the book.

Matthew Kelliher-Gibson  06:40

I think also, to kind of go to your point there, there’s a lot of stuff, especially the kind that’s technology-specific, that has an expiration date on it. Those things are going to change over time, and the principles are more important for you to learn. So it doesn’t matter what the specific tech is, you can always adapt to it.

Joe Reis  07:00

Oh, absolutely. Like, right before we hopped on, I was advising a university on their data curriculum. And, you know, they had things like data mining and big data with Hadoop and all this. It was like, why are you teaching this in this day and age? This is antiquated, you know, so. But yeah, the underlying principles are still there. They’re still widely used. But, you know, technologies come and go. I would say, you know, Hadoop, right? If you’re to teach a class in Hadoop these days, I would say you’re probably, yeah, students out of their money. So teach it as a historical artifact, but it’s sort of like teaching, I don’t know, how to churn butter manually or something, right? But sure, it has a place. But

John Wessel  07:38

Anyway, well, it’s funny, it’s a funny space, right? Because you’ve got, like, commercial SaaS companies that, like, if they were to produce literature, they’re going to produce, like, how to use their tool, courses on how to use their tool. If you’re in academia, then you have, like, we want to keep this really theoretical. And then often you have people that, like, learned R or learned Hadoop or something, and they want to keep teaching that class over and over again. They don’t want to reinvent the class every year necessarily, because

Matthew Kelliher-Gibson  08:05

That’s a lot of work. Speaking of, when I was in grad school, they taught us SAS, oh, sure, yeah, where you have a partnership,

John Wessel  08:09

yeah, yeah. But

Joe Reis  08:12

This is a tension in academia, especially, right, where tenured professors don’t want to revamp their courses because it takes away from the research time? Sure. Yeah, right. But then as I was telling this university, I said, that’s great and all, congrats on your research papers, which probably 10 people are gonna read. Meanwhile, you know, you have students, especially international students, who don’t get, you know, any discounts on their tuition, who are paying top dollar, right, for an education. And my impression is, if I’m a student paying this kind of money for this education from this institution, give me the best education that’s relevant to helping me, you know, get a good start in my career, right? So, which

Matthew Kelliher-Gibson  08:53

is also interesting, because so much of the data programs that you see out there, I remember when I first started being a manager, and I had to hire for this. And that was when the, like, the data science programs first came out, yeah, and they were so application-based, like, this is how you call this library and then run this code. And there was almost no theoretical understanding behind it, like, a boot camp style, yeah, but, like a boot camp, but for, like, 10 times the price, interesting,

Joe Reis  09:24

It’s crazy, right? Data science is a big one. That was a big one, and everyone wanted to jump into it, because it was the sexiest job of the 21st century. You know, if you weren’t doing data science, you’re just gonna be left behind. And, yeah, things are bad. Yeah, yeah. And that

Matthew Kelliher-Gibson  09:37

was, like, when I was hiring. We actually made the decision after interviewing a bunch of people for data analyst roles that we said we weren’t going to take anyone from a master’s program because we would have to unteach them so much information, it was easier to take someone just out of undergrad and teach them how to do it, right? Wow,

Joe Reis  09:56

That’s crazy. So did you have your own, like, apprenticeship program in place, then? Oh,

Matthew Kelliher-Gibson  10:02

yeah. So we had a whole, I was a crazy first-year manager. I developed the entire, like, analyst training program with kind of the top two people on my team. So we had it because we had a data scientist who could teach you a lot of stuff. He was in what was at the time called NLP. Like, I had a data engineer who was a, you know, former DBA and full-stack developer. So he really understood, and he could come at a lot of stuff with data from, like, set theory and things like that. So he was really strong. And then I basically sold people on it by saying, I’m gonna teach you how to think, and I’m gonna teach you more about how business works. And so it’s awesome. I had a, like, 24-book reading curriculum. I took them through it in, like, a year. Wow, damn. It was intense, but

Joe Reis  10:47

that’s what it takes, I think, right? And it’s cool you did that. I don’t think a lot of companies or managers have that initiative or insanity to do something like that, in a good way, you know. But what you often see is, if you don’t have a standard body of knowledge and standard expectations, I think what you find is, you probably know what happens. People do all kinds of crazy stuff on their own, and they make up, they fill in the blanks, right? And you can’t blame them. They hadn’t been trained otherwise, you know, and the manager has only themselves to blame at the end of the day. So, but standardization is hard, and skills and knowledge have to be taught. It’s commendable. So, and I think there’s

Matthew Kelliher-Gibson  11:25

a little bit of a curse of knowledge in that, for a lot of people, they’ve been doing it for a while, and they’ve built up that kind of background knowledge and a lot of things. And then they come back to it and they say, like, oh, well, you don’t need to do all this stuff because, you know, it’s really just these simple things. And you don’t realize that, like, now you have a lot of guardrails and a lot of ideas that keep you on the right path, because you had all this other stuff that you went through and all this other training. Right?

John Wessel  11:52

Question for you, Matt, how did you pitch that internally? Because that sounds like a pretty heavy investment from the company and these people, and that seems like a challenge.

Matthew Kelliher-Gibson  12:03

I kind of just didn’t tell them. Yes. I said we were going to just train people internally. And everyone went, okay, yeah, because I kept them kind of blind to the details. It was also saving us, like, $90,000 in the process off our budget. So once people saw that, they just kind of stopped, yeah,

Joe Reis  12:23

bad hires and stuff and return

Matthew Kelliher-Gibson  12:25

We’d, so I initially pitched it when we had a certain budget. Long story short, I was on a team that was, like, four people. I got promoted, and two of the people got promoted elsewhere. So I had people that were, like, fairly expensive, that I was going to be able to backfill, along with adding a few new positions. And, like, one of them originally I pitched as a junior data scientist, and then we started interviewing people, and I went, oh no, no, that does not exist. Okay, forget about that, right? So we downgraded two of the positions to data analysts, which meant I didn’t have to pay as much for them.

John Wessel  12:59

But titles don’t matter right now. Titles don’t matter.

Matthew Kelliher-Gibson  13:07

So because of that, and there had been some internal pressure to, like, well, like the two people, I hired two really strong senior people, and I actually had HR trying to talk me out of it, because they’re like, well, yeah, they’re really good, but you know, maybe it would be better if we hired someone, like, a little less skilled here, a little longer. And I’m like, that’s insane. I’m not doing that. So it was kind of that trade-off that I went with, and I just kind of generically told him, I’m going to do stuff. And I told my boss, hey, you’re going to see some expenses from me for some books periodically. Yeah. And he went, okay. And he was a new VP, so he was too busy to, yeah, really look at me. You’re saying you hired at the higher

John Wessel  13:47

end of like, the data analyst, like band, essentially, yeah.

Matthew Kelliher-Gibson  13:51

So we had some people that we liked very much, they were at the very top of the data analyst band, and they got promoted into other places. So then I hired someone out of college who I could pay, like, half the price for. And I hired another person who was actually a woman who had not worked in a while because she had raised her kids. She had gone back and she had, like, got an associate’s degree in programming, was a database person, and was the strongest data analyst I think I’ve worked with ever since. Nice. That’s cool. So, we literally pitched it. We changed the whole job posting to be like, here’s the 10 things you’re going to do in this job, like writing documentation and stuff like that, just to make sure that everyone was very clear about what you were going to

John Wessel  14:41

be doing here. Yeah, so Joe, I’m curious, like, you’ve worked at at least a couple services-type businesses, like, how did you guys approach hiring? I mean,

Joe Reis  14:49

it’s an interesting one, right? I think hiring is one of these things where I’ve hired a lot in the past, you know, outside of my own company, right? So, but I feel like you never know. I think your example of this person who was, you know, a stay-at-home mom, which is, you know, I think a very difficult job, probably harder than, definitely, yes, yeah. I think some people still don’t get it, but whatever. I always look at aptitude and things like character and your ability to continuously learn. And, you know, I think, are you an add for the team? In that regard, I very rarely look at your credentials in terms of, you know, what school you went to, what big logos or companies you happen to have on your resume. To me, those are great signals. But I’ve just known too many people who have gone to the finest universities in the world, who have had impressive titles at the biggest companies in the world, who I think are just, yeah, they’re kind of douchebags. So, you know, I just think they’re not the kind of people I would have hired. I think there’s a lot of grief, and I can swear on your podcast, right? Well, there’s a lot of, I would say there’s just a lot of bad behavior, right? So the only thing I really screen for is, do you have the ability to, I think, be a good person with your teammates, you know, add value and continuously learn, and are you a person of good integrity and character, things that really can’t be taught? How do I assess this? I just get to know you. I don’t know. I mean, it’s probably not scalable, but to me, it’s, you know, I’ll look at your social media. I’ll kind of see what you’re about.
I’ll, you know, I think that if I hear words a lot, like, if the conversations are generally about, you know, yourself and kind of what you’re trying to get out of stuff, then I assume that you’re very driven, you know, to look out for yourself, sure, and not contribute to the team. Or if it’s motivations like, you know, trying to climb the, you know, the corporate ladder and stuff, and you have a history of stabbing people in the back to get there, then that’s not quite the kind of person that I would want to work with. And I’m sure there’s plenty of fine, outstanding companies out there that’ll hire you, where that behavior is institutionalized at this point. That’s how I hire. I’m pretty old school in that regard. Again, I look at things that you can’t teach, right? Like being a person who continuously learns, that’s hard to teach. I can’t teach you to do that, right? Yeah, no matter how hard you try, you can fake it, but ultimately, self-motivation and character and integrity are the things I look for. Yeah, yeah.

Matthew Kelliher-Gibson  17:28

I mean, especially on self-motivation. I mean, at this point, I’ve had, you know, just a bunch of people I worked with who would even ask, like, hey, how can I try to learn more about data or try to break into it? I’d gladly give them stuff, and one out of maybe every 10 would actually do the work,

Joe Reis  17:44

Yep, yeah. But you said so then you have your answer, right? Because the thing is, business moves fast. And you know, the nature of a business is to solve problems, yeah, for your customers. And so you’re always trying to solve new problems, which means, if there was a standard playbook for solving problems, I mean, I could just write a program to do that, or use an AI, exactly, yeah? So by definition, you’re expected to creatively solve problems in a continuous fashion.

John Wessel  18:13

Yeah, yeah, Matt, I’ve tried to do something similar, find a way to gauge, like, willingness and ability to learn, in that, like, will they, like, read an article, like, take any sort of, like, you know, I say, hey, like, this book is good on the subject. Or, like, hey, here’s an article or a podcast. Like, take any sort of interest in any sort of learning, right? Like, I found that was good. And here’s one that I’ll ask both of you. I feel like this used to be a thing, and people don’t do it anymore, maybe just because they’re lazy. Do you ever call references or ask for references with people in interviews, like, have you done that in the past? Because it used to be a bigger thing, and I haven’t heard people do that as much anymore.

Matthew Kelliher-Gibson  18:56

Well, I mean, I think it’s still out there, but does it matter? Because, I mean, I used to work at places where HR insisted they had to do it, and I was like, okay, tell me if they say something terrible, because then that says they’re idiots. Other than that,

Joe Reis  19:09

right? I don’t really get anything out of it. It can be gamed so easily. I mean, if any of us were asked for references, I’m sure we would call, like, and give a heads up and just say, hey, like, just make sure you put in a good word for me, right? I don’t believe in references. I’m sure that maybe, depending on the company, that might be required, or just part of the world. My one exception is, if somebody refers you to me, I take that pretty heavily, especially somebody who I respect. Good point. But, you know, I’ll say, are you willing to risk our friendship or your job on this person? Good question

Matthew Kelliher-Gibson  19:44

is a good question. I mean, I’ve done kind of a similar thing where I’ve seen people who worked at companies of friends of mine, and, like, that saved me once, where I said, hey, you know, I’m seeing this person. Do you know who they are? And they went, oh yeah, he is quite possibly the laziest person I’ve ever worked with. Oh, yeah, wow. I was like, okay, scratch him off the list. Yeah,

Joe Reis  20:04

beer next time. Yep, yeah. And that’s just it. I mean, these days, it’s so easy to find stuff on people. I mean, everyone’s got a social. You’re right. Or if they don’t, then that’s also a big red flag. They’re probably a serial killer or something, you know, in which case, like, that person’s almost weirder. So, like, if they’re not on LinkedIn, for example, right? Like, that’s the first thing. Like, yeah, not on LinkedIn. Like, either you’re Amish or you don’t understand networking and hiring, right, in today’s age, right, right? Maybe you don’t care. Maybe you’re a hipster, maybe you still have an AOL account or something too. That’s cool, but we work in tech, yeah, right, you know. And I do look at things like, who are you connected to, right? It’s because I want to understand, okay, so who’s, at least visibly, who’s your network, right? Yeah, it’s, you know, in fact, people recommending you. What are they recommending you for? I mean, that to me is more of a reference. Yeah, recommendations used to be gamed pretty easily. Like, I think one day, me and a friend recommended one of our co-workers for stuff like ballet and horse training and stuff like that. So I don’t know, I mean, one of my friends, he got a recommendation or endorsement for

Matthew Kelliher-Gibson  21:15

crying. That’s brutal. I have used it in the past, though, specifically for some roles where, like, they may not be a total tech role, but there’s, like, a combo marketing analyst or something like that, where you just go and you look at what skills have they tagged for themselves? Because if they don’t have any tech skills tagged, that’s a red flag to me. If all they have is hard work or communicator, marketing, strategy, whatever it is, like, whatever the role is, but they don’t have anything specific, or, like, Excel is the most technical it is, that’s generally a red flag that I would look

Joe Reis  21:55

for that now, of course. It was interesting, like, I was talking to a friend of mine last night who works in a non-technical industry who wants to become, you know. So this person works in product management in, you know, the health and fitness space, right? Really awesome at it, but it’s kind of boring because it’s super saturated and it’s health and fitness and, yep, it’s the same stuff year after year. You know, this person wants to branch into tech. And I’m like, okay, so how are you going to do this? Right? If you’re to go through an applicant tracking system with your resume, it’s probably not going to go very far. Yeah, it’s going to get kicked, exactly, yeah, yeah. So, like, go to meetups, meet people, start learning about tech, right? And start, you know, I would say, like, gain proficiency, you know, not just a book, you know, a conversational level, but dive into it. But you’re going to have to demonstrate competence in an area where all the odds are stacked against you, yeah. But if there’s something you want to do, then there’s ways to do it. I think, you know, people do transition. Data and tech are notorious for people transitioning into it from adjacent fields, right? And, yeah, it happens all the time. So you may have

Matthew Kelliher-Gibson  22:56

to take a little bit of a step back when you first go in, depending on what level you’re coming from. Like, I have a friend who actually reached out to me about that, and, like, I told him, in his situation, his best chance was to get to know someone who could hire him. Yeah, because I said a personal relationship is going to be the thing that helps you break in, the fact that someone knows you and says, I can see them, and I kind of trust their ability to learn this. Because if you’re just doing it blindly off of an application, you know, it’s extremely difficult. Oh,

23:25

a difficult that’s played

John Wessel  23:28

nearly impossible, yeah.

Joe Reis  23:30

Because, I mean, you know, think of how many resumes people are shooting out these days. I mean, the job market in tech and data ain’t great. So, I mean, I know people that are, you know, people with resumes too. Like, well, that sounds stupid. They have an established resume, right, right, right, but, you know, like, they’ve been looking for work for 18 months now. Yeah. And part of it, I think, as you point out, Matt, is the network too. That’s the other part of it, where it’s great to have all these skills, and I wrote about this over the weekend on my Substack, where it was like, what’s your path, like, what’s your brand, what’s your network. And the whole point of it was, and this nugget resonated with people too, where it’s like, you could be really well known inside your company. You could be awesome inside your company. But the thing is, if nobody else knows about you outside of your company, yeah, right, you know, is it really worth that much at the end of the day? Because, you know, especially if layoffs happen, everyone that just got laid off is looking out for themselves. You know, they might try and help each other out, but, you know, there’s a bunch of people on the market now, and that’s a paradox, where I think what you see is a lot of people put a lot of effort into being just the biggest rock star in the company. But the thing is, it’s self-contained, you know. And so how public are you? In terms of getting out there, I think it’s increasingly a big ingredient to success, whereas in the past, that wasn’t the case because you had more job security and so forth. There was more of an expectation, like, hey, if I just work hard and do a good job and just keep my nose to the grindstone, like, I’m gonna get recognized someday, and, like, that doesn’t happen anymore. If you’re waiting on that, I would say, like, good luck.

Matthew Kelliher-Gibson  24:58

Yeah. I mean, I think about how much that really worked. I think in the past, sometimes it was tough too, that you got to be good at something, but people have to know that you’re good at it. It’s not just going to leak out into the public.

John Wessel  25:12

Yeah, I always have this three-circle model, if you can imagine, like, three circles overlapping. And then I’d use this, like, with employees, to try to explain, like, somebody was frustrated, like, I feel stuck career-wise. Like, I want to get promoted. I want to make more money. So, like, all right, think about these three circles. Circle number one is your market value. Circle number two is your value to the company. And those two things overlap in a spot. And there’s areas that they don’t overlap, because there’s market value things that you might be able to do that we don’t care about. And then there’s things you can do for us that’s super valuable that nobody else is going to care about. And then the third circle is, what do you actually like doing? So I tell people, like, you want to try to optimize, like, what do you actually like to do? Where, like, what’s the market value? And then you have to provide value in your current job. And if you can figure that out, like, that is typically, like, the right, like, intersection of three things for most people when it comes to, like, career, yeah. But

Matthew Kelliher-Gibson  26:03

I’d say even now, though, because we’ve seen this flood into the market in the last, I don’t know, five ish years or so, and now that we’ve had this a little bit of a contraction, and everything that actually, you know, has become a lot more important, sure. Oh, yeah, yeah. And I think even long term, like, especially if you’re going to go into more of a leadership role or something, I think an undervalued part of what you bring is your connections, is your network, whether it comes to hiring or influencing or whatever, like, that is something that is there to help you get a job, but also within those jobs, Oh,

John Wessel  26:41

Yeah. And at any sort of leadership level, right? If you have to bring in people, it certainly helps, especially if you’re doing a turnaround. If you turn a team around, it helps you a lot to have the network of people you can pull from to be, like, all right, we’re gonna turn this around in 18 months, let’s go do it, versus, like, cold interviews and hiring.

Matthew Kelliher-Gibson  27:03

That’s actually a red flag for me. If I go look at an executive at some place, and I see they’ve been there for a year or two and they haven’t brought anyone over from any previous job, they’ve only, you know, done the normal recruiting process, that tells me you have either burned all your bridges or you don’t do a good job of relationship building in the first place.

Joe Reis  27:25

That is interesting. It happens, yeah. It only takes a few of those things to happen. But yeah, it’s an interesting market, you know, and I’m curious to see what happens. Especially, I’m sure we wanted to throw these two letters out: AI. You know, I’m curious to see what happens with AI in how it affects hiring and work and so forth.

John Wessel  27:47

Well, I want to talk about that. What is your take on that? Because I have a perception, and we talked about this a little bit before the show, that there are plenty of companies that are gonna do, like, the wait-and-see thing, of, like, let’s see, do we really stay lean? Just get as much as we can out of the existing people. We’ll throw you a bone, like, here’s some money to spend on AI tools. At what point does that potentially get exhausted, and people say, all right, we just need to hire more people? We can’t just throw AI on it and squeeze an extra 10% out of this team. Or maybe that’s not happening, though. It’s just kind of a perceived thing I have from some companies I’ve worked with, at least.

28:26

What gives you that perception?

John Wessel  28:28

Companies I’ve worked with, it’s all like, oh, well, we can be more efficient now. Here’s an AI tool, developers, this makes you 30% more effective. Microsoft said so. So you only need three on the team instead of five, you know, or whatever the math is there.

Matthew Kelliher-Gibson  28:42

It’s more like a wishcasting thing than an actual plan.

John Wessel  28:47

Of course, no, it’s not a plan. It’s just like an executive reading a headline about Copilot, or whatever. I’m just saying Copilot, but whatever tool makes developers X percent more effective, therefore we need that many fewer people. We need 30% fewer people.

Joe Reis  29:02

I think it’s the wish of every executive that they could just have a company run by conceivably no people, making billions of dollars a year, and so forth, right? And I think, you know, there’s a lot of people inspired by Sam Altman’s prediction that somebody’s gonna build the world’s first one-person billion-dollar company. We’ll see if it happens. I mean, you know, I’m starting a new company right now, and I’m trying to leverage as much automation and, where it makes sense, AI as is humanly possible. Like, why not? But I’m starting from nothing too. I have no entrenched processes or resistance from anybody, because there’s nobody here, right? It’s just me and my friends.

John Wessel  29:39

So that would be a fun thought experiment. So say you’re starting a company right now, versus, say, you were doing this 10 years prior to now. What do you think your differential is on people, roughly, between if it was like five or 10 years ago and now, given the

Joe Reis  29:53

state of technology back then? Yeah, exactly. Yeah. I mean, it’s an interesting one. I mean, starting companies back then, I have a bit of an opinion on that, and on working at startups. I think back then, it was basically when SaaS was just getting hot, right? So what you saw was, you could definitely sign up for a lot of services, you know, that would handle your expenses. Payroll got a lot easier than it was. I mean, I don’t know if you remember payroll before SaaS. It still sucks, but it’s easier. You know what else? Just HR management, all the kind of stuff you don’t want to think about, but you have to, that’s gotten a ton easier. There’s still a lot of friction involved, because you’re dealing with people at the end of the day, but stuff like documents and workflow tools is easier. I would say workflow management and task management is conceivably easier, except for the fact that, again, you have people who don’t even use the tools. So I don’t care if you’re using Jira or Asana or any other great product, they’re all great. Can’t blame them. So, you know, there’s a plethora of great SaaS tools. And then, if you’re hiring developers, you know, back then it was easier to find developers. It’s still hard to find good ones. That hasn’t changed, but software development life cycles are what they are. I don’t think that’s changed that much. I think we’re still doing basically the same stuff that we’ve done in the past. Now we just happen to have AI copilots, and to the degree they’re effective, I think it depends, from what I’m seeing in my own experience. It depends on the type of problem you’re trying to solve and the language you’re using. It works really well in Python, for example, I think.
It doesn’t work so well in other languages, according to my friends who work in more esoteric stuff, like, I don’t know, Elixir, for example. But I would never write an app in Elixir, so I don’t really care. I have no reason to do that. No, shout out to all the Elixir people out there. So yeah. And then with content, it’s interesting, because on one hand, LLMs have made content production conceivably easier. I could say it’s also made it conceivably worse. I mean, depending on the type of content you’re producing, it’s kind of

Matthew Kelliher-Gibson  31:57

Like a lot of technology, it hollows out the middle, so the mediocre stuff now gets a lot easier to do, but that just means we get flooded with a lot of mediocre to below-average stuff.

Joe Reis  32:07

Yeah. Look at any social media right now, and you can see it. You can tell when you listen to the person speaking that they’re reading off the script, or if you read the copy, a lot of it’s just super generic, like there’s no personality. So on one hand, I think the rote tasks are going to get easier. I mean, I think it’s still early days for agentic workflows. It feels like maybe another year, and that’s going to be a bit more baked in, I think, and useful. But, you know, I use LLMs all the time. I have a pro subscription to all of them. Why? Because I at least want to use them where I can and experiment in areas where I think, probably in a year or two, they’re going to be there. But I don’t think it’s going away. What does this do for hiring? What does it do for jobs? I don’t know. I think that’s TBD. You know, you read about Klarna, that says they put on a hiring freeze recently, and they’re just going to run the company with AI. We’ll see if that works. I think they said they’re also, correct me if I’m wrong, I thought they’re getting rid of Salesforce and trying to build their own AI on top of that. So the speculation too, this is an interesting thread, where, okay, the nature of software is going to change, right, where instead of application workflows, I have agent workflows. I don’t know if this is marketing speak or if it’s real, but I think this is an exciting time. You get to try this stuff out and see where it works at a ground level in your business. I think that’s super cool.

Matthew Kelliher-Gibson  33:26

One of the things I wonder about that you kind of touched on there is, like, we had all these SaaS tools come up, and it was supposed to change everything. But a lot of times it was, well, you have a problem, we’ve created technology, it’s going to fix it. And it’s like, okay, but if no one fills in the stuff, it doesn’t matter what your technology is. So a lot of this kind of AI agent stuff, do you think we’re getting to where they’re handling more of those, like, people-process things? Or is it just, here’s a souped-up technology that, if people don’t use it, still doesn’t matter?

Joe Reis  34:00

I think it’s definitely the latter for now. But at the rate these things are changing, either agentic AI is going to be, you know, a flash in the pan, or if it continues, I think it’ll be great. Like, if you look at something like Devin, right? I think it’s super early days for that kind of workflow where you can have a, quote, junior software engineer that happens to be a bot. But I don’t know. I mean, I was around when people said the internet was a fad as well, so, you know, that would have been one of the dumbest things you could say now. So who knows, right? Even blockchain, right? I don’t see any utility in it, but who knows. Maybe there will be at some point. You know, people make a lot of money in it, but that’s not the same as utility. You can speculate on it. That’s all speculation at this point. But I’ve just learned not to discount it. I don’t write anything off out of hand. But ChatGPT was interesting, right? Unlike, you know, Bitcoin. I mean, that paper came out in 2007, 2008 or something like that, right? I think we’re still waiting for the use case where, apart from coming up with meme coins, we’re going to, you know, change the world with it.

Matthew Kelliher-Gibson  35:09

Or laundering for illegal activity, sure, right?

Joe Reis  35:13

I mean, yeah. I mean, that probably revolutionized money laundering, you know. And that accounts for a lot of money, so that’s not to be discounted. I’m not advocating it, but it did probably streamline that industry, or that way of transacting. Is that the mainstream? No. But contrast that with ChatGPT, where, when that came out, which was like two years and 20 or 19 days ago, that changed the face of a lot of things. Like, I could hand my child ChatGPT, and they knew exactly what to do with it out of the gate. That’s different. And then, you know, you have every CEO in the world who’s using ChatGPT, and their kids are using it. Like, okay, so how can I do this in my business? What can I do, right?

Matthew Kelliher-Gibson  35:57

That seems to be one of the problems. I mean, we’ve talked about this before, the idea of, well, look, I can go on ChatGPT, and I can do this thing. Why can’t you do it? It’s like, yes, you can make it work once. Now make it work a million times without variation,

John Wessel  36:10

yeah, at scale for a million different people. Yeah.

Joe Reis  36:14

So I think, you know, that’s where the edges of utility are, at least for me. It’s like, where do I need something that’s deterministic, and where do I need something that can be more fuzzy, right? And so, if it’s deterministic, I’ll just write code, because I know that’s going to work every single time. But then I can also use a copilot that can, in theory, speed up my development. I’m still at about a 60 or 70% success rate with that. Certainly there are times it’s like, I have no idea why you suggested this code at all. But it definitely helps speed things up. So I look at it more as an accelerant. But the crux of it is, you have to know what you’re doing, right? So this comes back to what you were doing with your training, you know. And God bless you for doing that. I think that’s awesome, because if you don’t know what to look for, now you just do dumb things extremely fast.

John Wessel  37:04

Yes, it can. It can accelerate pain for a lot of people, because you can. And that’s the problem, in my opinion: whether you’re coding for apps or data pipelines or whatever, you can make things work and do it in some really terrible ways, and Copilot and other similar technology can absolutely just be fuel on the fire to do that, and then you end up with bigger messes than if it was all done by hand poorly.

Matthew Kelliher-Gibson  37:33

Yeah. I mean, I worked at a place where we were doing loan auto-decisioning, and that was one where you have to be careful on that stuff, right, especially as a regulated industry and all that. And that was a unique one, because I didn’t make the models, but I owned the code base that did all the decisioning on it. So that was an interesting problem, especially when it came to, how do we test to make sure stuff is working, right? But those are the types of things where we had defaults, so that if anything failed, you would do a certain action, because of the amount of money it would cost to just do bad stuff, right? It was huge.

Joe Reis  38:12

That’s interesting. Auto loans, that was it, right? Yeah, that’s an interesting business. I ran into a guy once, this is back in the late 2000s, I think. He was trying to hire me for something, but he was doing subprime auto loans. This is fascinating. I know that space a bit. But he had a spreadsheet, and I have an actuarial background, so I was fascinated by this. It’s basically just risk pooling. That’s all you’re doing. It was interesting. I mean, he managed to make a lot of money. I think this was actually during the subprime crisis as well, and he is still doing well.

Matthew Kelliher-Gibson  38:54

Well, the key on the subprime stuff is, generally, you have to just account for: can I make a profit if someone doesn’t pay this back? That’s literally what it is. And then a lot of it is, if you can correctly kind of rank risk, and if you can make a model where people will take the loan and you can make your money back in a year or two of them making payments, and you keep your expenses low, then you can do it. And so for a lot of it, it was really easy. As interest rates went up, that was actually the big thing that started squeezing some of those: interest rates went up, cost of capital went up. That was a huge squeeze on those types of profits.

John Wessel  39:38

I want to make sure we get to some data modeling here. So Joe, give us just kind of a brief on the book you’re working on now. And then I’m curious to hear from you, Matt, too, after we talk through that, on what modeling stuff you’ve done in the past. But yeah, give us a brief on the book.

Joe Reis  39:55

I mean, so yeah, like we hit on earlier, it’s going over, I think, a lot of the established, well-trodden concepts, but revisiting them from, again, kind of a first-principles standpoint. So, to back up, the book starts off with, like, what is data modeling? Why do we even bother doing it? Why is this important in today’s age? Why don’t we just ignore it? And then we go through the history of what I call the convergence, which is, you know, the fields of computing, analytics and data, and then AI, right? In the past, these were sort of all separate fields. Maybe there was some overlap, like using a computer to run an AI program or so forth. But, you know, the fields of study had been very, I’d say, isolated. Over the decades, what you see is all these tend to converge. And this is where the notion of mixed model arts comes from. It’s data modeling around the notion of the convergence of different use cases of data across different types of data. This is reality right now, right? If you use any app out there, say Uber, Netflix, whatever, this isn’t just some janky Ruby on Rails app. It’s a very data-intensive, robust analytics- and ML-powered application. This is where the world is and where the world is going. And so then the book basically gets into some of the building blocks of data modeling. So if we look at the notion of an entity, right, that’s pretty well trodden in tabular data. But now, how do we extend this into things like semi-structured and unstructured data? Especially when you talk about unstructured data like an image or text, it gets very interesting. Entity could mean a lot of things. Actually, an entity could be, well, anything that’s in that body of unstructured data, or it could just be the file itself, right? And so I think it’s helping expand people’s thinking.
And then obviously what I’m working on right now, which I hope to publish this week, is an exploration of the relational data model, right, which came out in 1969, 1970, and that is, I would say, the underpinning of how you work with tabular data. Everything there is derivative of the relational model. What I’ve been writing today is basically the underlying mathematical principles of the relational model. What is relational theory? How does this work? How do we translate this into tables, and what are the shortcomings of this approach? Something that actually doesn’t get discussed is that when we talk about things like tables and SQL, this actually doesn’t map correctly back to the relational model. There’s a lot of flaws, namely duplicates, nulls, and ordering of data. If you look back at basic set theory, you can’t have nulls. You can’t have a null element. That doesn’t make any sense, but we allow nulls in tables, right? That violates the relational model. You know, you can’t have duplicates in sets by definition, but you can have duplicates in tables all day. So it’s exploring these things, right? And I think just establishing a theoretical and then a practical baseline for each different use case, and giving treatment to, again, the big ideas. If you’re talking analytics, obviously, you know, Kimball, modeling for data marts. You could argue that Kimball is data modeling, at least, would describe it. Then Data Vault, and One Big Table, which is popular these days, right? And also, especially, the trade-offs. Like, if you’re going to choose One Big Table, why would you choose this versus another approach? What are the trade-offs? I’m not going to give you an opinion one way or the other.
The goal is to make you just cognizant of, okay, if I’m going to take this approach, and I know all these other techniques, just like in mixed martial arts, if I’m in a fight, I’m not going to be orthodox about it: I need to throw jabs only, this is how I’m going to win this fight, and maybe a hook or something like that. If any of us were in an actual fight, that would be the most idiotic approach. I’ve actually done this in jiu-jitsu, where I said I’m only going to try and win by armbar. But, you know, with machine learning, right, this is where things are going. So how do you know how data works? If we’re taking tabular data, what’s the mental framework to use tabular data with machine learning? You know, basics of feature engineering, the big model approaches, and then what types of models are appropriate for different types of data. And then kind of closing out with the future-looking view of data modeling as it stands today, which will probably change. So that’s really the flow of the book. There’s a lot to cover.
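Editor's illustration: Joe's point that SQL tables are not mathematical sets can be seen in a few lines. The following is a minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `customers` table that is not from the episode. It only shows the two behaviors he names (duplicates and nulls), not a full treatment of relational theory.

```python
import sqlite3

# A mathematical set cannot hold duplicates: the repeated tuple collapses to one element.
customers_as_set = {("alice", "NYC"), ("alice", "NYC"), ("bob", "LA")}

conn = sqlite3.connect(":memory:")
# Hypothetical table with no key or constraints, so nothing stops duplicates or NULLs.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("alice", "NYC"), ("alice", "NYC"), ("bob", "LA"), ("carol", None)],
)

# The table happily stores the duplicate row.
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
distinct_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT name, city FROM customers)"
).fetchone()[0]

# NULL is not a value: 'city = NULL' is never true, so it matches no rows,
# while 'IS NULL' is the special-case syntax that finds the missing city.
null_equals_null = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city = NULL"
).fetchone()[0]
null_rows = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE city IS NULL"
).fetchone()[0]

print(len(customers_as_set), row_count, distinct_count, null_equals_null, null_rows)
```

The set keeps two elements while the table keeps four rows, and the two NULL checks disagree with each other, which is the kind of gap between set theory and SQL tables that Joe describes.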

John Wessel  44:15

I’m curious, Matt. I want to get your take on this too. I’m looking at this, I guess, maybe in two ways. One of them: there is sometimes an ideal model that we should use for this problem. When it comes to data modeling, most of the time it’s like, we could do it a couple of different ways, and it probably doesn’t ultimately matter. But I’m curious about that third topic: maybe some real-life examples, if you guys can think of one, I just thought of one, where the data was modeled wrong, like a project you’ve worked on, and it just haunted the team or the project for a long time, because it was a fundamental data model problem. You want to go first, Matt?

Matthew Kelliher-Gibson  44:59

Oh, I have one right off the top. This was back in the heyday of the SQL versus NoSQL days. And I’ve talked about this one on the show before, but it had to do with scraping prices off the internet, and the executive who was in charge of this project insisted on putting it all into a Mongo NoSQL database. That was huge back in the day. And when pushed on why, the response I got back from the team was, because it’s the future of data, to which I said, that doesn’t mean anything. But the problem was, in every use case we had for it, the first thing we had to do was turn it into tabular data. Every freaking time. And if you’ve ever looked at the Mongo querying language, the way I described it at the time was, it’s like Martian SQL. So we had to go through all of these things where they were so proud of this database they created, and we’re like, we have to basically break it apart and do all these weird things every time we need to do something with it. Oh, and by the way, the whole point of this is to get this data into a single database table. Like, that’s where it’s all going. So why are we doing this in-between stuff?

Joe Reis  46:13

Absolutely. What about you? Something similar to that, right. I mean, Mongo is great if you know the use case you’re using it for, and know how to use it and why you’re using it, and have a good reason. I don’t think what Matt described as the reason would be a valid reason, in my opinion. But, you know, whatever, not my problem.

Matthew Kelliher-Gibson  46:34

yeah, that was happening.

Joe Reis  46:36

It was my problem. It’s your problem, yes. And what I often see, you know, if you’re talking apps and relational databases, right, talking varieties of Postgres, MySQL, whatever: how many times have you seen relational database tables for an application not modeled in any real relational form? It’s probably first normal form most of the time, in my opinion. You may start off with good intentions or no intentions, but it’s just, there’s tables, you can put data into tables, so let’s put data into tables, and our app will sort of do this. That’s great. Except, you know, this is why I kind of go over the why of why these techniques were created in the first place. With the relational model, the notion is to reduce data redundancy and dependencies and update anomalies, right? So if you have redundancies in your data, guess what happens if you try to update it or delete it? Now you have to deal with all these other places you have to do it, and you’ll probably make an error, right? Just understanding simple set theory and thinking about your data from first principles would solve that problem. It’s just a tiny bit of thinking, not even that hard. Just, is this data dependent upon this other thing? How could I split this apart so I don’t need to duplicate my data? I have a table of IDs over here, and they relate over here, you know? So that’s just the notion: you just put, like, an ounce of thought into what you’re doing. It’s not even that difficult. It would save you so much time down the road, because what inevitably happens is, again, you have all kinds of update anomalies, and at some point your database starts creaking under its own load, because now it’s doing unnecessary work, right? It’s operating extremely inefficiently. And I’ve seen this happen.
I had one client where they were trying to run this very data-intensive application, doing lots of analytical workloads in the app database, right? So it’s an OLTP database, I think it was Aurora. But every single time, because they’re trying these analytical workflows and these kind of quasi-intelligent workflows, a transaction occurs and kicks off all these other stored procedures, and it kicks off all this other stuff. And then they’re like, our database is creaking under a lot of load. And I’m like, yeah, I can tell you why. Like, shouldn’t those workloads be somewhere else, not right here? But yeah, it was at the point where they had to actually start pulling features off of their app and reducing its functionality in order to not completely crumble under the weight of this database.
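Editor's illustration: the update anomaly Joe describes, where redundant copies of a value drift apart, can be reproduced in a small sketch. This assumes Python's built-in `sqlite3` module; the `orders_flat`, `customers`, and `orders` tables and the email values are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: the customer's email is repeated on every order row.
conn.execute("CREATE TABLE orders_flat (order_id INTEGER, customer TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?)",
    [
        (1, "alice", "alice@old.example"),
        (2, "alice", "alice@old.example"),
        (3, "bob", "bob@example.com"),
    ],
)
# An update that touches only one copy leaves the data contradicting itself.
conn.execute("UPDATE orders_flat SET email = 'alice@new.example' WHERE order_id = 1")
flat_emails = {
    row[0]
    for row in conn.execute("SELECT email FROM orders_flat WHERE customer = 'alice'")
}

# Normalized: the email lives in one place, and orders refer to it by ID.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id))"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "alice", "alice@old.example"), (2, "bob", "bob@example.com")],
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
# One update, one row, and every order sees the new value through the join.
conn.execute("UPDATE customers SET email = 'alice@new.example' WHERE customer_id = 1")
normalized_emails = {
    row[0]
    for row in conn.execute(
        "SELECT c.email FROM orders o "
        "JOIN customers c ON c.customer_id = o.customer_id "
        "WHERE c.name = 'alice'"
    )
}

print(flat_emails, normalized_emails)
```

In the flat table, Alice ends up with two different emails depending on which order row you read. In the split version there is only one copy to change, which is the "table of IDs over here, they relate over here" idea from the conversation.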

Matthew Kelliher-Gibson  49:13

I’ve seen that too, where, I mean, in one place I worked, they had what were basically foreign keys, but they weren’t in there as foreign keys in every table, so you couldn’t connect them directly. You had to go through. My joke was it was a snake schema, because you had to go through this whole thing, and it was literally 14 joins connected, and you had to put DISTINCT on your SELECT statement. Like, that’s a problem.

John Wessel  49:41

I had an app, so, you referenced Ruby on Rails. This was an app that was Groovy on Grails. I remember that stuff. So this was an app, and one of the user features was a completely no-limits, build-your-own search thing where, essentially, you pick as many columns from as many tables with, like, whatever WHERE clause you want, and then it builds arbitrary SQL and executes it on the database. What could go wrong?

50:12

What reason was this done?

John Wessel  50:16

It was like a transportation app. And it was part of, hey, we want to be able to identify loads via a bunch of different characteristics and pull in all these different fields: where is it going to, where is it coming from, who last touched it, all sorts of different things. And we went through a MongoDB phase. We went through a Solr phase, if you remember that. We went through a search phase of this. This was Postgres in the back, and we went through a, let’s spin up several read replicas and send this stuff to the read replicas of the Postgres, phase. So we went through all these phases. And the thing is that we were constrained, because this was 15 years ago. We were actually on physical hardware. Teams now, you can buy your way out of a lot of these situations, if you really want to, by just continuing to throw money into larger and larger instances.

Matthew Kelliher-Gibson  51:09

I mean, I’ve even seen it with, like, Google BigQuery. Powerful, sure, if used for good. I mean, one place I worked, we had that, and there was one query that people were writing that literally had seven levels of nested subqueries. Wait, what? Seven levels. And somehow BigQuery was able to optimize that to run. I mean, it took a while to run, but they were able to do it. I tried teasing it out when I got there. I’m like, this is crazy, let me figure it out. And I got about three levels down when I gave up. And I told them, this is where, if you had to do this on an on-prem system, it would break, and you would learn how to do it better.

John Wessel  51:50

I’m really curious about your take on this, because I think when all three of us learned, there was a practical constraint on resources, where, you know, it’s a big pain to procure more servers. You have this constraint, then that constraint comes off, say, 10 years ago or whatever, and now it’s like, oh, we don’t know what to do, I don’t know, just size up the instance. What do you think that’s gonna do to developers? What is that gonna do to people? Because that just reinforces bad behavior and inefficiency.

Matthew Kelliher-Gibson  52:24

That’s also one of the problems when people say, just teach finance SQL, and, you know, everyone can be an analyst, type thing, right? Because I’ve seen it where it was an on-prem server and queries that should have taken literally, like, two seconds were taking five minutes. I was like, this is the worst hardware I’ve ever seen. They migrated over to AWS, and what we found out was that finance was running queries where the first step was to get 12 billion rows and then start. These were queries that they would literally start running at the beginning of the day, and they wouldn’t finish until after lunch.

John Wessel  53:01

They literally, like, just have it spinning at their desk, like, go get lunch, yeah.

Matthew Kelliher-Gibson  53:05

Turn it on at 8am and then do whatever work you had to do. And then around after lunch, it would finally work. And that was what was crippling the system, of course.

John Wessel  53:11

Yeah, crazy.

Joe Reis  53:13

Well, it kind of goes back to LLMs, though. I think that might be one of the utilities. Say that, like, seven-layer hellscape of subqueries or whatever. It certainly sounds like a bad Taco Bell burrito,

John Wessel  53:27

the seven-layer dip.

Joe Reis  53:30

Seven layers of SQL diarrhea. So, you know, that might be a use case where an LLM could be helpful. Just throw it at that and say, I don’t know what to do with this, figure it out. Because at that point you’re at a Hail Mary, where you’re not going to do it anyway. So, I don’t know, can robots fix it? The other place I see this a lot is, like, SSIS workflows. I mean, this is the glue that holds a lot of companies together, that and stored procs in general, and all this other stuff. There’s all this code right now that sort of runs companies, you know, and maybe the team that wrote it is still there. Maybe there’s comments, I don’t know, but probably not. If you breathe on it wrong, it might break. So this is where I see the potential for large language models in particular. It’s like, go into these code bases and just try and figure it out. Nobody else is going to do this. This isn’t a job for humans to do. Humans could do it, but, I don’t know, it’s gross.

Matthew Kelliher-Gibson  54:29

Once you get that far away from it, I mean, you know, I’ve had friends who work at banks that have a 70-year-old developer who’s on a $500,000 retainer because that COBOL code breaks once a year, and they need to come in and fix it, and otherwise they just live out on, like, Key West.

John Wessel  54:47

yeah, yeah. I think we missed the boat on that one, but maybe that’ll be SSIS. But, I mean, I think the interesting part about those GUI-based tools, like SSIS, Alteryx, like, there’s other GUI-based tools, where, like, somebody that’s not, like, super, maybe not officially in IT, maybe is on a data team, maybe they’re just kind of on the line in no man’s land, they can build fairly sophisticated workloads, put them in an SSIS job, or Alteryx, or whatever, one of these GUI-based tools, and you end up with lots of business logic that often makes it into these tools, and they end up in critical processes. And then, like, that’s terrible on the technical side. You got to manage it, like, you can’t. It’s not version controlled. It’s not documented. But on the business side, like, that was their best solution, because a lot of times they didn’t want to mess with IT. They didn’t want to get on IT’s roadmap. They thought it would take forever. Like, what do you think the right answer for that, like, tension is,

Matthew Kelliher-Gibson  55:46

I don’t know if there’s a right answer. I mean, it’s that one that, like, part of it is, you got to get in there. There isn’t a roadmap for that. You have to actually get in and figure out what’s going on. Yeah. Company, sure. I think there’s also, you know, these tools can be effective if contained. The problem is, once they do a little thing good, they say, let’s do it for time, yeah. And now you have these things where each department now has their own workflow that’s defining, you know, order value, average order value. And they’re all different, right? And they’re all hidden, and once it goes out of sight, everyone assumes it’s right, or they just kind of implicitly trust that. And now we get into these arguments over, you know, which number is the right number, and what’s this and that? And, well, this is what the numbers say. And it’s like, well, what is the definition of that? Nobody knows anymore,

56:47

because it’s hidden. It’s all like, oh, let’s just agree

56:49

to disagree. Exactly. We

56:53

I don’t agree, but we’ll work together. We’ll just

Matthew Kelliher-Gibson  56:56

have, yeah, we’ll just have four metrics. There’s marketing, sales, we’ll just average all four of them, and that’ll be the

Joe Reis  57:03

number all the time. But, you know, so I’m hoping this is like the killer. I keep telling everybody, if they’re working on AI problems or technologies, like, migrations and fixing legacy code, like, this is the toil, though, you know, at least bring it to light. Yeah, you know, I mean, transformers are made to translate stuff, literally made to translate everything for Google Translate. Like, it’s pretty good at this stuff. I totally agree. I mean, Amazon had their whole study, where they had their whole press release in August, where they said, like, you know, they migrated from Java 8 to 17. They saved, like, 4,500 years of work. And I’m like, That’s awesome,

Matthew Kelliher-Gibson  57:38

yeah. And that’s one where there’s actually, you know, a lot of the stuff that AI is currently doing for people really well is lower value work. So it’s hard to kind of, for lack of a better term, like, monetize that up for the cost of making these models. Migrations could actually pay for a lot of this stuff. Sure.

Joe Reis  57:56

Yeah. So it’s worth it. I mean, you know, nobody’s going to join a company, or very few people are going to join a company saying, I want to work on the migration project. That sounds like a lot of fun and a way to lose my career. That’s

Matthew Kelliher-Gibson  58:09

Why does everyone just say, We’ll completely build it ourselves. We’ll just build a new thing. Yeah,

58:13

yeah. How often does it work? Though?

Matthew Kelliher-Gibson  58:17

It doesn’t work, but it’s, I mean, it doesn’t work, but they view that as a better shot than trying to migrate. That

Joe Reis  58:25

It’s a temptation, like, I, you know, I can go to therapy, or I can just, you know, I can just change my friends or my spouse. It’s fine. I’ll just find

John Wessel  58:34

some, find people that reinforce, you know, I don’t want to change. I’ll get a new job.

Joe Reis  58:39

I don’t need this one anymore. Yeah, this place sucks. Yeah, even though I’m the total cause of all my problems, exactly, it’s

Matthew Kelliher-Gibson  58:47

amazing how, everywhere I go, the people are just terrible, yeah,

John Wessel  58:51

yeah, I guess I just have bad luck. This is so

58:53

unlucky, yeah, for that one,

John Wessel  58:55

yeah, all right, guys, I think we’re coming up on time here. Yeah, we’re gonna get real

Joe Reis  59:00

cynical in a second on this one. So, yeah, that’s

Matthew Kelliher-Gibson  59:03

what we’re here. That’s my job.

John Wessel  59:04

Yeah, Joe, I want, can you give us a teaser about, like, the new project, new company you’re working on? I don’t know if you’ve had any official announcements. Do you want to give us

Joe Reis  59:13

a teaser on that? I’ll be announcing something at the end of January. It’s fine. It’s education related. Okay? It is kind of like a mafia guy, like, trying to describe it as, like, I work in garbage waste disposal. No, it’s a waste disposal company coming out at the end of January. Yeah, that’s

John Wessel  59:28

awesome. Waste disposal. All right, you got, you got any takes for us? You got anything cynical you want? Is it with Matt?

Matthew Kelliher-Gibson  59:34

I mean, I think we’ve, we kind of hit at the end there with some cynical

Joe Reis  59:37

depress a bunch of people. If we keep talking, yeah, we just cut

John Wessel  59:41

it off to a guitarist.

Matthew Kelliher-Gibson  59:42

I mean, that’s just what I kind of just, I’m just sitting right there, yeah, but I think we’re good for now. All

John Wessel  59:49

right, awesome, Joe, thanks for coming on. We’ll definitely have to have you back after you officially launch your next thing. And Matt, have

Joe Reis  59:57

you guys on the pod too, sometime. I got a new live show coming up soon, and you guys are finding curmudgeonly enough, it’d be good to have you guys as well. So excellent.

Matthew Kelliher-Gibson  1:00:08

This is beyond good behavior. Yeah, this

Joe Reis  1:00:09

is pretty good behavior for you. You got an ankle bracelet or something like yesterday, we’ll let you get a beer at the bar if you behave today.

John Wessel  1:00:20

All right, awesome. Thanks, Matt. Thanks, Joe. Thanks, you guys.

Eric Dodds  1:00:24

The Data Stack Show is brought to you by RudderStack, the warehouse native customer data platform. RudderStack is purpose built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.