This week on The Data Stack Show, Eric and John welcome Ethan Aaron and John Steinmetz. Ethan is the CEO at Portable while John is the VP of Analytics at Gallo Mechanical. During the conversation, the group explores the intricacies of data work including a discussion around the distinctions between data teams and product/engineering teams, emphasizing the importance of aligning data initiatives with business goals. Ethan and John share their career journeys and insights on building effective data teams, the concept of data contracts, the analogy of data management to plumbing, the importance of self-service analytics, the need for data teams to be seen as profit centers, and so much more.
Highlights from this week’s conversation include:
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 00:06
Welcome to The Data Stack Show.
John Wessel 00:07
The Data Stack Show is a podcast where we talk about the technical, business and human challenges involved in data
Eric Dodds 00:13
work. Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Welcome back to The Data Stack Show. We have two guests today, Ethan Aaron of Portable and John Steinmetz of Gallo Mechanical. Gentlemen, welcome to the show.
Ethan Aaron 00:39
Thank you so much for the conversation. All right.
Eric Dodds 00:42
Well, give us just a quick background. Ethan, why don’t we start with you?
Ethan Aaron 00:46
Totally. So I’m Ethan Aaron. I’m the founder and CEO of Portable. I’ve been working in data for almost a decade at this point. I have been the head of data at a small startup, at a 1,000-person company, and now, for the last five years, I’ve been building data integrations so that data people don’t have to worry about extracting data from systems and centralizing it into their warehouse. So we have 1,500 different integrations. I’ve built hundreds of them, almost 1,000 at this point, so I can speak to all the different nuances of this ecosystem.
John Steinmetz 01:16
Yeah, John Steinmetz. Right now I’m head of data over at Gallo. I’ve implemented three data teams from scratch for startups. Led some of the bigger teams over at Expedia, HomeAway and zvoice. I started out as an engineer, worked my way up, and decided to move to product. Then I moved to a CTO role, where I would be administrating over product, data and engineering, and now, primarily over the last five years, I’ve been working on startup data and focusing on that. Recently I worked for ShiftKey, a startup that is now probably about two and a half billion dollars, or close to that. And now I’m taking my talents, if you want to call them that, to Gallo Mechanical, to try to change the construction industry, because that is a very underserved data industry. So yeah, that’s me. So
John Wessel 02:04
guys in our rap here, a few minutes ago, we talked about data and engineering teams and some differences specifically around product so I’m excited to dive into that, talking about data product people versus product people on the engineering side. What are some topics you guys are excited about talking
Ethan Aaron 02:21
about that problem in terms of the similarities and differences. I think there are a lot of differences between data teams and product and engineering teams, and then also thinking about the nuance of that when you’re a one-person data team, at a company that can afford a one-person data team versus a company that can afford a 100-person data team, because it changes. Just like engineers, a one-person startup with one engineer is very different from a company with 300 engineers and how you have to trade. So I’m excited to dig into that as well. Yeah.
Eric Dodds 02:53
What about you? John,
John Steinmetz 02:54
yeah, very similar. I think that you know, data is all about doing what’s right for the business from a value perspective, and with any engineering task, if you don’t have a business goal or business lead into that, you will eventually waste a lot of money so into, you know, tying all that in. I always run all my teams like a product. It’s got an engineering side, a product side, and a design side as well. So you have all of that in there and leading to that. You know, business value is really critical, and not all companies are the same. So you got to kind of figure out, like, what does it mean for one person, like, like Ethan said, versus, you know, what does that growth look like? What do you need right now? Versus, you know, what do you need later? And making sure you don’t spend a lot of money up front and the product side of that really drives that home. I love it. Well,
Eric Dodds 03:40
tons to talk about. Let’s, let’s dig in,
John Wessel 03:42
right? Let’s do it all right, both
Eric Dodds 03:45
of you have really interesting careers. You know, Ethan, you started in banking, and John, you started in, you know, sort of, let’s say traditional software engineering, as a software engineer, and both have ended up in the world of data in different places. So could each of you just give us the couple-minute version of your story? Where did you start in the world of data, and what got you into it? And then, you know, how did you end up where you are today? We’ll switch it from the intro. So John, why don’t you, why don’t you lead us off? Yeah, so
John Steinmetz 04:18
This is great. I actually started out as an engineer more on the marketing side, where I was building, you know, probably 20 to 30 applications a year for various brands. I loved it. I love the engineering side, love the design side. My degree is actually a graphic design degree. Oh, wow, yeah. And then I realized very quickly that the other side, my minor in college was sociology. So I realized quickly that I loved how groups think, and that was a really important aspect of what I was doing. And then I realized, as I was building these applications, that there were nuanced ways of building them for each different brand according to what goals they had, and then I saw all the marketing data behind it. And then what really inspired me to make that switch into more of a product slash data role was when I saw the effects. It was at the very beginning of social media, right, when I saw that just some small differences that you can make in these applications would lead into big things on the other side, value for the businesses. And then I realized that’s really where I wanted to put most of my time and effort. And that led me to leading data organizations, organizing them in such a way so that everything rolled up to the company’s goals, not just the group’s goals. I have a personal mission that data no longer becomes what you see in IT, which is kind of like a we-have-to-have-this mentality of a cost center. We need this as a profit center. I want to move data into that space, as opposed to what most people do, which is just hire a data team because they need it. And I say, you do. And the reality is, we need to be the bookends of everything, at the start and at the end. And that’s really what my mission is, from where I was to where I am today.
Eric Dodds 06:10
And you did some time at some really large companies. You know, both established companies. You were at Expedia. You managed the home page team there. Can you describe just a little bit of the pressure of that? Because, you know, I don’t have direct experience with a home page that large, but the little experience I do have is, you know, when you push to production, it’s a big deal, because it can mean, you know, if you screw something up, it can cost the company a huge amount of,
John Steinmetz 06:41
Oh yeah, and I’ll tell you, like being at Expedia, and it was Expedia HomeAway, so it wasn’t the big one, but, like, we integrated with Expedia, because all of the data is shared between those organizations. Interesting. So on my first day at Expedia, I was told, you are going to be presenting to all 10,000 people in the company. My first day. I had never talked to anybody, never. I was like, all right, what do I need to do, rolling up my sleeves and doing it with so much pressure, yeah, you know. And there’s a guy over there, well, he’s not over there anymore, he’s now with PayPal, John Kim. And I think I learned more in the time I was at Expedia on how to do data for leaders from listening to him give me feedback in those big product meetings, because these were all televised, uh, over the internet for the entire company. So I was literally speaking to not only them, but the questions he asked and the way he presented those questions really got me thinking about it. It was the first time I had ever heard of OKRs and KPIs, right? And I thought I knew everything, right, when I got in there. So the pressure was there, right? But at the end of the day, at Expedia and most big companies like that, their process is essentially a workflow. The home page leads to the next page, which leads to the next page, and your job is to pass the right amount of people in the right ways to the next team. That was it. And if you think about it from a manufacturing perspective, that’s a phenomenal way of thinking, because you’re not worried about the whole thing. It takes the pressure off. The second team, the pages that came after that page, told me what they needed. And then I had to design my systems to say, let’s figure out the best way to send the right people in the shortest amount of time to that next page. And then they did their thing for the same thing, right? 
I mean, we’re talking millions of dollars of impact, especially at the home page, because I was the tip of the spear. If I didn’t drive the right people to the second page. Everything’s off, yeah? So yes, and it was a very data science heavy role, very data scientist. That makes total sense, yeah. So it was fun, it was a learning experience, and it was really good for me to learn how big companies do it, because then you can parse that out into how smaller companies can also drive that impact as well,
Eric Dodds 09:02
okay? And then contrast that with Gallo, because, you know, that sort of experience is, you know, let’s just say, bleeding edge digital, you know, almost purely digital journey, tons of traffic, tons of SEO implications, you know, just the knife’s edge, if you will, or the sharp end of, you know, a digital funnel. And now you are doing data in the construction industry. And those things are, those are different. So I’d just love to hear a little bit about the contrast there. Different, but
John Steinmetz 09:38
The challenges are still there, right? The challenge in an industry like this is, and Ethan could probably talk about this too, the challenges in the finance industry 10 years ago are what construction is doing right now. They don’t necessarily have the same systems. They don’t necessarily have people like me with the experience that come into construction, because there’s not as much money to be made in that space. So typically it’s a different kind of situation, but the foundational elements are all the same, right? You have to have a data warehouse. You have to centralize all your analytics. You can’t tie in directly to these systems, because you can’t affect production. All of that stuff is very similar. I would say the one biggest difference is, most of the stuff I’m doing today is one way. It’s all read only, whereas in these other systems, I’m pushing data back into these platforms. Right, right? So that’s the biggest technical difference, yeah, but from a business perspective, it still goes, what are the company’s goals? What are you looking to do, and how can I provide those now, and how much we’re going to talk about this today, but there is a method I go through which is determining a company’s analytics intelligence, and I have a very specific formula that I use for that, and that helps me determine what path the data culture needs to follow. You know, companies like this here at Gallo, you know, very smart people, but we’re still stuck in Excel, yep, right? Whereas some of these other companies use Excel as a sandbox tool, but never as a production tool, yep. So you kind of have to, like, figure out what’s important to the business, and how do we get them away from Excel and trust the data. So that trust, I would say it’s more difficult to establish trust in an organization like this than it is at Expedia, because the trust in that big data world comes with the territory, right? 
It’s accountability, instead of trust, like you have to be accountable to the things you make, whereas here, and I’m not saying this as a detriment to this company, it’s not, but in companies that aren’t served by data very frequently. Yep, everything is an amazing thing. Sure, sure, right, yeah, so that’s a little harder to kind of come back,
Eric Dodds 11:50
yeah, no, I’d love to talk about the analytics framework, but Ethan, okay, give us your backstory, and then we can dig into the juicy stuff totally. I mean, this is already juicy, but I mean the top ends, my
Ethan Aaron 12:01
The background is kind of all over the place. So I started my career at Goldman Sachs doing real estate investing. I was buying properties, office buildings, residential units, logistical warehouses, that type of stuff. But I found myself getting more excited about the spreadsheets. And how do we write VBA code, or how do we restructure these to be more efficient? Or how do we streamline a process? Or how do we replace light bulbs in a building, all the operational stuff? I also hate authority, and I hate being told what
Eric Dodds 12:26
to do. So I’d just say that sounds like a great cultural fit in a heavily regulated industry. Maybe not the best.
Ethan Aaron 12:33
Yeah. So a couple years in, I went to a 12-person data startup, and I was supposed to do sales, and again, I knew nothing. I didn’t know about sales, I didn’t know startups, I didn’t know data, but it sounded like an amazing place to learn. And I got there day one, and the CEO was like, actually, we don’t need sales right now. We need someone to build dashboards and implement customers. Do you know SQL? Do you know shell script? I was like, no, but I’ll figure it out. And it gave me a really interesting perspective, because it wasn’t, can you go learn SQL, and can you go learn data tooling? It was, I need you to know this, because you need to build the stuff I need to run this company. It was very much in service of running the company. It was not in service of learning SQL for the sake of learning SQL. They made a product, sold some data, and then we got acquired by LiveRamp. I did product management at LiveRamp for about a year. So I’ve been on the data side, but I’ve also worked with engineers at small companies, large companies, and now back at Portable. And then I stood up the data team at LiveRamp. So we were a 1,000-person company. Did not have a centralized data team. I made a case for, hey, we should have a centralized data team. Here’s what it should look like. Started interviewing all of our execs, what matters to the CRO, what matters to the CEO, what matters to the CMO, and coming up with our global list of KPIs and OKRs that we can actually measure with data. I did that for about a year. We sold off our parent company for $2 billion and I went and I worked for our Chief Strategy Officer at LiveRamp, trying to figure out who we should partner with, or who we should acquire. So I spent a lot of time digging into data integration companies, small companies, big companies, and it gave me a very good understanding of the ecosystem of ELT tools, ETL tools, CDPs, and iPaaS tools; reverse ETL wasn’t really a category yet.
About a year into that, I was like, I can do that, and started Portable. And what we’ve been doing for the last five years, our goal has always been to build 10,000 integrations so that data teams don’t have to. We want to build a platform on which we can build and maintain 10,000 integrations that pull data from systems and put it into data warehouses. At this point, we have 1,500 integrations. I’ve personally probably read thousands of sets of API documentation. I’ve probably built 750 integrations myself at this point, and I’m also the data person at Portable. So in addition to building integrations, being the marketing face of Portable, doing sales, customer support, I’m also back to the lens of: if I’m the CEO, I need to run this company. What data do I want at my fingertips? And I’m doing that, but I’m not doing it for the sake of data. I’m doing it because I need to run the company, and I get an interesting perspective on all of this. I also talk to data people all the time. I probably have 15 meetings a week with heads of data, small companies, big companies. I host events in New York and really big events at Snowflake Summit every year, but I’m excited for the chat today. Yeah,
John Wessel 15:28
so one, one quick observation before we move on. I don’t think I’ve ever heard anybody get hired into a sales role and a CEO goes and says, Hey, we don’t need more sales people. I want you to do data stuff. I don’t think that it’s ever
Ethan Aaron 15:43
it wasn’t. It wasn’t, we don’t need more sales people. I was going to be the first sales person. The funny part about that, though, is when I got hired, it wasn’t me being like, I have a sales background, I can do that. The CEO was really looking for passion. And just like, are you willing to work hard? So the entire conversation, my interview with the CEO of this company for about an hour, was just us talking about building efficiency, light bulbs, energy efficiency, all the stuff I was working on at Goldman, the things I got really excited about and probably went too deep into. But I got hired because of that. I liked it, he said, You need a sales person. I said, I could do that. And then when he said, I need a data person, I was like, I can do that. Like, when you join a 12-person company, I would assume you’re signing up for whatever needs to be done, whatever.
John Wessel 16:30
Okay, yeah, okay, yeah, you got hired based on, like, Hey, this is the right guy. Like, we’re gonna, we’re gonna slot him somewhere you got initially slotted and just like, move positions, basically, that makes sense.
Ethan Aaron 16:41
Yeah, I always like, as a founder now, I view it through the lens of like, that’s probably a really good way to screen like, it’s too late, like you used to do it beforehand, but it’s like, probably a really good way to screen out people at small startups is like, you thought you were going to do this day one. I’m actually okay, let’s do it.
John Wessel 17:01
Yeah, it saves you some time.
John Steinmetz 17:05
I’ll say this. Most of the unicorns that you see all start out like that, where you have a very small group of people that do multiple things. I’ll say this about ShiftKey. I was in a CTO role, and I left to go take a product role, a very low-level product role, with ShiftKey, because I saw the vision, and they knew all the things I could do. I was the first product hire, and within three weeks, they were like, we just want you to do data. Yeah. I mean, with all the successful startups, I’ve heard tons of stories like
Ethan Aaron 17:37
that. Yeah, it kind of reminds me of your background. You’ve, like, at RudderStack, done a bunch of different things. And it’s like, so it was a hard pivot day one from like, look, I didn’t know anything about sales and I didn’t know anything about data, so it was really just like, which am I gonna
Eric Dodds 17:53
give up? Via one? Yeah, totally. We actually had someone from Braze on the show, a gentleman named Spencer Burke, yeah. And he has been at Braze, I think it’s 11 years, 12 years, which is a really long run at a startup. He was, I mean, I think they were just a couple of people. He was employee number, like, one or two outside of the founders, yes, yeah. And I think, I mean, he was, like, running around New York City, just trying to see if people would install this SDK in their mobile app, you know, like all these local startups. Anyways, great episode, and a really good story like that, playing out over a long period of time within the same company, because his role has changed really significantly. But yeah, such a great episode, if anyone’s interested. Okay, we’ve already had such a fun conversation, but let’s dig into our first topic, and I’ll lead it in by mentioning an article that bubbled up on Hacker News this week. It’s a great article called How I Ship Products at Large Companies, or something of that nature, and a really interesting article with a number of just, you know, generally good pieces of advice. But one of the points that the author makes that really stuck out to me was how you measure success for a product. And one of the points that he makes is that one of the success criteria needs to be, like, buy-in and excitement from your boss and from management on what you ship. And he even goes so far as to say, and I loved the provocative nature of this, he said, you’re probably thinking, like, oh, you need to measure whether people use it as well. And he was like, actually not true. It’s about management buying into and getting excited about what you’re shipping, right? Because if they’re not doing that, and it is, like, numerically successful, it still doesn’t matter, right? You know, people have different opinions on that, but I loved the point that he made in drawing a really hard line: you can measure things in different ways, but there’s really only one or two things that actually matter.
And when we were chatting before we hit record, both of you made that point around the business value of data. And I mean, Ethan, you even alluded to this. You know, you’re doing data stuff at Portable but only to serve extremely specific needs. You know, the specific needs of the CEO, which you know also happens, but yeah, John, I’d love to just hear you talk about your experience with that. Just give us some insight into that dynamic around business value.
John Steinmetz 20:40
Yeah, I think that article probably has some merit to it. When I get in, my number one thing is to create excitement, right? Because when you’re creating a data culture, especially from scratch, that business value leads to excitement a lot of times. Most of the time, those two things intertwine, because business leaders get really excited when they see the numbers going up in a very strategic way. But there are also intangibles when you have discussions with these business leaders, especially at the C-suite level, like, what do you care about? What is important to you? And I’ve actually heard C-suite members say, I don’t care what the financials are right now. I want to get people talking about our product. I want to get people understanding that we have made a mark in this industry, or whatever. So that gets my wheels turning, right? Like, what can I do from a business perspective that will show those things, right? I preach to my teams, well, not here, but my previous teams, daily business value. And they will tell you, I say it every single day: what have you done today to provide value to us? Right? Some days you’ll have better than others, and that’s okay, but you should always strive for that. And by business value, I mean, what did you deliver that someone is using? Right? It doesn’t matter how many dashboards you deliver. It doesn’t matter how many APIs you create. It doesn’t matter any of that. If nobody’s using it, it holds zero value. And that’s why I also teach my teams, the faster you can get something to production, even if it’s not perfect, the better it is for you. Because if you spend six months planning something, and you push it out there, and people go, like, what the hell am I looking at? You’ve just wasted six months of company time, but if you put something in, and you build it, and you get it out there in two or three weeks, even in a partial spin, and you start getting feedback. 
Feedback is business value, because you are now learning what the business needs to be successful. So it really is that simple to me, generating that excitement comes from really finding those champions within the business, getting those people excited about what you’re doing, and that usually leads to more money for your group, too. So always try to put that in place and try to make people understand that, you know, value, for the most part, equates to dollars, but it doesn’t always have to. It could be just that excitement. Yep, can
Eric Dodds 23:03
you speak to both the listener and me? Because I’m guessing that some of our listeners are having the same, you know, the same reaction when they hear that, and the idea of daily business value sounds great, but in the back of their mind, they’re thinking, things are pretty complex. Here, where I work, there’s a lot of technical debt. Shipping stuff is hard, probably for some reasons that aren’t that great, but also for some reasons that are legitimate, you know, maybe from both the technical and cultural perspective, and so shipping daily business value, I think for some people may feel like a really challenging thing to do. So can you speak to those people and maybe just give an example of if I’m feeling that, like, how that sounds so great, I just don’t know how that would be possible. Help us think through how we can do that and like, you know, go into work tomorrow and try to create some daily business value. It’s
John Steinmetz 24:04
simply make a list of everything you see on a daily basis that your business or your teams are struggling with. It could be a lack of data definition documents, a lack of understanding of certain things. Well, you know what? Carve out some time today, instead of doing development work or building out an insight. Start trying to model out things digitally. Go into Miro, create a workflow document that generally determines what your business is doing in this particular workflow. Instead of just assuming and building, lay it out, right? Do those types of things. It doesn’t always have to translate to shipping code. Sometimes it could be, I mean, it was an easy one at ShiftKey when I got in there. You know, they had already had some processes in place, and I said, you know what? Yeah, I’m going to spend some of my time building this out. But if I just went and talked to a few people, I could really, from a business perspective, model out the path from sales all the way through BI, like, all the way through, and then write it out as a business document. That’s the thing that most people miss when you’re building out teams to do things. Stop thinking like a technical line. There’s so much more to what we do than just thinking, oh, we’re building a workflow, right? Yep. Ask why, what is happening here, and uncover problems. Just because you can’t solve them doesn’t mean you haven’t found business value. You’re uncovering problems, which is what a dashboard means. So that’s what I would say to those people: don’t get overwhelmed by your actual job. Get overwhelmed by everything. Put your business stuff in place.
Eric Dodds 25:52
Yep, I love it. I love it, all right? Ethan, do I need to re-ask the question? What was the question? Oh, focusing. Yeah, yes, the data. Okay. So tell us about the interactions between your data person and your CEO, and the data person having to prioritize, you know, the core things that the CEO needs. Yeah,
Ethan Aaron 26:14
so I think there’s a few different things to say on this. Number one, for the last, like, four years, I keep going to people being like, focus on business value, not infrastructure. Focus on business value, not infrastructure. Which I think for the last four years was correct. It was like, you had gone from a one-person data team to a 10-person data team, and now you’re using 10 times as many tools. Like, we enjoy it, but that was not the right thing to do. You should focus on helping business stakeholders. I think now that most teams have realized that, the 10-person teams have shrunk, the people that are really great at what they do have expanded. But it’s like, I think the term business value is no longer, this is going to sound crazy, because I use it all the time, we’ve been talking about it nonstop. It’s like, I think you have to go one step further than that to actually think about what that means. And I think what John said is very much on point. Business value is not always revenue goes up and costs go down. Like, if you do those two things, generally you should be okay, but that’s not always aligned with what matters to a company. And as the data person at Portable, my CEO is the perfect example of how that’s the case. Like, as the CEO and the data person, if someone else was the data person at Portable, they would look at me and be like, all the stuff we did two months ago, you threw it out, and now we’re starting again on something new, and then two months from now, it’s like, we just threw all that out, and we’re starting again on something new. Like, why did our priorities change? And our priorities are changing every two months, because we got 80% of the answer to a question, and that was good enough, and now we need to move on to the next question. They compound on each other. So I think to John’s point, it’s like business value is better than infrastructure. 
Focus on business value, but when you think about business value, think about it through the lens of getting as close to your leadership team, your board, your CEO, your C-suite as possible. Their priorities, they could be KPIs, OKRs, or just personal goals, and find a way to help them with that. So, like, adding business value, adding business value daily, you don’t have to, like, to a certain extent, you can ignore data. I view data as a skill more so than a role, and if you’re really great with data, you can go to a marketing team and be like, I’m gonna add value that you didn’t even know you could accomplish, when for them it would take six months, or they might not even realize it’s possible. So, like, thinking about your job through the lens of what are the strategic priorities at any point in time. They are not always revenue and cost. Like, sometimes it’s brand awareness. Sometimes it’s new logo acquisition, even if it’s not money. Figuring out what those are for execs, and then going to them and being like, I’m going to help you accomplish that goal faster, is where I think we now, as an ecosystem, need to move the conversation. I think we’ve moved it now from stop focusing on infrastructure to focus on business value. But I think, to your point, Eric, like, there’s a lot of listeners, a lot of data teams out there. They’re like, cool, I hear that, business value. But like, what do I do? Yeah, my personal recommendation is always, find the top 10 leaders in your business. Ask them their top three biggest pain points as leaders, the goals that they are being compensated for, the goals they have to report to the board, to the CEO, and find one of those 30 goals, three goals per 10 people, that you can make an outsized impact on, and just do it. That’s my macro take on business value. I actually think I use it a lot, but I think it’s the wrong term now. I think people get that, and now we have to move on to the next one. 
The next one is: how do you actually align yourself with the strategic goals of your company and impact them?
Eric Dodds 30:10
I love it. I love it. All right, John, I monopolized the conversation, but you had some really good questions around product as it relates to data and some of the differences there.
John Wessel 30:23
Yeah, I’m excited about this. Great discussion on the business value. I do have to say that I agree, and I actually really identify with John. I went through this transition from being a deeply technical person, a data person, into an executive role. I had the ability to do my own data stuff for myself, but I also had a team. So as time went on, my team did more of it than I did, and I got, sounds like similar to John, really obsessed with OKRs, all the way down to the team at my director level, just to bring clarity on what we are doing and how we are going to measure it, and really trying to align around that. So I very much identify with that. Okay, so switching gears into this data, product, engineering thing.
Eric Dodds 31:13
I’m so excited about this conversation, yeah.
John Wessel 31:15
With Eric, head of product, here, I think to start off, for both of you all, just your thoughts on this. Let’s just start with differences. What do you feel like, fundamentally, if you’re a data engineer versus a software engineer, are the crucial differences? They’re both technical roles. Some of the skills can kind of overlap. But what are those crucial differences, just between the roles, without talking about the product yet?
Ethan Aaron 31:42
Let’s take a typical software company, so a SaaS business. I know this is going to change depending on the organization. But let’s take a typical SaaS business. Let’s say you’re a 100 person company in that SaaS business. 50 of the people could be software engineers. In a 100 person SaaS business, maybe you have three data people. Maybe you have two, maybe you have one. I would say that’s
Eric Dodds 32:02
probably the reason. Maybe none.
Ethan Aaron 32:06
I was in a company that was 1000 people, and we had no centralized data people. But what you have to realize there is, in a similar size company, relative to a software company, this is going to be different. Something like construction might not have the software engineers, but you might have the data people. You need to realize that when you compare your team structure, your responsibilities, your roles, your tech stack, and your approach to development, you’re not comparing your one person data team to your 50 person software engineering team. You’re comparing your one person data team to a company that has one engineer, right? Even the idea of a state like that. So if you have one data person and they call themselves a data engineer, I think even that is wrong. I would just call yourself a data person. Like, you can’t differentiate between a data engineer, a data scientist,
John Wessel 33:02
like, you’re just an analytics engineer.
Ethan Aaron 33:06
Yeah, you do not have the other people, so you are just a data person. Like, at Portable, for two and a half years, there were two of us. We had me as the CEO and my CTO. He was our one engineer, and I was everything else. And when you think about the tooling, the processes, the way you have to think about the world, when you have one engineer, it’s so wildly different than when you have 50 engineers. So a lot of data people today are drawing these analogies to product and engineering orgs that are 30, 50, 100 people. There are less than 100 companies in the world that should have a 50 person data team. That is such a massive investment in just data, at 50 people, let alone the engineering orgs of 1000 or 10,000 people that have to be shipping really complex things. So when you think about drawing analogies between the two, I think the best analogy and the best framework for data teams to model their tooling, processes, and software development life cycle on is a startup engineering org, from day one with one engineer up through 30 engineers. Look at how those teams are constructed, how those teams ship, how they focus on what’s production versus what’s just get it out the door fast. And unless you’re at Google, Meta, Amazon, or OpenAI, your data team should never be looking at an engineering org that’s bigger than 50 people and saying, I need best practices from that. So my perspective on a lot of the analogies is: there is an analogy there. It’s just not the analogy most people think it is.
John Wessel 34:54
Yeah, I tend to agree on that. I think the one thing that will be interesting is this. I mean, that makes a ton of sense for SaaS, but SaaS is in the business of building software. So of course the people that build the software are a big chunk of the head count, and data is more supporting that initiative or effort. But do we see more and more companies get into the business of, like, data is what we do? We do, I don’t know, benchmarking or something, and all we do is ingest data from a bunch of different places, and we’re basically just in the business of data. You would think in that scenario maybe you have more of that framework where you’ve got data product people and data people split out into multiple roles. However, at that point, you’re still kind of just mirroring product and engineering. They’re all engineers, so it kind of doesn’t matter, right? So, anyways, John, I want to get your take on this as well.
John Steinmetz 35:55
Oh yeah, this is prime for what I love to talk about, right? And I’m going to take a little bit more of a technical approach to this. First and foremost, I would say that data is what most companies are doing today. The software is just something that moves data or allows entry of data. Like Netflix: you would think, oh, it’s a streaming service company, but you look at their stock price, and it mimics the amount of data flowing through their system. I mean, those kinds of parallels are everywhere. I look at the difference between data engineering and software engineering, and I’ve done both, so I can speak to this a little bit more closely. Think about building a house. You have very defined specs. You have very defined rules for what you’re building on a house, right? That’s software engineering. You have things that have to be done a certain way for it to work. Data engineering is more like, how does the water move through the house? What temperature does it need to be over here versus over here? Hot, cold. There are nuances to the workflows, but you’re essentially moving data throughout the software that’s there. So fundamentally, it’s very different, because you typically only have a couple of people focused on that. Software needs more precision, because you can’t have anything break, you can’t have anything that doesn’t work, whereas in data it’s a little different, right? You’re looking at the outputs, as opposed to the entirety, so you have a little bit more flexibility. I find data engineering is more configuration than it is anything else at this point. As a data engineer, we’ll have 10 tools to build our stuff, whereas a software engineer might have one, right? They build structured software, they put it in Jira, they push it up, they get a peer review. There’s a specific thing that is there.
Whereas for us, we can choose Portable, we can choose a different tool. We can choose to have a myriad of tools that have no bearing on the rest of the company. They don’t care. It becomes a business decision. Do we want this little thing? Do we want to buy it? I mean, to your point, it is truly more product driven than, say, software development. So that’s the way I kind of look at the difference from a software engineer. I mean, you look at LookML, or you look at these middleware languages, right? It’s basically all the same stuff, just a different flavor with different features, and you have to learn. In software engineering, if you’re, say, a Java developer, you know Java, and you’re good at it, and that’s what you build in. Whereas in our world, I may choose R for a project, I may choose Python. I’m choosing the right tool for what I need to build, for what it can do. So I think we have, as data engineers, more flexibility in our day to day than, say, a software engineer would.
Eric Dodds 38:50
That’s a great point, John, especially with some of the new architectures. I mean, we’ve talked about Iceberg a couple times on the show. Technologies like that are really accelerating this ability to say, you know, you can actually pull this apart and use whatever tools you want, and you don’t even have to move the data. It’s really interesting from that standpoint. Whereas your software is built in Java, you know, or Go, and it’s like, okay, guess what, software engineer, we hired you, you’re gonna write Rust, or whatever. So, yeah, that’s a super interesting point.
John Steinmetz 39:34
Yeah, but data contracts, right? Just to that point, data contracts are essentially what allow software to interact at those touch points. So you think about something like a microservice strategy in software development. You know where that came out of? Looking at how data did their stuff, being as flexible and simplifying it so that there’s a data contract between two pieces and you could interchange them. And that’s why most companies are moving to a microservice strategy, because it says: hey, I’m going to use Mongo over here, I’m going to use Postgres over here, because some databases have better functionality for operational purposes. Instead of using a monolith, now they’re moving to smaller ones. And in my opinion, and some people might argue about this, I think they got those benefits from the data world. They saw what we were doing in those workflows, that we could interchange and inject and move things through, and now all of a sudden they’re doing that, because they realized technical debt was getting to be such a burden, having to restructure and rebuild their entire platform every four to five years. Now they don’t have to do that, because everything’s an API. I’m going to collect data from here, I’m going to send it over here, and those data contracts are so critical.
John Wessel 40:45
Yeah, yeah. I really like your analogy of the water and the plumbing in the house. In my previous gig, we were in the specialty plumbing and water industry, and I’ve thought about that analogy a lot, because it’s actually really complicated. There’s the basics of, like, you don’t want it to leak, right? But when you get into what region you live in, and what your water source is, and what you want to filter out of the water, then you’re getting down to parts per million. You’re not ever getting perfectly pure. Very rarely do you have perfectly clean, pure water, like distilled water. It tastes gross. Nobody wants that, typically. But when you’re going through it, you’re like, all right, I’m gonna get the chlorine down to this level and the iron down to this level. And that’s so much more similar to data, in that we’re not ever going for perfection. We’re going for this manageable level of accuracy, the right level of accuracy for the business decisions that have to be made on the data. And that’s so different than, hey, we framed this in, we used the exact spec screws and wood and drywall, et cetera.
Eric Dodds 41:51
The big takeaway there is, if the CEO says, I want this report to be 100% accurate, you now have license to say nobody wants that. It tastes gross, exactly
Ethan Aaron 42:05
like so. So the slightly different take on the water analogy is that data, in most scenarios, is internal facing. Your job is to get the CEO, the CRO, or the CMO the information to do their job. So let’s say you have an office building. If you know that 90% of the people in the office building all work on the second floor, not the first floor, you need more faucets, you need more toilets, and you need to have them dispersed on the floor in a way that actually makes it so those people don’t have to walk down two flights of stairs and back, because otherwise you’re just wasting everyone’s time and you’re not doing your job. So it’s your job as a data person to not put 1000 faucets in the corner, all next to each other. It’s like, what’s the right number of faucets, given the positive, is
John Wessel 43:03
Is a faucet a dashboard in this analogy?
Ethan Aaron 43:07
Okay, but that’s a really interesting point, because it’s one of those things where, if you have too many, you just spend way too much money and way too much time on stuff that’s not being used. If you have too few, everything gets delayed. So I think a lot of data teams think about, like, ooh, is it a shiny faucet? Is it not a shiny faucet? What’s the parts per million? When, in reality, they need to take a step back and be like: what are the people doing on this floor? Where are the guests? Where are the meeting rooms? When do people have lunch? Oh, it looks like we need a faucet in the kitchen. And I think it’s difficult for data people in a lot of companies to watch everyone for a little bit of time to figure out
John Steinmetz 43:54
Observe. Observation is probably the number one thing I could tell a new data person coming into a company: observe everything.
John Wessel 44:06
Yep, yep. And to John’s point too, one more. I love the plumbing analogy, so I like this one.
Eric Dodds 44:13
But this is like a juicy new post. You know, 1500 words on data
John Steinmetz 44:18
status.
John Wessel 44:19
I’ve thought of this post. Like, it’s in my head somewhere. But there’s the other thing with the tooling that you mentioned. In the plumbing world, you’ve got point of use, which is like, I’m going to do an under-sink filter, or a refrigerator filter. You’ve got whole house, where you can treat your whole house. And you’ve got the municipal level, where you’re doing a whole city. And data can really be somewhere where you have this big, centralized team and they’re doing everything together, and maybe that’s right. But sometimes it’s like, hey, we just need a little point of use, like a refrigerator filter, right here, and it services, you know, five people, and that’s great, and we would have way overbuilt if we made the change up at the municipality.
Eric Dodds 44:55
In the next episode, we’re going to complete this analogy by figuring out the data corollary to the guy who wheels in those big tanks on a hand truck and puts the water thing on, and it’s not connected to anything.
Ethan Aaron 45:13
Goodbye. But my favorite part of this whole analogy, and then I guess we can get off the water analogy, or we can stay on it, is self-service analytics. I think most companies go way overboard and build things that everyone can do, and that is a very controversial take. But from a water perspective, taking the analogy one step further, I think what you should do if you’re building plumbing for a floor in an office building is give people a faucet, and let them fill up their glass and bring it back to their table. You built the faucet. They can use it. They can go drink the water at their table. Or buy a Bevi machine. Let them pick their six different types of juice, and they can walk it themselves to their desk. Self-service analytics is like putting a miniature faucet on everyone’s desk. I think that’s stupid. You don’t know how many people are going to be there. You don’t want to have too many or too few faucets. Pay a little bit more for 12 packs of Diet Coke that sit in the fridge. Let the people deal with a soda dispenser on everyone’s
John Steinmetz 46:18
Yeah, kitchen, yeah.
Eric Dodds 46:21
What’s worse is that the faucet at everyone’s desk has, like, 19 knobs that change things, and, you know, they don’t really know how it works, but,
John Steinmetz 46:30
and everybody’s making decisions about the temperature of the water, and they’re doing different temperatures. I love it.
Eric Dodds 46:36
I love it. Okay, we have time for one more question here. I feel like we could keep going for hours. I’m gonna break the analogy, so I’m so sorry. But one of the things, if you think about it: the pipes and the actual hardware that are interfaces for water, in some ways, those are much more similar to software, in that they need to reliably handle scale, they need to be robust, and they operate largely the same way every time, in an ideal world, right? And data is pretty different than that, in the products. And Ethan, what made me think about this was you saying, okay, we were starting over every two months because we got 80% of the answer to a question, right? And so the product can actually look very different at different phases. Now, okay, maybe you have a KPI dashboard that is durable, and there are some really good things there. But I’d love really quickly to get all three of your takes on this, so going for the triple threat here, on the differences between the definition of product in software and in data. So John, why don’t we start with you?
John Wessel 47:54
Back to our discussion, I think you’ve got the team size thing. What I would say is similar to what Ethan said. If it’s a very small team, if you have one engineer, or one data person, probably at different companies, because they don’t typically scale one to one, then it’s similar. You have to have some sort of at least slight product ability as a software engineer or a data person, and some sort of slight design ability, at least enough to communicate out to people. Those are more of your startup, small company, unicorn type people. So I think it’s similar. It just depends on the size. Ethan?
Ethan Aaron 48:41
There are a few more, but I think there are three main ways people are building products with data. Number one is dashboards. Dashboards are really just helping execs make decisions, and there’s two different ways to think about the dashboard. One of them is a pipe that’s continuously flowing water. It’s not going anywhere. It’s going to be there for years. And then the other one is just a one-off answer to a question. But I put both of those into the dashboard and insight bucket. That’s the read-only use case. That’s just like, hey, I’m going to get you the insights so we can make the best decisions possible on strategy. The second one is workflow automation. This is a manual task where we have to move data from point A to point B, and right now it takes 10 hours a week. As a data team, you can use an iPaaS tool, you could use RudderStack, you could use any one of them to take data from point A, transform it, and put it into point B, and the goal of that is to remove manual tasks and productionize things. The third one, which I’ve actually been seeing more of recently and find fascinating, is marketing. I’m seeing more in-house data people either changing their roles or being hired into companies that have unique data assets and using them to create public-facing insights about their own data, to create benchmarks. Carta is doing a great job of this. It’s got Peter Walker over at Carta, who is able to look across every startup’s fundraising, et cetera, and he’s using it to create insights that then drive people to Carta. And Matt Shulman over there, the CEO: they do compensation and benefits for companies around the world, and they have a unique data asset of benchmark salaries and benchmark benefits. And similarly, their data team is not building internal insights for strategic decision making. They’re not automating workflows. They are creating insights that show the world that they have either data products
Eric Dodds 50:44
Internal versus external. Yeah, it’s a real external product, yeah.
Ethan Aaron 50:49
But those three use cases, I would boil most data teams down to that. And if you start bleeding into most other stuff, not in a bad way, but you’re either a data team and a software development team, or you’re doing something else, or you’re a marketing team that has really smart data people in it. None of these are a problem. Just think about how your company is structured. But I think for people with data skills, it’s those three use cases.
John Steinmetz 51:12
Yeah,
Eric Dodds 51:13
One quick comment on that, which I think is a really important common thread: each of those use cases has very well defined consumers. The middle one, you probably have multiple consumers, because you’re getting data into a ton of different systems, right? Product, customer success, finance, whatever. But for each of those that you called out, it’s crystal clear that there’s a consumer on the other end, even for the external one. So that’s great. I really like that. All right, John, you get the final word.
John Steinmetz 51:50
I agree with Ethan. I’ve joined a couple of companies and started data teams from scratch, right? Observing what the companies need and what the companies are building, there’s a couple of paths that I would have taken, or that I have taken. Definitely, running things as a product, understanding that business component of what is expected of the data team, that is primary. You gotta know coming in what the expectations are. I’ve seen it where somebody comes in and they really don’t even know what they’re supposed to be doing, because the business just says, we need somebody to do data. It’s very rare that person is successful in that role, unless they take that approach of, hey, yes, I can do this, but I’m also going to be aware of what the company needs. Now, there’s also something we haven’t talked about, which is capitalization, right? As a data leader coming into a new company, you are immediately a cost center. Immediately, right? So you have a target on your back. When cuts need to be made, you’re the first one to go, right? Or your team is going to be primarily on that cut list, and they’re going to shrink it down. Now, the way you combat that is: build one thing for a customer within the product, serve one thing, and then all of a sudden your work becomes capitalizable. And what I mean by that, for people watching that don’t understand the business aspect of this, is the government will give money to your company as a kickback on part of your salary and the work and the tools you build, as kind of an incentive to build more stuff, right? They incentivize that. If you are building only for internal purposes, you are 100% not capitalizable. So start looking for ways to get in front of customers. Get your dashboards in front of customers, get your reports in front of them, right?
Build what I call exception reporting, so that you can create these things within the product. Instead of somebody having to look through 1000 things, they know the five things they’re looking for. Serve those up, and all of a sudden you could conceivably become a capitalizable asset, right? That’s changing your mindset from purely technical to leaning into product. You really need to ask, why am I building this? Not just, I need to build this, right? And being able to say no is a very important part of that prioritization component.
Eric Dodds 54:20
I love it. All righty, well, we are at the buzzer, as we like to say. That was such a great conversation, and there were many things that we did not talk about, so we’d love to have you all back again soon. We need to talk about integrations. We need to talk about your analytics framework, John. So we’ll have Brooks find another time for us to have you back on in the next couple weeks.
John Steinmetz 54:43
Love it. It’s amazing.
Ethan Aaron 54:44
Really enjoyed the chat.
John Wessel 54:45
Thanks guys.
Eric Dodds 54:47
The Data Stack Show is brought to you by RudderStack, the warehouse-native customer data platform. RudderStack is purpose-built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.