This week on The Data Stack Show, Eric and Kostas are joined by a panel of data experts to discuss everything to do with building a modern data team. From definitions to core components, this is an episode you don’t want to miss if you’re trying to start or improve a data team within your organization.
Highlights from this week’s conversation include:
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 0:05
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com.
Welcome to The Data Stack Show live stream. This time, it’s all about teams. So I hope I won’t disappoint you, but we are not going to talk about technology, or at least we’re going to try to avoid it, and we are going to focus solely on the people behind the technology. Incredible panel today. Really incredible. So, Kostas, I think one of the things that I really want to ask this huge, you know, repository of mindshare is about the best way to form a team, especially when data teams are sort of nascent, right? Because it’s rare that a company, you know, just says, okay, we didn’t have a data team, and then poof, now we have a data team. That happens over time, and it’s generally iterative, right? Someone starts doing the job, and then the role grows, and then a team just kind of forms organically. And it can go a lot of different places. So I want to ask about their experience seeing that happen and managing that process, and see what we can learn about how to do that really well. How about you?
Kostas Pardalis 1:32
Yeah, I want to ask them about the individual roles. Like trying to identify which one is the backbone of a data team. We have so many titles, and we are still creating new roles: we have analytics engineering, we have the traditional data engineering, and in the past we had the DB admin, whatever. So many different roles. And I’d love to hear about the dynamics and the structure of a data team and see how they understand it, but also what the career path is for each one of them, right? Like how you become a data engineer, and whether it’s a different way to become a data scientist or whatever.
Eric Dodds 2:23
Totally agree. Well, let’s dig in and learn everything we can from these brilliant people about building data teams.
Kostas Pardalis 2:30
Let’s do it.
Eric Dodds 2:31
Welcome to The Data Stack Show live stream. This is one of our favorite things to do, and I have been looking forward to this one for a really long time, in part because we’re going to take a break from talking about a bunch of data tooling and technology and get at what I think is probably the most important subject, which is people: the people behind the data, the people working on data. And we’re going to talk about that in the context of teams. And it’s going to be great. So let’s just do some intros and then we will dive right in. Paige, do you want to start off?
Paige Berry 3:09
Sure. My name is Paige Berry. I use she/her pronouns. And I’m currently enjoying a month of funemployment between jobs, before I start my new role as a data analyst at dbt Labs, which I’m very excited about. Previously, I was a data analyst on the data team at Netlify. And before that, I worked at New Relic as essentially a full-stack data person in a couple of different departments there. So that’s me.
Eric Dodds 3:38
Wonderful. Sean, you want to pick it up next?
Sean Halliburton 3:41
Sure. Hi, I’m Sean Halliburton. I’m currently Principal Cloud Data Engineer for REI’s consumer insights team. And prior to this, I was a staff data engineer with CNN’s data intel team. And before that, I built and managed the Clickstream data engineering team for Nordstrom, as well as the test and learn platform.
Eric Dodds 4:07
Awesome. And Sri.
Srivatsan Sridharan 4:09
Yeah, hi, folks. My name is Srivatsan Sridharan. I go by Sri. I’m also in between jobs right now, but most recently I was at Robinhood, building and leading their data infrastructure organization. And prior to that, I spent close to a decade at Yelp, where I was doing similar things: data platform, data engineering. I’ve been in the data space for about a decade, as an engineer and then as a manager. So really excited to hang out with all of you today.
Eric Dodds 4:37
Great. Well, let’s just dive right in. One thing that we love to do on The Data Stack Show is go back and look at what we kind of call first principles around terms that we tend to take for granted. And one of those terms is data team, right? And, you know, all of you have worked on data teams, built data teams, et cetera, but it’s one of those terms where if you just asked someone on the street, hey, do you know what a data team is, they would say yes. And then it’s like, well, could you give me a definition? And they’d sit back and think, well, it can mean a lot of different things, depending on the company, etc. So let’s just start by having each one of you define what a data team is, from your perspective. And then we’ll dig a little deeper, and we can just go in reverse order. So Sri, did you want to kick us off?
Srivatsan Sridharan 5:25
Yeah, yeah, definitely. I think the debate’s still out there, right? I think different people perceive this differently, and different companies structure this differently. But from my perspective, fundamentally, it’s a cross-functional department that is responsible for making sure that the business moves forward based on data. And so I would assume that a data team’s primary responsibility is to make sure that decisions are taken in a scientific way, in a data-driven manner. And that requires a multitude of disciplines, from the people who are modeling the data, to the people who are writing algorithms, to the people who are visualizing the data, to the people who are running the infrastructure that makes this data crunching and slicing possible. And so I truly do think it’s one of the disciplines that requires multiple stakeholders to work together towards a common mission.
Eric Dodds 6:20
Love it. So many questions, but we’ll get to them. Sean, you want to go next?
Sean Halliburton 6:30
Yeah. Sri, I like the cross-functional part you threw in there, because a data team can be anywhere from just one to two people in an analyst role, to analysts and data scientists. You might even have a technical program manager; often you don’t. But that team really sits at the confluence of a number of other teams, because you typically have a business stakeholder asking questions that need to be data driven. And the data team is the principal researcher to get the answers for those questions, based on any data set they can get their hands on. Frequently, just getting a hold of that data is half the challenge. But as that cross-functional person, it’s on you to make those connections, to be able to do the research and come back with as definitive an answer as you can.
Eric Dodds 7:31
Love it. Paige?
Paige Berry 7:33
Yeah, I think a lot of what I would say has been covered already, which is awesome. I definitely think about a data team as related to what the mission of a team doing this work would be. So when I started at Netlify, we had a mission for our data team that was like: helping the business to make decisions by providing timely, accurate, and actionable insights. And so when I think about people with various skills and various talents around working with data coming together with that focus in mind, that is a lot of what I picture a data team being.
Eric Dodds 8:17
Let’s dig down and get sort of one click deeper and more practical. I would love each of your perspectives on this: data team is, I think, a term that we use to describe the functions that each of you talked about, but a lot of times the actual team name is more specific. And then I would say, and it’s interesting to think about whether this is a good thing or a bad thing, the functions that you talked about often can actually be separated out into different parts of the organization, even though they technically sort of functionally roll up to, you know, delivering data as the final product. So let’s talk through that a little bit. Paige, do you want to speak to that? Because if I work on a data team, that could mean specifically data engineering, data analytics, you know, there’s data infrastructure. There are lots of things. What are your thoughts on that? Are you seeing more centralization happening?
Paige Berry 9:26
Yeah, it kind of depends. I’ve worked at a couple of different SaaS companies, and there were different models for each one. At one that I worked at, there was sort of a central data team, you could say, that was really mostly focused on a lot of the data engineering and analytics engineering kind of roles: getting the data from various sources, putting it into a central data warehouse, and making models that the rest of the company could use to get insights from. And then I worked on what was almost like a sub data team, which was the product analytics team. We were mostly just data analysts, and we did some analytics engineering, of course, too, but mostly focused on: okay, there’s the data warehouse, there’s all of the data from the company, let’s see what we can pull out of there to make insights that help the product org make decisions. And then at Netlify, we were like a central data team, where all of the data came into the data warehouse. We had a data engineer who helped bring those data sources together, and we did analytics engineering and data analysis also as part of the team. We had like a hub, which was a few of us who kind of worked on cross-organization functions. And then we had some data analysts who were spokes, who would work specifically with the product org or a particular side.
Eric Dodds 10:51
Got it. I think we talked about this, actually, when you were on the show previously. I think that model is becoming more popular, but it’s, I think, the exception as opposed to the rule, and fragmentation seems to be more common. Sri, you’re giving a very affirmative nod there. Do you want to add to that?
Srivatsan Sridharan 11:10
Yeah, what Paige said totally resonated with me, because a lot of companies do have it fragmented. And I think there are pros and cons. So at Yelp, for instance, the data science function rolled into product, and then the rest of the functions, the engineering-oriented ones, kind of rolled into engineering. The benefit of that was the data scientists were sitting in product, and therefore they were able to influence product managers to use data-driven insights for decision making. So, you know, when you’re reporting to the same boss, so to speak, things move faster, right? But then there are cons as well, which is that now you have all of these different teams with different leaders and different priorities, and you have to align them. So there are definitely pros and cons to either structure.
Eric Dodds 12:01
Yeah. Sean, would love your commentary on that. And maybe you can take it a step further and give your opinion on: is there an environment where one of those works better? Are there certain company structures or business types where those matter? Because it’s not a right or wrong.
Sean Halliburton 12:20
Right, right. Yeah, we talked a little bit about patterns in our initial warm-up chat this morning, right? Two dominant patterns that I see in the industry are, I guess I will call it, chaos and confederation. A lot depends on the age of your company. If you’re a younger company, you know, first-, second-, third-stage startup, it’s a lot of chaos, because everybody is shifting roles and tasks so quickly. Whoever ends up in that data seat, it might be an ad hoc decision, and it might be a very short-term one. Whereas in an older company, and I’ve worked mostly with older companies, my experience has been that you have very separate and siloed components between the data engineering team and the database admin team. And then on the other side, you might have the actual analyst and data science teams. And there’s that wall. You know, some separation can be okay, and it certainly can work. I think the most important thing is trust, and a spirit of empowerment, and wanting to enable self-service, so that if either side says this isn’t really working for us, then both can come together to identify a quick resolution that gets everyone back up that acceleration curve.
Kostas Pardalis 13:59
You all have worked in diverse types of companies, right? We have everything from CNN to really extremely fast-growing startups. Do you see a difference in how data teams emerge, and also how they evolve, depending on, let’s say, the pedigree, or where the company comes from? Is it a company that has existed for like 50 years, and they have realized that, okay, we have to go and implement a digital transformation program, and let’s go to Gartner and hear what the prophets there have to say about what the right approach is? And on the other hand, you have companies like Netlify or Robinhood, which were mentioned previously, and I guess from day one they start with some kind of data-driven mindset, right? So do you think this is something that affects the way that data teams are formed, or how they evolve inside the company? And we can start with... Ooh, I feel very powerful that I can do that. Paige, you go first.
Paige Berry 15:10
Okay, yes. Yeah, so my experience has actually been with SaaS companies that have some similarities and are not a whole lot different in terms of how long they’ve been around. New Relic has been around longer than Netlify by some amount of time. So my understanding of how things have evolved might not be as broad as some other folks’, and I’m really interested in hearing how a company that’s been around for 50 years puts together data teams. But yeah, I do think that there is a difference between a company that starts out knowing, okay, we’re going to have a data team, and we’re going to put one together really early in our lifecycle and really think about data from the beginning, versus another company, where I think there were a lot of ideas like: well, we have data as part of what we do with our product, and our software engineers are smart, and they can query the data and kind of figure things out for themselves. For a while, that can actually work. But then there’s a point, I think, where, okay, we actually need to really start codifying and identifying: what are the actual metrics? How do we really define them? Let’s sort of corral the chaos. So that’s what my experience has been like, seeing how that has evolved.
Kostas Pardalis 16:29
Sri, you are next.
Srivatsan Sridharan 16:31
Yeah, I think some of the newer companies definitely have an advantage because the technology has advanced so much, right? Today, if a startup were to bootstrap a data team, they have all the tools at their disposal to get going, which can be harder for a company that’s been around for, let’s say, 50 years, because they probably already have processes and data debt, and while reconciling all of them, they have to build a system that scales from day one. Whereas in a startup, you don’t have to build a system that scales from day one, and you can organically scale it. So that makes it easier. But I think one of the challenges that smaller startups face, and I’ve definitely seen this both at Yelp and Robinhood when they were growing, is you don’t have mature processes for governance of data, and for structure and modeling of data, which a larger company might have, because they’ve been around the block for, you know, decades. So definitely each of these companies faces different challenges based on where they are. But I can definitely see that in the last seven to eight years, it’s been much, much easier to bootstrap a data team than before.
Kostas Pardalis 17:36
Mm-hmm, that’s great. Sean, I left you last for a reason, which is that you can tell us how it feels for a company that is at least... I don’t know how old CNN is, but I’m pretty sure it’s older than...
Sean Halliburton 17:49
Yeah, CNN is 50 years old. It’s still a toddler compared to Nordstrom. But yeah, and again, that’s a great point about governance and existing processes. That’s often a reason for the divide you might have between the data team and the teams that provide the tooling to that data team: just the simple idea of, well, we’ve always done it this way. You know, contrast that to a startup where, in a way, not having that history can be a luxury. You can’t say, well, we’ve always done it that way, because we haven’t been doing anything for very long. Again, either can work as long as there’s a willingness to break things in a controlled and smart manner, a willingness to experiment and try new things, versus a bias toward consulting. And Kostas, you brought that up. That’s definitely a thing at older companies, where, you know, your management team might be older and have long-standing relationships with other business leaders in the community that come in and out of the consulting scene. And again, sometimes they feel like they need to bring in that outside third party to come in and shake things up and break up that history of we’ve always done it this way. Well, why shouldn’t a data team be able to ask the same question? Why shouldn’t we be able to kind of pull the handbrake and say, well, wait a minute, can we stop for a second and evaluate these other opportunities and tools that might be out there, many of them for free?
Kostas Pardalis 19:42
This is great. So we recorded a very interesting episode of The Data Stack Show, which I think aired like two weeks ago, with Benn Stancil, you probably know him, from Mode Analytics, and he said something super, super interesting. At some point he said that when, let’s say, the data industry started, we usually say that the catalyst for this was the cloud data warehouse. But that’s from the technology side of it, right? From the organizational side of things, which is also important, because if we don’t have that, technology doesn’t do much anyway, there was this moment in history where data analysts became, let’s say, business partners: we wanted the business to make data-driven decisions as much as possible. And what is very interesting is that the pioneers in that were gaming companies, like Zynga, for example, right? I found this super interesting, because it really clicked for me that, yeah, we focus a lot on the technology, but we keep forgetting the human side of things, which also has to be there, right? The reason I’m saying that is because we take for granted today that data is, let’s say, important for making business decisions, and we probably consider data teams as business partners, right? But I’d like to ask, through your experience so far: what was, let’s say, the first business need that forced the organization to form a data team, or to start having people who work full time with the data and give insights? And this is more about the startup environment, obviously; I think things are a little bit different when we are talking about established companies, but we’ll talk about that soon. I’d love to hear your experiences and what you have seen there. So, Sri, do you want to start first?
Srivatsan Sridharan 21:49
Yeah, I can go first. I think what I’ve seen in my past experiences is that the first set of questions really comes from business metrics, where the executive team is saying, I want to know how the business is performing. And when your data sets are small, that’s a query that an engineer or a product manager can fire on top of your production database. And, sharing this from an infrastructure point of view, things start to get interesting when the scale increases and you can no longer query your production database, or you start to have, you know, tons of metrics with different product lines being launched. I think that really becomes a catalyst for a dedicated function, because you need to extract the data, store the data, organize the data. Until that time, you can probably make do with just querying your database and getting what you need. So I think it really starts from the executives wanting to know more about the health of the business and predict the future of the business, and then that ends up being the catalyst. That’s kind of what I’ve seen.
Kostas Pardalis 22:50
Makes sense. Paige?
Paige Berry 22:52
Yeah, my experience with this would probably be around when I started working more with product analytics at New Relic. There was a lot of emphasis around product-led growth. And to really make progress in that, we had to understand a lot about customer journeys and, you know, essentially who is clicking what, where, and at what point they decided to pay us money. That need for data was different, because that was really starting to drive decisions on: what are we going to build next? What are we going to put our attention towards? How do we make this work, like get our engine online in that way? So that was my initial experience of saying, hey, we need this data, and we need to be able to look at it this way, and then we’re going to start actually making decisions based on what we’re seeing here. Yeah, that’s mine.
Kostas Pardalis 23:53
Sean, what’s your experience?
Sean Halliburton 23:56
I have walked onto existing data teams wherever I’ve gone, interestingly enough. I have not put in time in that startup scene, at least not yet. But I imagine a lot of companies start out with just the simplest data integration between a back-end database and some paid SaaS platform, so that executives can try to run their own reports as best they can without having to hire a data person yet. But then inevitably, you start offering more products, you start developing more questions, and, you know, additional questions branch off of those questions, depending on what your business is. And those third-party platforms only scale so far, because they’re built for the masses, right? My experience, both as a manager and as an engineer, is that you can submit as many feature requests to those products as you want, all day long, but until enough other users and clients ask for the same thing, you’re not going to get it. And if you do, you’re going to pay through the nose for custom development hours. So there is an inflection point where those tools start failing to answer the questions, it takes too long to get answers, or the reporting interface is too rigid and doesn’t work with your proprietary data set. And that can often be the driver for scaling up a data team from, like I said, those one or two initial heads, which probably fall under the business side, not the technology side, to a more cross-functional, more empowered team with, you know, better tooling underneath. But then again, if you don’t have the right people seeking those answers, people who know the data and know how to work with it, then the tools really don’t matter.
Kostas Pardalis 26:08
So I have a question, because you said that you have experienced so far more of the already established data teams out there. From my understanding, at least, there are a couple of different roles that you can identify as members of a data team, right? From BI analysts to database admins, which today are probably called something else, to data engineers. Many, many different roles, right? If you had to identify one role as the backbone of the data team, which one would it be?
Sean Halliburton 26:49
Great question. I think it starts with a great analyst and a great manager, and everything layers on from there. As your analyses get more complex and you start relying on more data sets, you need to bring in technical reinforcement, so you might add on your first data engineer. As your outputs get more complex, you start making the leap from analytics to data science. But typically, it always starts with the analyst role, which is probably the right place to start, because you have to answer those first fundamental questions based on the data sets that are getting you the fastest answers. If you’re lucky, you can add on a technical program manager to help keep all those roles in sync as you scale up to, say, half a dozen analysts and then three to six data scientists or even more. And if you’re really lucky, you will have a program manager not only for your analytics workstream but for your data platform itself. We were really lucky to have that at CNN; it was a major asset. And it’s all part of that three-legged stool between your engineering management, your program management, and your product management. To me, that’s the ideal of a fully scaled data team, with in-house technical capabilities on both the data engineering and the analytics and science side. You’re taking in requests from your stakeholders on a weekly or biweekly basis, your product manager is meeting with them regularly and is able to speak to all the technical issues that affect your workstream, and your engineering manager, again, is building not only the platform to support all of it, but the people and the technical careers that underpin it all.
Kostas Pardalis 28:59
One extra bonus question for you, Sean, because I kept thinking about that while you were talking. In these large organizations, what’s the role of traditional IT relative to the data engineers or the data teams out there?
Sean Halliburton 29:19
Yeah. So I think the answer to that is kind of twofold. It’s a little bit further out to the edges. By that, I mean IT will likely still own the transactional databases, at the same time that they are moving further out into the new frontiers: managing the enterprise message buses, your Kafka clusters, managing your more advanced database platforms, Redshift and Snowflake and similar cloud data warehouses. As you called out earlier, yes, absolutely, those were great advances. And they were super empowering, in the same way that AWS itself has been super empowering, to less technical teams. And I use that term loosely; I mean, every team can be a technical team, if that’s what you decide you want to be. But those cloud data warehouses really empowered the analyst teams and the data science teams to take so much more into their own hands: to spin up their own servers, to deploy their own code to those servers, to run your batch transforms, to run your secondary ETL, similar tasks like that. But yeah, I think traditional IT will always have a place in owning that very first copy of the data, so your transactional databases, and putting the governance in place so that as you make your own working copies of that transactional data, you do it in a consistent and controllable manner. Evangelism and education is a part of that: helping to coach other teams on, you know, the best way to create your own production clones, how to configure your own blue-green deployments in your own data flows, things like that.
Kostas Pardalis 31:23
Yeah. That’s awesome. All right. Paige, let’s go back to the backbone question. So what’s your take on that? Like, what have you seen out there?
Paige Berry 31:35
Yeah, I actually have a pretty similar answer, maybe partly because I’ve tended to be in the data analyst role more recently. But there’s a point where you can have so much data in your data warehouse, and you can have beautiful models in there, but if you don’t have that last mile, if you don’t have the way to communicate to the stakeholders and the business folks what the data is saying, then you’re not fully unlocking that value. So that’s one of the things that I really see as a necessary piece of this whole data process. Although I also have an additional thing that I think is interesting. It’s something that I’ve learned from Emilie Schario, who was my manager at Netlify originally. She talks about how when you have a new data team, maybe at an earlier-stage company, and you’re just trying to figure out how to quickly get value from your data, you start with the operational analytics piece: getting data from the source to the tools that the company is using to actually make their decisions. So getting data into Salesforce, or getting data into whatever marketing tool you’re using, so that folks can actually act on that data quickly. And that’s something that could be done by a data engineer or an analytics engineer. So that’s another piece of why it’s kind of a tie, almost. Early on, it sort of depends on what level your company is at and what you really need first. But for me, it’s whatever is going to get value to the business as quickly as possible.
Kostas Pardalis 33:23
Yeah, makes total sense. Sri, what’s your experience?
Srivatsan Sridharan 33:28
It’s a really spicy question. I have to take sides. No, no, I’m just kidding. The way I see it, I feel like there are two types of companies, broadly speaking. There are companies for whom data is their core product, and then there are companies for whom data adds business value. If you take the first category, think of any company that has an ads business, right? Data is their main product, because they need to figure out who clicked, who are we targeting, who are we charging for the clicks and impressions. In those organizations, I think influence is wielded by the people who are making it happen. So it could be the product leader, it could be the engineering leader, it could be the architect, because the technology is lagging behind and needs to catch up, right? So the technology ends up driving that. If you look at the other category of companies, where data adds value to the business but is not central or critical to the business, I think the most important role is the person who’s sponsoring it, because oftentimes I feel like data teams have to fight an uphill battle, because the business might refuse to, you know, take their insights. We’re all human. We all have confirmation biases. We all want to think we are right. And so I’ve seen a lot of times where business leaders will say, I don’t care about the data, we’re still going with this decision. So for a data team to succeed, the person who can push back on the business and be the champion of truth seeking ends up being kind of the most influential. In smaller startups, that could be a manager or one of the early engineers or early analysts; in a larger company, that could be the director or VP-level person. But I think that’s who I see as the influencers who end up making sure data succeeds at an organization.
Kostas Pardalis 35:23
Yeah, that’s super interesting. That’s a very good insight that applies not just to data teams, but to pretty much everything. We all have to be a bit of a salesperson in our life; whatever we do, we need to sell it internally. And I think that’s something that everyone who has a career in tech has to learn at some point. Even if you want to be, let’s say, the most individual contributor that you can be in engineering, you still somehow have to sell your work and convince the people around you that there’s value there. And this is obviously much more important for something like data, because people think that data is this binary thing, that it either says you do this or do that. But that’s not the case. Data is there to shelter you from your own biases and your own intuition, so there is always fuzziness in whatever is delivered there. Anyway, that’s the topic of another episode.
I’d like to hear a little bit more about how someone can build a career in a data team, how someone can get into a data team, and what you have seen. First of all, it would be great to share how each one of you personally got into that. Did you start by saying, after college, Oh, I want to be a data scientist or whatever, or did something happen there? So share your personal stories first, and then we can discuss more about what you see happening in the industry. Sri, let’s start with you.
Srivatsan Sridharan 37:03
Yes, for me, I didn’t even know what a data scientist or a data engineering role meant when I graduated. And back then there probably weren’t formal definitions of these roles anyway. So I came into the industry wanting to build software, and naturally I kind of gravitated towards where there were interesting technical problems. And it was just by accident that my manager was like, hey, you know, we have to build this data warehousing ETL solution. Are you interested in this project? I was like, Yeah, that sounds cool. And so it was really by accident, not something that I planned. I thought the technical problems were interesting, so I jumped into that. And then, you know, one thing led to another, and that’s kind of how it started.
Kostas Pardalis 37:47
Okay, so we have one person who started from being an engineer getting into data. Paige, your turn.
Paige Berry 37:54
Sure. My first jobs were customer service at coffee shops, so the way that I ended up getting into data is very interesting. I ended up getting a job at the college that I was attending, in the IT department, going and helping people with their problems with their computers, because I had a good customer service attitude. And then I learned how to work on computers. I, like, parlayed that into a job where I worked at the foundation for the college and ended up administering their database and working with their donor data. And that’s where I fell totally in love with data, databases, data work. So that’s really where it all started. And then I moved deeper into higher ed and data. And then I made the leap to New Relic in, I think, 2018.
Kostas Pardalis 38:44
Awesome. Sean, your turn.
Sean Halliburton 38:47
Yeah, a little bit of both of those. For me, I came in from outside the tech industry, period. I was an English major in college and an editor right out of college. And 20 years ago, I found myself unemployed, or funemployed, in front of a computer, and so started messing around with just general web development. And so also, I think, unlike a lot of engineers, I came in a little unconventionally, in that I came in from the front end. I specialized in capturing data, optimizing ad verticals, five pages, landing pages, and form flows. And as I became more responsible for those, I became more responsible for the data flowing through them. You know, I got more and more questions about why am I seeing this in the data. And curiosity really took over from there. I mean, you have to be curious enough and proactive enough to be able to get out in front of those questions. And the more I did that, the more I became dissatisfied with the tools at hand to process that data and be able to answer those questions. And I was lucky enough that at the same time, I was offered the chance to start hiring others to help me do that, and to build out, again, first the test and learn platform at Nordstrom and then the wider clickstream engineering platform, to be more robust and tailored to our business. And I became addicted to deprecating those big, expensive, third-party, off-the-shelf suites for cloud-based proprietary tools and solutions, frequently based on open source software. And that alone takes so much curiosity to even want to do, let alone to know what questions to ask to lead you down that path. And all along the way, that curiosity is what I have looked for when I am hiring others to help me do just that. And I think that’s how a lot of people get into data.
At the same time, you know, going back to the very beginning of our conversation today, how do data teams start off? It starts with one person who is curious and is dissatisfied with the tools that they have to answer those questions. And so they start kind of shaking the hive, as I like to say, and looking for different ways to get at those answers, and independently going to IT and saying, Hey, can you at least spin me up this so I can try deploying what I’m trying to do? Can you give me slightly elevated privileges, so I can create the relations that I’m trying to create and the loads that I’m trying to pull off? There’s this new thing called Airflow, I really want to try it. Things like that can often be hard to get traction on, either because, like I said, you’re in an older company with a certain way of doing things, and that’s not how we do it here, we already have tools for that, or you’re in a very young company and it’s a matter of prioritization and time. You might have 40 hours to dedicate to the work that has been handed to you. You take an extra five hours a week on the side to coach yourself up and try those new things, and then demo them to the people you work with and get traction that way, to show off what you really could do if you were given more. And I think that’s where a lot of data teams start, and it kind of spirals from there, from one person to two people, from one type of role to another type. You layer on complexity, like we said, that progression from analytics to data science, as your models progress and mature along with people developing.
Kostas Pardalis 43:04
That’s awesome. And I have to say that it was super, super inspiring to hear your story, although it made me feel a little weird, because I think my journey is a little bit more boring. It’s, like, just working with computers since I was 15 years old, and then, actually, I mean, just doing the same thing for the past 30 years.
Sean Halliburton 43:31
Hey, the grass is always greener, right? Yeah. The Commodore 64 that I had when I was 10. And the two weeks I spent with it, and then I put it down.
Kostas Pardalis 43:41
Yeah, yeah, actually, it was kind of funny. Just to make a comment here: in the past two days, there were two very interesting events or topics. I think yesterday we had the event from Nvidia where they announced the new graphics cards of the 40 series, and I think a day or two before, the latest version of a game called Monkey Island was released, which, for anyone who knows, was the game back in the 90s. That was, you know, much, much more primitive. And it’s very interesting to see two very contradictory things happening at the same time, and it really made me feel nostalgic. Anyway, Paige, your turn to tell us about the journey you see for a person who wants to have a career in data.
Paige Berry 44:39
Yes, for a person who wants to have a career in data, I definitely echo Sean about being curious. It’s a huge part of it. And I think what’s exciting these days is there are a lot of public datasets out there, and there are a lot of free tools that can be used to work with those datasets, to kind of teach yourself SQL, to look at data around subjects you’re interested in, asking questions, seeing if you can see what the data says. And then even, you know, take a crack at creating some charts and then do the whole storytelling. Like, I had this question, this is the public dataset I looked at, these are some of the charts that came out of that, this is what it’s telling me. And having some of those examples is really, really incredibly useful for being able to communicate to others, like, this is what I’m interested in, this is something I want to do more of. So those examples are really priceless when it comes to getting into data as a career, I believe.
Kostas Pardalis 45:50
Sri, what about you?
Srivatsan Sridharan 45:52
I would definitely plus-one the curiosity. I mean, it’s generally true for any profession, but I think with data it’s even more so, because you are kind of a truth seeker, right? I feel like the data space has become very specialized in the last several years, similar to how, you know, you had this change from being a single IT team to having a full stack engineer, mobile engineer, back end engineer, web engineer, and so on. And I think the same thing has happened to data. Like, 10 years ago, you were just a data person. But now you could be a data infra engineer, a data engineer, a data analyst, a data scientist, an ML engineer, a BI engineer, and so on. So I think for someone who is just entering the profession, it can be quite daunting to figure out which exact role you want to take, because there are just multitudes of them. But I think what could be helpful is to figure out which direction you want to approach this from. So you could approach this from, I like building software, and I want to learn more about data. Or you could approach this from, I like solving problems for the business, I like finding answers, and I want to learn more about the tech. And so depending on where you start, as you spend the years doing this profession, you will naturally figure out what excites you more and what excites you less. And I’ve also seen people changing professions a lot. I’ve seen data scientists become engineers, I’ve seen engineers become data scientists. And so there’s definitely a lot of mobility in the data space.
Eric Dodds 47:22
Love it. Well, we are actually closing in on our discussion time, we want to leave plenty of time for Q&A. But there’s one more question that I want to hit because I’d love for you to share some insights from your experience, especially with the listeners who are in some sort of managerial role on a data team. That could be really early, that could be on an established team, and it can be difficult to distill this down, but what are some of the top things you would say to someone— and maybe even we can specify to a new manager on a data team. What would you say to them as your top advice? And Paige, we’ll start with you.
Paige Berry 48:12
All right, awesome. This is kind of fun, because there is a conference called Coalesce, and my teammate Adam Stone and I did a talk that was recorded at last year’s Coalesce, December ’21, called To All the Data Managers I’ve Loved Before, and it is an entire talk.
It was great.
Thank you. That was so much fun to put together. And so I think, to distill that down, a lot of it is really being aware of the people side of data, the challenges that this career can bring for those who are on the data team and for the stakeholders who are needing data, and just really thinking about, how can I think about the people side of this and how people are feeling about these challenges? Because we can really dive into the numbers and the facts a lot and kind of forget that there are people behind all of this. And even with all the business, it’s people at the very bottom. It’s always people. So yeah, thinking about everyone involved as humans and what they really need can really help.
Eric Dodds 49:20
So helpful. Sri, how about you?
Srivatsan Sridharan 49:23
Yeah, definitely agree with that. I’ll try to offer an additive or a different perspective. I think it’s really important, if you’re a manager for a data team, to understand what your purpose is and what value you are driving for the organization. And it’s not just you or your team knowing your purpose, but the entire organization knowing your purpose. This is something that I’ve seen as a folly of a lot of data teams, and I’ve certainly made those mistakes in the past: when you are not aligned with, let’s say, the rest of the executive team or other leaders on what your role is and what you’re bringing to the company. And the corollary to that is, the problems that you’re solving today, how are they going to morph over the next one to two years? Because that then determines your staffing plan, your hiring strategy, the types of specialized skill sets that you want on your team, and what the career growth opportunities for members of your team look like. And so it kind of boils down to those two foundational things, like, what is your purpose? Is everybody aligned on that purpose? And how do you see the purpose morphing over the next, you know, one to two to three years?
Eric Dodds 50:26
Yeah, I love that. And I mean, that’s really good advice for any sort of leader: evangelizing in general, for any team. So love that advice. All right, Sean, wrap up this question, then we can move to Q&A.
Sean Halliburton 50:42
Yeah, I think Paige and Sri already covered it beautifully. I don’t know that there’s a whole lot more to add. All I can think of is to echo the importance of relationships, and to make sure that you’re able to get your people into the room as often as you can. I would actually suggest focusing less on the outputs and more on the inputs. It’s critical to help coach the organization on thinking about data from the beginning. It’s still too often an afterthought. You have to ask that question from the beginning, so that you can form a measurement plan, so that you can know what the questions are as the product is being developed. As the product is going out the door is too late. You’re going to lose so much value that way, and the product owner will just revert back to that gut intuition, which is great up to a point. You know, we’re not trying to replace anyone’s job with the data and the data tools that we’re trying to provide. We’re trying to, again, empower them. There’s that word again. And before we can even do that, we have to make sure that we’re a part of the conversation, that we’re there at the beginning of product inception to ask, you know, or to say, love the idea, how are we going to measure it to know when we’ve achieved success? The same way you would ask of your own employees: What is your measurement plan for yourself? What is the measurement plan for our team? What are our OKRs, so that we can know whether we’ve succeeded or missed our targets?
Eric Dodds 52:31
Yeah, so helpful. All right, well, let’s move over to Q&A with the last bit of our time. And this is a great question that’s come up on scale, especially when you think about, you know, sort of a hypergrowth context: when a data team is scaling really fast, what are the main pitfalls? And I’ll add an augmented question by saying, let’s think about maybe that new manager of a data team as well, who may be experiencing this for the first time. Because, you know, in a completely controlled environment, you can kind of plan carefully every step of the way, right? When you’re moving really fast, things are harder. So, Sri, do you want to kick us off with that one?
Srivatsan Sridharan 53:22
Yeah, I’ve certainly experienced that both at Yelp and Robinhood and, you know, made mistakes along the way, so I can kind of share those learnings here. I think when teams expand very rapidly, as leaders or managers we have a tendency to put order and put structure. But I think building structure when your team is changing every three months is incredibly hard. And so, it sounds counterintuitive, but when you’re rapidly scaling, it’s often beneficial to rely on people and delegate as much as you can and not worry about the structure and the process. Because whatever structure and process you come up with, two months later it’s going to be a waste, because your team’s doubled in size, right? So I think in those situations, it’s really important to make sure you’ve gotten the right hires, because hiring is absolutely critical. But assuming you’ve gotten the right hires for the right roles, making sure that they are empowered to run with the problem the way they want to run with it, even if there might be some chaos, might end up being much, much more beneficial than trying to create a lot of structure along the way.
Eric Dodds 54:27
Love it. There’s someone listening who’s saying, yes, throw process out the window! I know that’s not what you’re saying, that was really unfair. But no, the ability to hand stuff off, I agree, is absolutely huge. And I think that goes back to what all of you said before, which is that understanding the purpose and understanding what success looks like are key foundations to be able to do that. Sean?
Sean Halliburton 54:53
Yeah, again, to kind of borrow from what the others have so rightfully called out already, two related things. One is context switching. So as your team scales up and tries to do more things fast, make sure that you put some kind of restrictor plate on how many of those things are in play at any moment, including meetings. And, okay, you’ve hired curious people, that’s awesome. We curious people also tend to burn ourselves out. And it’s super easy to do to ourselves, let alone to be driven to burnout. I have seen terrible burnout. I’ve been there myself. And, you know, Sri, you mentioned recruiting. No manager really wants to be in the recruiting space, because it’s super exhausting, distracting, and expensive. It is so much cheaper to keep the people you have, to keep good people in-house, and to pump the brakes once in a while. When you need to, be prepared to push back against leadership and set more realistic targets. But above all, you’ve got to keep the good people you have.
Eric Dodds 56:06
So helpful. Alright, Paige, I’m going to put a little spin on this as I add a question for you. Generally, when a team is growing really fast, that’s in response to organizational demand for data, right? And usually the mathematical relationship is that the demand outstrips supply, which can lead to burnout in its own way. Can you talk about mitigating that on a fast-scaling team? You know, because you have the team dynamics, but then the pressure that’s being put on them is really, with good intention, coming from all around the organization.
Paige Berry 56:42
Yeah, that is definitely a really interesting challenge. And I’ve been there a couple of times, where we were really trying to figure out how to not get caught in what we call the service trap. It’s also something I picked up from Emily: putting into place things that can protect the data team from that, which can include the ability to be kind of ruthless when it comes to prioritization. So that might mean that for a little while, the data team kind of has to be their own product or project manager. I’ve done an exercise that was really helpful on a data team where we actually said, Okay, we’ve got, like, 100 requests. Everyone, pick two that are the top ones, where you really think, these really need to get done, there’s so much value here. We’re gonna put those on our roadmap. We’re gonna actually decide, whatever happens, we work on these. And then as new requests come in, we can say, okay, stakeholders, you and I have said these are the top two, this is what we’re working on this quarter. Is this really more important than that? And that also helps get the stakeholders involved in helping with that prioritization. So that’s really key, that kind of communication. And, yeah, deciding what’s going to be priority helps a little bit with slowing down the constant ad hoc requests. So I’ve done processes like that before that have helped with that feeling. Yeah.
And then also making sure that the data team can still protect some time to do proactive insights. That also helps a lot with keeping it from feeling like we’re just a bunch of, you know, folks hitting the top of the keyboard a ton and sending out reports. The ability to really take time and explore the data and realize that we have some ideas of our own that are actually really helpful and valuable to deliver back to the business can really change the relationship that a data team can have with a stakeholder.
Eric Dodds 58:46
Yeah, I think the proactive side of that is such a durable way to build trust, right? And that sort of partnership. Well, we are right at time. This has been so wonderful, everyone. Thank you so, so much for giving us some time. And yeah, we’ll have you on the show again sometime soon. Thanks.
Paige Berry 59:08
Yay, thank you!
Srivatsan Sridharan 59:08
Thanks for having us on the show.
Paige Berry 59:10
Yeah, this was wonderful.
Sean Halliburton 59:13
See you soon.
Eric Dodds 59:14
Man, Kostas, there’s so much to discuss. I learned so much. I think it’s funny, you know, we said data teams, but really, it’s very clear, I mean, one of my huge takeaways is that all of these people are very experienced, considerate, and wise managers of teams in general. And I think, as expected, a lot of the wisdom that they shared with us you could apply to almost any team structure. Two related things that I really took away: Sri mentioned evangelizing the mission of the data team, and Paige brought up the concept of not getting caught in a service trap, which is a concept that, you know, someone she worked with at Netlify came up with. Those two ideas in combination, I think, are really important, because data teams can often be positioned as order takers: I need this data, or I need this insight, or I need this report, or this looks broken, or whatever, right? And if you evangelize the mission, and you sort of cast a vision that’s bigger than just fulfilling requests, but actually pushing value back into the organization, you know, as we’ve seen from people we’ve talked to on the show in the past, that really creates a special dynamic among the team. And so to me, that was just really, really helpful advice in general, but also specifically for data teams.
Kostas Pardalis 1:00:48
That makes sense. What I want to keep from this panel, to be honest, is the background stories of the people. I think that’s the most important thing for me, and that’s what clicked and really inspired me. And that’s together with something that was mentioned, that in data teams there’s a lot of mobility. You can see people who are data engineers, then decide to go work as data scientists, and then go back. And I think that’s part of the beauty of working as part of a data team. If you’re a curious person, a person that really likes to learn and do new things, I think being in a data team is an amazing place to be. So that’s what I’m going to keep. And I think we should have more of these discussions about the people, not just the data technology. So I’m looking forward to talking more about these in the future.
Eric Dodds 1:01:56
I agree. All right, well, subscribe if you haven’t. And thank you for joining another live stream, and we’ll catch you in the next one.
We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.