This week on The Data Stack Show, Eric and Kostas host a panel to discuss the technical challenges and details of Reverse ETL. Our panel features: Tejas Manohar of Hightouch, Boris Jabes of Census, and Tridivesh Sarangi of Workato.
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 0:05
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, one platform for all your customer data pipelines. Learn more at rudderstack.com. And don’t forget, we’re hiring for all sorts of roles. Welcome to The Data Stack Show. The episode you’re about to hear was originally recorded as a live stream. We collected some of the top minds working in the reverse ETL space, and we wanted to pick their brains about this technology that has probably been built internally for a long time by companies, but is now being turned into SaaS and is doing some interesting things. Kostas, I’m really interested to ask this panel about some of the technical challenges of building these things at scale. A lot of times, I think, if these were internal builds, maybe a one-to-one connection, you’re dealing with a pretty simple pipeline. But doing this at scale across integrations is hard. So I want to hear about what technical challenges they’re dealing with. How about you?
Kostas Pardalis 1:18
Yeah, I want to ask them when we are finally going to get a proper name for this technology. The “reverse ETL” name seems to have stuck, yes, but it still sounds wrong. I’ll try to see what they’re thinking about that, and what the timeline is to get a better name.
Eric Dodds 1:36
Great. That’s a marketing problem, so we’re going into pretty uncharted territory, but I love it. All right, let’s dig in. Let’s do it. Welcome to the second Data Stack Show livestream. This is super fun. We did this once before. We like to collect some of the best minds in the industry around certain topics and just pick everyone’s brains. The topic for this livestream is reverse ETL, which is kind of a new term in the industry, but actually something that people have been doing for a while, which we’ll talk about. And we have some people who I’m just so excited to have on the show, names that I’ve followed for a long time. Personally, I know Kostas has as well. So let’s do some quick intros. Tejas, do you want to start off and give a quick intro?
Tejas Manohar 2:27
Yeah, sure. Hey, everyone, I’m Tejas, one of the founders of Hightouch. We’re one of the players in the reverse ETL and data activation space, basically helping companies take data from the data warehouse and use it across all the operational processes and SaaS products in their business. Before founding Hightouch, I was actually an early engineer at Segment. So my experience in the data vendor space dates back to like seven years ago, before terms like CDP existed. I saw the rise of cloud data warehouses there and realized that there was an opportunity to bridge some of the challenges we were solving at Segment and what was happening in the data warehousing space, with companies building a source of truth in the warehouse. So I’m super excited to be on the show today. Obviously, I follow all the companies in here super closely, and I’m excited to have a live coffee chat.
Eric Dodds 3:11
Great, it’s gonna be great. All right, Boris, you’re next in the window on Zoom. So take it away.
Boris Jabes 3:16
Well, hey, I’m Boris. I’m the founder of a company called Census. We started building what we now call reverse ETL back in 2018, when there was no name for this. We’ve always wanted to help companies get the most out of their data, and a lot of it tends to be locked away in analytics and warehouses, which is what we are trying to solve. So get that in the hands of salespeople, marketing people, support people, finance people, all those kinds of things. And data pipelines are the way to do that. Before that, I’ve always been a tool builder. I used to work at Microsoft. And between Census and Microsoft, I started another company that was kind of tangentially related, called Meldium.
Eric Dodds 3:58
Very cool. All right, Tridivesh.
Tridivesh Sarangi 4:00
Hi, everyone. I’m Tridivesh, and I lead the product team at Workato. For those not familiar with Workato, it’s an enterprise automation platform. We’ve been in business for over eight years, and we have over 7,000 customers who use it to automate various business processes by connecting their cloud and on-prem stack. One very interesting pattern happens to be reverse ETL. As a matter of fact, we released the Work Automation Index report last year, and in addition to all the traditional processes like order-to-cash, employee onboarding, procure-to-pay, lead management, and others, reverse ETL was trending; it was in the top 10. Reverse ETL means very many things to very many people. I’m very excited to join the show and learn more from Tejas and Boris, and also from the questions that we get from our audience. Thanks for having me, Eric.
Boris Jabes 5:09
Yeah, of course.
Eric Dodds 5:10
Tejas Manohar 6:20
Oh, yeah. So I would say, across use cases, honestly, it’s pretty exciting, because we’re seeing use cases across pretty much all business teams in an organization, and far more use cases in terms of breadth than we imagined when we actually founded the company. When we started Hightouch, we had a perspective that was heavily influenced by our work at Segment, and we thought marketing would be something like 90% of the use cases for the product. It turns out sales, marketing, and go-to-market is still probably about 70% of our use cases in the market. But we’re also serving finance teams who need rich data in their ERP systems to close the books faster and not pass around CSVs across the organization, or product teams that need information from the analytics stack to power certain personalized customer experiences inside of their applications. But overall, I would say the most exciting part about reverse ETL and data activation as a whole, when I think about the category, is that we’re oftentimes not just replacing scripts written by engineers or automation built by engineers. We’re actually unlocking brand new business use cases, brand new value, and brand new growth and revenue opportunities for companies using the wealth of data that they already have in their data warehouse. That’s really what I think has caught the attention of the market and excited companies to jump right in and see what they can do with the resources and data they already have to drive growth. Great. Boris?
Boris Jabes 7:41
Yeah, I think the breadth of scenarios has always been the most exciting thing here. When we envisioned the platform, we thought about it as something very horizontal. I tend to think that the way people wire data together shouldn’t be piecemeal. They should think about where they can centralize as much data as possible and get a source of truth, and then federate that to as many ends of the organization as possible. To me, that’s actually the goal of SaaS going back 20 years, which is to empower every individual in a company. So whether that’s finance or sales, you want the right data, you want data you can trust, and you want that in the operational tool where you do your work, rather than having to open up five tabs. What I’ve seen over the last few years working on this is that analytics, by virtue of a lot of other trends and behaviors on the data team, has become host to the best data in the company. Right? The most complete data, the most trustworthy data. It’s the data that, ultimately, you’re going to use to report to Wall Street to some degree, right? And so it has the highest level of scrutiny, probably. Exactly, exactly. So the ability to operationalize that data, to take that data and make it available to every part of the company, has been super exciting and continues to grow. Funnily enough, we didn’t start with some kind of marketing bent back in 2018. We actually started with product-led growth, and just thinking about how, when you have software as your core asset, the way you take it to market is just different. It was a personal frustration back then about salespeople not knowing what users are doing in the product.
And funnily enough, I think Segment had done a great job of connecting marketers to the engineering side, but sales was left behind. So our early scenarios were all on the sales side. That has since expanded to, literally, I don’t know if I could summarize it in any kind of set, right? It’s from support to finance to product to marketing. Everyone wants to depend on this data, and data organizations want to get more out of the asset that they’ve invested in. That, to me, is the exciting story. I’m a tool builder, right? So you’re trying to make someone else a more amazing version of themselves. Data teams have a lot to offer, and it was locked away in charts. The idea was, let’s get this into operational tools. Very cool. All right, Tridivesh.
Tridivesh Sarangi 10:31
I mean, again, what Tejas and Boris summarized captures a lot. I’ll just add to that in a different way. It’s essentially the trends we see. Over the last several decades, ETL has been a way to collect data from various sources, like business applications and business systems, move it into a single repository, and create a source of truth that you can rely on, right? And if you ask any company, any individual, how do you make decisions, they’ll say, “We are data driven.” The way to be data driven has always been oriented, and to a large degree still is, around putting a tool on top of this phenomenal repository of data and running visualizations, dashboards, and reports, and that makes you data driven. Nothing could be further from the truth. Just looking at the data, everyone has their own interpretation. What’s changing now is that people want access to that data in the tools they are familiar with, whether that’s Salesforce or Marketo, or, for product analytics, Amplitude or Mixpanel. So one is that people want access to insights where they work, rather than having to relearn or go to another tool to download those reports. The second trend that we’ve seen is access to information more in real time, rather than only through periodic reports. As things happen, when a customer’s churn score changes, people want to take action. The CSM wants to reach out to them and say, hey, what’s happening? If there’s a drop in activity, the GTM teams want to reach out. If there’s a change in upsell or cross-sell signals, you want to trigger campaigns. So those kinds of automations are becoming more real time, more event driven.
And that’s driving some of these patterns around how you become truly data driven, rather than just looking at visualizations. Right? And if you apply those trends, which business function would not want to act on these things in real time? It’s not just for the GTM teams, it’s also finance. The third one, if not the most important then another important one, is the elevation of the data warehouse or data lake to the same level as any other business application. It’s no longer a black box. It’s being elevated to a business application that is as important as, or arguably more important than, the CRM. So that is the other trend. How do you make that really accessible in real time, across all business functions, to make them truly data driven, rather than relying on what has been produced in business intelligence? That’s what’s driving these trends, from what we see with our customers.
Eric Dodds 13:51
Yeah, that’s so interesting. I mean, data driven is such a loaded term, right? It’s almost become hollow, because it’s used so much in marketing terminology, and for, I don’t know, decades now, right? For decades.
Boris Jabes 14:05
It probably goes back to the information superhighway, right?
Eric Dodds 14:08
I mean, maybe this isn’t okay, but I’m going to take a little bit of a dig at the big consultancies, right? Because it’s like digital transformation. The billions of dollars that people have made just trying to connect some pipes to help companies become more data driven. Yeah, but digital transformation is such a catch-all, right? It is.
Tejas Manohar 14:27
And data driven is just as much.
Boris Jabes 14:30
That’s true. I suppose you’re right. But digital transformation is even bigger. That one feels even more all-encompassing, because it just means computers, right? I mean, data could be sheets of paper. So I guess, yeah, it could be even bigger, I suppose.
Eric Dodds 14:48
I agree. I mean, I think at least one big underpinning of digital transformation is the move from on-prem to cloud, which is certainly nontrivial, especially in the enterprise. But actually, on that note, what I’d love to do here is get a little bit technical. I remember when there were companies sending data out of Redshift into SaaS applications a good while ago. The idea of getting data out of a warehouse and into some sort of SaaS application isn’t new, right? I think we would all agree that data integration has been a thing people have been doing for decades. Sure. Yeah. So it’s not like, with reverse ETL, someone invented a completely novel way of sending data from point A to point B. It’s been happening, but it’s painful, and so we’re building SaaS around that, which is super exciting. But there are still a lot of companies struggling with the pain of trying to get data out of the warehouse and into SaaS applications. I think a lot of our listeners are either data engineers or on data teams who have experienced the pain of trying to build this themselves, didn’t have the budget or the bandwidth to build it themselves, or grew up in an age where the SaaS just wasn’t available to make that easy for them. So that’s just painful, we’re going to deal with it, and downstream teams are going to be annoyed. If anyone has built anything like that, I think it would generally be ad hoc inside of a company. So you have a bespoke pipeline, probably one-to-one or one-to-a-few.
But you’re building really robust pipelines that take tables or data in the warehouse and fan them out to a huge number of tools. And you’re doing this at scale, in a cloud SaaS format, right? So I’m genuinely curious: what are the problems that you’re facing trying to do that? Especially for anyone who has done this ad hoc or bespoke inside a company, help them understand what it takes to do this at scale.
Boris Jabes 17:08
I mean, there’s a bunch of things. Definitely, there’s a bunch of things you have to factor in if you’re going to do this yourself, and I think we’re all probably going to talk about some similar things here. The first thing you’ve got to deal with is errors. Things fail way more than you might predict. The great fallacy of APIs, going back 20 or 30 years, is, “Oh, they just work!” Nope, they don’t just work. So things fail, and building in recovery is significantly more difficult, I would say, than simply writing the code to sync data. So that’s one. Two is scale, so size. Dealing with 10 rows is totally different than 1,000, totally different than a million, different than a billion, right? And people need to sync large amounts of data. Our users, our companies, have on the order of 500-plus million users. So you have to be able to do this at scale, and with destinations that don’t handle scale particularly well. Some are bad, and some are really, really good. Do you know which product is unbelievably good at scale? Facebook. Facebook will happily eat hundreds of millions of records in a snap. But Marketo is at the other end of the spectrum. I like to joke about Facebook, because you don’t think about it, but the reason it’s so fast is that they already have all the data. So they’re just going, check, yep, we know who you’re talking about. Anyway, another podcast, another day. So scale is how you stitch data incrementally into a system, how you do it in the right order while minimizing API usage, all these kinds of things. That’s probably the second thing you have to think about if you’re going to do this yourself. Third is probably monitoring all this. Things break; your stuff will break.
Now, I think things have improved a lot in our market broadly; you can use your orchestration tools, which do some of the alerting for you. But you have to be monitorable, and that’s really not a small amount of work. It’s the same reason engineers don’t tend to build New Relic or Datadog themselves. Right? Yep. That in itself is expensive, and so that’s a huge part of our software as well. You want these things to be alertable and monitorable. And the last one I’d say is, I don’t know if you all have seen something different, but most internal versions of this are not manageable by anyone other than the person who wrote them. Whereas the whole point, and Tridivesh, you talked about this, is the democratization of data and analytics: people want to be able to access these things. If you’re going to build this yourself, are you going to build the UI to make it easily mappable, so that people can modify these things without having to call you? Those are probably all the things you’d have to build to do this well yourself.
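The first two concerns Boris lists, failure recovery and API-efficient batching, can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not how Census or any other vendor implements it: `send` is a hypothetical callable standing in for a destination API client, and the batch size, retry count, and delays are arbitrary.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield rows in lists of at most `size`, so each API call carries many rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def send_with_retry(
    send: Callable[[List[Row]], None],  # hypothetical destination client call
    batch: List[Row],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(batch), retrying with exponential backoff; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            send(batch)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: let the caller record the failure
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


def sync(rows: Iterable[Row], send: Callable[[List[Row]], None],
         batch_size: int = 100, **retry_options: Any) -> Dict[str, Any]:
    """Send all rows in batches; collect rows whose batch never succeeded."""
    sent = 0
    failed: List[Row] = []
    for batch in batches(rows, batch_size):
        try:
            send_with_retry(send, batch, **retry_options)
            sent += len(batch)
        except Exception:
            failed.extend(batch)  # keep going; surface failures for monitoring
    return {"sent": sent, "failed": failed}
```

Returning the failed rows instead of aborting the whole run is one possible design choice here; it gives a monitoring layer something concrete to alert on, which is the third concern raised above.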
Eric Dodds 20:01
Yep. Love it. Okay, I’m going to slightly modify the question as I pass it on to Tejas, and then Tridivesh. So here’s the slight modification. How do you decide what you manage and what you hand off to the user? Where are the compromises there? If you think about incremental syncs, are there decisions that you need to make on behalf of the user? Or are there use cases where you make that decision for them? Those are actually fairly challenging when you think about data at scale. So yeah, I would just love to hear your perspective on that.
Tejas Manohar 20:37
Yeah, it’s a great question. One thing that I think has been really powerful about the tools in the reverse ETL ecosystem is giving users a lot of flexibility, but also a lot of guardrails at the same time. One thing that we handle out of the box, which Boris also touched on, is diffing. Typically, when companies build a script like this in house, they’ll just build a loop over the data in the warehouse and call an API to update or insert it into the destination, and it remains pretty basic. Then a challenge comes up: the destination API can only accept data at a certain rate, and you need to send only updated data, but you don’t have a clear updated timestamp in your data warehouse or something like that. So one thing that we’ve handled out of the box is diffing for our customers, where Hightouch can automatically send only changes to all these downstream destinations, instead of sending all the data every time. With diffing, there’s a ton of nuance, so we support multiple mechanisms for it. One we support is diffing inside of the customer’s warehouse, where data that’s being synced over is actually written back to the warehouse and joined against in the process of syncing to downstream destinations. Oh, interesting. Okay. But that’s not the best approach for all data warehouses. For certain databases that our customers connect, writing back isn’t as favorable as with a cloud data warehouse like Google BigQuery or Snowflake, where storage is separate from compute. If you’re thinking about something like Redshift, this may not be the most favorable approach, and with more of a transactional or production database, like Elasticsearch or even a production Postgres, that might not be something a customer is okay with.
We also support other mechanisms for powering diffing, like writing the data back to a customer’s S3 bucket, for example. And depending on the data warehouse, we support even more options, for example leveraging timestamp partition keys in something like Redshift or Google BigQuery to automatically do more intelligent, faster diffing for things like event forwarding use cases. So one thing I would say is that, in building reverse ETL platforms, we have a lot of features built out of the box so companies don’t have to implement this stuff, but we still allow them to see more, dial in, and control how it works if they need to for their use case. Sane defaults with a lot of customizability is the general approach that we’ve been taking to building our software, and one that companies have really appreciated versus, say, other players in the market like CDPs and whatnot.
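The diffing idea Tejas describes, sending only what changed since the last sync when the warehouse has no reliable updated-at column, can be sketched as a comparison between the current query result and a stored snapshot of row fingerprints. The following is a generic Python illustration, not Hightouch’s implementation: the `id` primary key, the in-memory snapshot, and the function names are all assumptions made for the example. In practice the snapshot would live somewhere durable, such as the warehouse or an object store, as discussed above.

```python
import hashlib
import json
from typing import Any, Dict, Hashable, List

Row = Dict[str, Any]
Snapshot = Dict[Hashable, str]  # primary key -> fingerprint of the last synced row


def row_fingerprint(row: Row) -> str:
    """Stable hash of a row's contents, independent of column order."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_rows(previous: Snapshot, current_rows: List[Row], key: str = "id") -> Dict[str, Any]:
    """Compare the current query result against the previous snapshot.

    Returns the rows to insert, the rows to update, the keys to delete,
    and the new snapshot to store for the next run.
    """
    by_key = {row[key]: row for row in current_rows}
    current = {k: row_fingerprint(row) for k, row in by_key.items()}

    added = [by_key[k] for k in current if k not in previous]
    changed = [by_key[k] for k in current if k in previous and previous[k] != current[k]]
    removed = [k for k in previous if k not in current]

    return {"added": added, "changed": changed, "removed": removed, "snapshot": current}
```

On the first run `previous` is empty, so every row is reported as added; later runs pass in the `snapshot` returned by the prior run, and unchanged rows generate no API calls at all.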
Eric Dodds 23:01
Yeah, super interesting. Okay, Tridivesh, you’re gonna have the last word here. But you can only answer this in two sentences, just because I have a quick diversion here. I said it was gonna get technical, but of course, I’m trying to get more philosophical
Tridivesh Sarangi 23:13
Eric Dodds 23:16
though. But here’s the question: does the user care? The reason I ask is that, and this is gonna be a big rabbit hole, those are really complicated things that you’re discussing, right? Like diffing. “Warehouse” is a general term, a general tool, but when we talk about diffing, those are very specific product problems. And genuinely, I’m interested: do your users care about that, the tuning question? You can get software running, but tuning it is a different skill set.
Tridivesh Sarangi 23:55
We talked about the various use cases. With traditional ETL, the team was always centralized, right? So there’s this discussion around jurisdiction: who owns these, if I can call them that, reverse ETL pipelines? If it’s the GTM team that owns them, do they really care about the control and the extensibility and such? Probably not as much. But there are other teams, like maybe the product or data engineering teams, that need a lot more control and flexibility. So how much they care depends on who owns the reverse ETL pipeline; the needs will be different. The personas building these are different. In some places, the out-of-the-box syncing works just fine. And it’s not just reverse ETL; take the example of the Salesforce–Marketo integration, where the out-of-the-box integration works just fine. But then there are cases where you need to aggregate, do some lookups on a third-party application, or, depending on the nature of the transformation, you need a lot more control, right? Those are places where engineers also need to create reusable components that you can apply and standardize across multiple pipelines. In those cases, they will care more about control and flexibility. But I just wanted to add one thing. I think Boris touched upon a few things that are very important when you ask this question: what should the product do, and what should the user do? And don’t forget the era we live in. At least in our philosophy, the product has to do more, so the users can get more done. That’s not just a soundbite. What does that mean?
One is, looking at reverse ETL or any form of data movement, the number of systems the product can connect to, the native connectivity, on both sides. On the source side, as Tejas just brought up, how Snowflake works is very different from BigQuery, for instance in their streaming capabilities, so the product has to handle those particular things. On the destination side, Salesforce offers bulk APIs to ingest very large volumes at a much faster rate; Marketo, not as much; NetSuite, not as much. So the product has to do more, the buffering, the queuing, and such, so the user doesn’t have to worry about those things. That’s one very important part: the breadth of connectors on both sides, the source systems and databases, and the destinations. The other point that Boris brought up: with any pipeline, bad things happen, errors happen. If the product doesn’t provide the ability to self-recover, and the monitoring tools to troubleshoot, or even not troubleshoot but auto-correct in some ways, it puts a lot of burden on the developer, and then it requires a specialist to come in. So those are the things the product needs to do more of. What the user should focus on is the business logic: what is the outcome that we want to drive? I need to move this set of records for an upsell campaign. I need to look at this data in this table, monitor for upsell signals and whatnot, then take that list and move it into a marketing campaign. They should focus just on the business logic and how quickly they can configure it. Business processes change dynamically, every week, every month, so they need the ability to iterate.
So it should not be brittle. The ability to iterate and be agile is also something the product has to support. I’ll put it this way: all the products that are represented here, and the ones that are coming up, have to have parity in terms of experience with what end users are already using. What I mean by that is that it’s more configuration driven and click driven than code driven. With Salesforce or Marketo, you can do most things with clicks rather than having to write any code. But the product should also provide extensibility where you need it. Every once in a while you need some Python scripting, or there are pre-existing scripts that you may want to use, and you’re able to pull that in, so it doesn’t put you in a box. Those are things that drive adoption for solutions like these.
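The buffering and queuing that Tridivesh says the product should absorb on behalf of the user often comes down to pacing calls so a destination’s rate limit is never exceeded. Below is a minimal sliding-window limiter in Python, offered as a sketch under assumptions rather than a description of any vendor’s product: the limit values, the `send` callable, and the `send_all` helper are hypothetical, and real destinations publish their own limits.

```python
import time
from collections import deque
from typing import Any, Callable, Deque, Iterable, List


class RateLimiter:
    """Allow at most `max_calls` calls within any sliding window of `period` seconds."""

    def __init__(
        self,
        max_calls: int,
        period: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls: Deque[float] = deque()  # timestamps of recent calls

    def acquire(self) -> None:
        """Block until one more call fits inside the window, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have aged out of the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))


def send_all(batches: Iterable[List[Any]], send: Callable[[List[Any]], None],
             limiter: RateLimiter) -> None:
    """Send each batch through `send`, pacing calls with the limiter."""
    for batch in batches:
        limiter.acquire()
        send(batch)
```

The clock and sleep functions are injectable so the pacing logic can be exercised without real waiting; a per-destination configuration would then choose different `max_calls` and `period` values for a fast bulk API than for a slower one.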
Kostas Pardalis 28:42
Guys, I have a question that is related to something that was mentioned a little bit earlier: that none of this is extremely new, right? It’s not the first time that the market has had to move data from point A to point B, or even push data back to downstream applications. But I want to ask all three of you about two specific products, and I will start with Tejas, because he’s coming from Segment. The two products that I want to ask you about are Looker Actions and Segment Personas. The reason I’m focusing on these two is that they were not created that far back in the past compared to when you started, right? So why don’t we hear about them? In a way, why didn’t they succeed in creating the category, or leading the category, let’s say? Tejas first, then I’ll ask the rest.
Tejas Manohar 29:48
goalie I can kick it off. So this is super tough question because I was actually one of the first engineers working on second percent of this with my co founder and CTO at at high touch step. First, I’ll take a look or actions on See Looker actions had a pretty brilliant idea, which was, you know, one of the first, I think one of the first offerings to market that they started evangelizing this idea of reverse ETL and which was, you know, we’re analyzing stuff in looker. And there should be a way to take these insights and put it into like the other tools that the rest of the business teams look at. And not just have the business teams have to look at a looker dashboard or Looker report every single time should be able to use that information more ly, that is really the concept between reverse ETL behind reverse ETL. Today, I would say there’s a couple reasons it didn’t really pan out. I mean, one honestly, I would say it’s just like resource allocation. Like if you take a look at the looker action destinations, you just have like a lot of limitations. Like I think that braze destination, for example, can only handle like results of 100 rows, they don’t really do diffing in their infrastructure, they don’t really have much visibility or observability. That kind of sync mapping interface that customers expect for like a more modern reverse ETL platform is just like not there in Looker action. So I think really the reason it didn’t take off is because activation data activation is just a separate technical problem in a separate technical space than data analytics. And I don’t think that like the team working on Looker actions really treated as such and invested in invested in building locker actions to the same product perfection in degree and thoughtfulness that kind of best of breed solutions have come out to the market with like hydrogen census, for example. So that’s the reason I think we’ll crash is gonna pan out. 
There’s also another part of it, which is that tons of people don’t use Looker and want to tap into the data in their data warehouse. But actually, tons of our customers do use Looker, and the real reason it didn’t get very far was just product quality and product design at the end of the day. When I think about Segment Personas, it’s actually different. Segment Personas, for anyone who doesn’t know, basically says: okay, you’re tracking all this event data into Segment, and it’s being forwarded to all these different downstream tools. But we want to give marketing teams, growth teams, and teams like that a central place inside the Segment product where all the user data is aggregated into profiles that you can then build upon in a WYSIWYG way: add some computed traits, like number of orders in the last month or LTV, and then also build audiences on top of these profiles and sync them out to different tools. So really, if you think about it, Segment Personas was almost building its own source of truth off Segment data within the Segment product. And I think what the market has really realized is that the source of truth is not going to be in any sort of proprietary vendor or any sort of SaaS application, or follow any sort of spec of what a user should look like, or what an event should look like, or what a shopping cart should look like. It’s going to be in the data warehouse, where there’s a standard that all software is integrating on top of. Companies are able to get all their data into it via numerous different ETL vendors, transform it freely using software like dbt, for example, in the ELT stack, and then, once they know what a customer 360 view looks like in the data warehouse, sync it out to all the different downstream destinations.
Honestly, I’d say the primary reason Segment Personas didn’t pan out is just that it was built on the wrong source of truth, right? It was built directly on top of Segment as a source of truth, and the warehouse was kind of an afterthought. Whereas what I think has really become clear in the last five to seven years is that companies want to use the data warehouse as the source of truth. That’s where all the data will be, and that’s where the best data will be, as Boris mentioned earlier. And that’s really the trend that reverse ETL and data activation are riding on.
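The "diffing" mentioned above can be made concrete with a small sketch. The idea is that a sync keeps the rows it sent last time, compares them with the latest warehouse query result, and sends only what was added, changed, or removed instead of re-sending everything. This is only an illustration under stated assumptions, not how Hightouch, Census, or Looker Actions implement it: the function name `diff_rows`, the `id` primary key, and the in-memory lists standing in for warehouse query results are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_rows(
    previous: List[Row], current: List[Row], key: str = "id"
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Compare the last synced snapshot with the latest query result.

    Returns (added, changed, removed). `key` names the assumed primary-key
    column used to match rows between the two snapshots.
    """
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    # Rows that exist now but were not in the previous snapshot.
    added = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    # Rows present in both snapshots whose contents differ.
    changed = [
        row
        for k, row in curr_by_key.items()
        if k in prev_by_key and prev_by_key[k] != row
    ]
    # Rows that were synced before but no longer appear in the query result.
    removed = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    return added, changed, removed
```

In a sketch like this, only the three returned lists would be turned into calls to the destination API, which is what keeps a sync from being capped by the size of the full result set.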
Kostas Pardalis 33:28
Mm hmm. That’s very interesting. And actually, based on your experience at Segment, because this is something I’ve been thinking about from time to time: do you think that the way Personas was implemented, with Segment itself as the single source of truth, was also a result of, let’s say, timing, like when Segment actually started as a company?
Tejas Manohar 33:49
Yeah, I say this time and time again. I think the approach that CDP solutions like Segment took, back when I worked there, or seven, eight years ago when these solutions started to be designed, was not wrong for the time. If you looked at data warehouse usage at the time, I mean, companies like Snowflake had less than 100 customers when I joined Segment, honestly. And unless you were in the enterprise, you weren’t really heavily using the data warehouse. BI culture and solutions like Looker were just popping up. If you went to a company and said, "Hey, we’re building reverse ETL, we’re going to allow you to take data from your data warehouse and feed it out into the SaaS tools, to solve problem A you have in marketing or problem B you have in sales," technically, that works. The software would work just as well then as it does today, in a lot of senses, as a technical solution. But when you think of the product-market fit, companies just didn’t have the data in the warehouse in the first place. They weren’t building the kind of models of what it means to be a customer. How much are they paying us? Are they high value or low value? All the prerequisite steps weren’t done yet. So it just didn’t make sense for that to be the way companies solved data activation problems five, six, seven years ago. So I don’t think the way CDPs approached it was incorrect at all. I think it’s just a different approach for a different time. And now that companies have made this massive investment in data warehousing and the modern data stack, everyone’s asking: how can I drive more value from it? How can I use all the data I have, and all the data models I’ve built, to drive growth? Reverse ETL and data activation is really the answer to that, the one that makes sense for businesses at this time.
Boris Jabes 35:23
Mm. I could not agree more with that. Most of these things end up being good decisions in their context, right? Even, I would say, since you talked about how nothing is new under the sun, right? Long before any of those products, people were integrating data, and it made sense to do it from A to B without a warehouse. It would have been an incorrect decision to design with a warehouse bias, right? We did something in 2018 that was almost weird for its time, which is that we put all of our product’s capabilities inside the warehouse, right? Which was unheard of for a SaaS product at the time. So you can cut the cord with Census and all our data is actually sitting in your warehouse, because I felt like there’s a secular trend towards owning your data, which Tejas kind of mentioned. I think those are much larger trends than even just a data stack trend, right? Kostas, you’re from Greece; Europe has led the way. But there’s a general trend towards owning your data, making sure it’s not locked away in a proprietary platform, right? And the data warehouses have just been the perfect piece of infrastructure for that. And then, I tend to think about the humans involved a lot, as opposed to just the tech, right? I know it’s weird for technologists, but Segment was a brilliant bridge between engineering and marketing. Right? Is that how you understand it?
Tejas Manohar 36:47
Exactly. Product engineering in particular, right, is the big differentiation. Right.
Boris Jabes 36:52
And when we started, we were not trying to be a bridge between product engineering and marketing. We were trying to be a bridge between the data team writ large and, at the beginning, sales, but eventually all teams. It was really about putting the data team at the center, right? And Looker, of course, cared about the data team, obviously. But it cared primarily about this batch analysis: explore some reports about what happened last quarter. This idea of taking the data team and making them a central pillar of the company, that they’re operationalizing their work, that they are driving, in the truest sense, Eric, right, driving the business, that is a different relationship. And if you tried to build that relationship 10 years ago, or seven years ago, the data team was too small, it didn’t have enough tools, and it wouldn’t have had the buy-in from the C-suite to own this part of the company. Well, and the data wasn’t actually centralized, really. I mean, yeah, all these things build on each other, right? But I think it’s not just that the data is centralized now. It’s that data teams, and I think CEOs around the world, are realizing: I need to give this team more influence in my company, because good things happen when I do, right? So you needed a new bridge between them and everybody else, right? That’s why we talk a lot about the word analytics, because that’s kind of the lingua franca of data teams, as in: let’s operationalize that. Right. So
Tejas Manohar 38:25
Yeah, I think outside of the core data team as well, the data-enabled personas in organizations just have a much more powerful toolset than they did 10 years ago. Obviously, marketing operations analysts, marketing analysts, sales analysts, those roles existed 10 years ago as well. But if you look at the tools they were using, they were using Google Analytics, Excel, Salesforce reports, tools like that. They didn’t have the power of the data warehouse, they weren’t leveraging BI, and they didn’t have knowledge of, or even access to, SQL queries or a place to run SQL queries. And that has drastically changed today. It’s a lot easier for any business user to find someone who sits near them in the office who can write SQL or use a BI tool than someone who can code, and that wasn’t really true 10 years ago.
Boris Jabes 39:15
I’d say so. I was talking to someone about this the other day: how many people on LinkedIn do you think have SQL in their skills, but no other programming languages?
Eric Dodds 39:28
That’s a great question.
Boris Jabes 39:30
That’s a good question. I’m not gonna tell you, but if you listen to my next public podcast, you will discover it. As a percentage? No, no, just a number of humans: the number of humans on LinkedIn who state SQL as a skill, but not what the rest of us here would probably call programming. Number of humans. This is great. This is like Wits & Wagers, you know, which is a fun game.
Eric Dodds 39:52
It’s a great game. Well, the question was for you, Kostas. Give us a number. It’s way more fun if you try to give a number.
Kostas Pardalis 40:03
Give a number? I don’t know, I mean, I’m failing here.
Tridivesh Sarangi 40:12
At least, given that LinkedIn has like 700 million users, like 14 million or so.
Boris Jabes 40:19
Whoa, nice. That’s high.
Eric Dodds 40:21
I think that’s it. I was gonna say like,
Tejas Manohar 40:24
I don’t know anything, though.
Eric Dodds 40:27
Two to four million was my guess. But I think Tridi here, man, like
Tejas Manohar 40:31
Boris Jabes 40:33
So I think I would have guessed the same as you, but it’s on the order of 5 million. Wow. That’s great. Pretty great, right?
Tejas Manohar 40:40
Tridivesh Sarangi 40:43
I love this. Back to your question, Kostas: this pattern has existed long before it started getting branded as reverse ETL. The difference is how it has been fulfilled in the past, right? It has been fulfilled with CSV exports, imports, and things like that, right? And who was able to do that in the past? The centralized data teams, or people who were very competent with databases and such, right? Not only for the reason of knowing SQL; just from a compliance and security standpoint, you didn’t have access to these systems of record, right? Looker I would categorize with any product that doesn’t do anything with data other than visualization, but Segment was a good example. What has changed is the demand for these requests and where it’s coming from. We were already talking about what sort of use cases; Segment solves for like five or 10% of use cases for GTM teams. There’s a whole large number of use cases that go unmet by any tool, and that’s why these products exist, right? So there’s a need for ownership of these processes outside of the data team as well. And Boris and Tejas, you can speak to this: your buying centers will be different from the traditional data teams, right, the traditional buying centers of the traditional ETL products.
Boris Jabes 42:27
I think, to your point, both of you are saying there’s this democratization of skill occurring, right? And listen, you talk about buying, right, Tridi. I think the journey of SaaS for 20 years now is this empowerment of individuals and teams. You’re talking about data teams, but it used to be that all your software was bought by the CIO, right? Period. And deployed by your CIO, in the office, in a physical office somewhere, right? People used to call it shadow IT and all these things, but broadly speaking, it’s about having more choice and more autonomy, where different sets of teams are able to make decisions about what tools they want to use. And this is where, you know, I’ve been at this for technically a decade, if you factor in my previous company, which was all about democratizing access to SaaS. I think this is the journey we’re still on as an industry: letting individuals and teams make decisions about software and use it to the best of their ability, in other words, with the best data from the trusted source, right? Our job has to be to create the right, let’s call it, guardrails and availability of that data, and not to prevent individual teams, whether that’s a sales team or a content marketing team, it doesn’t matter, from making choices about what tools they want to use, right? And the analogy I like to use about this, and now I’m gonna really frame myself as a child of the 80s, is that video games worked like this, too. In the 80s, video games were not purchased by the children who played them. They were selected and purchased by parents, and therefore they were marketed to parents. I think people don’t remember this, but they were marketed to mom and dad as, like, safe, fun games.
And that changed in the 90s and into the 2000s, where we now have more violent games, more sports, games that are more for, let’s say, the user. But the reason we could do that, right, is that there were these necessary pieces that had to come into existence, like ESRB ratings and app stores and controls from the game makers, so that you couldn’t just install whatever game on your console. Those were the building blocks. And so SaaS is, to me, a similar journey, just for the worker in the IT world. So, yeah, Tridi, to answer your question: the buyer is not going to be this centralized, massive team. The data team just has to have the right visibility and observability in our platforms, so that they can let everybody else select and do what they want.
Tridivesh Sarangi 45:10
In the analogy you used, even though the kid buys the game, it still requires some supervision, right?
Boris Jabes 45:18
Exactly. Exactly. It’s trust, but verify. So governance, I think we’re at the early days of that, but I think that’s going to become key to all of our platforms, to make sure there’s reasonable governance.
Tejas Manohar 45:30
Yeah, I think something really interesting here is something that Tridi actually brought up earlier, when he said: what does the product have to do, versus what does the user have to do? I think about it a little differently, almost as an extension of that, where the product is also the infrastructure that the company is building, right? So it’s not just what the Hightouch product has to do; it’s what the data warehouse has to do, what is done upstream, versus what the user has to do. I think striking that balance in the application is the winning formula for enabling business teams to leverage this data. So as much as possible, if reverse ETL and data activation platforms can tap into tools in the observability space, or leverage models from the transformation space, or do a lot of things outside of their product that tap into the overall infrastructure that the technical teams, the data teams, are putting forth in an organization, then that makes it a lot easier for business teams to come in and solve these use cases in a self-service capacity, without actually building more product features into the reverse ETL tools themselves. So I think that’s a really interesting trend that we’re seeing. One thing is that with the CDP players, everything was in a proprietary ecosystem. Let’s say you wanted a data transformation feature: the CDP had to build a data transformation feature. Let’s say you wanted observability on data ingestion: the CDP had to build observability into its platform. With reverse ETL, in the data-warehouse-first approach, these can be solved by the ecosystem of players that all interoperate and build on top of the data warehouse, instead of necessarily one vendor.
And a lot of these problems that would otherwise have to be solved by the product can now be solved by the technical infrastructure and analytics infrastructure that a company has in place, which is just super powerful, because business users don’t have to think about it.
Tridivesh Sarangi 47:10
Just a clarification on that, Eric. We’ve touched upon this, and it seems to be a common theme, that reverse ETL somehow has to be tied to the data warehouse or data lake as a source. I think it’s broader than that; it can be any centralized repository of data. So I just wanted to raise that very quickly and get your thoughts. When we say reverse ETL, because ETL has traditionally been tied to the data warehouse, it may suggest that reverse ETL always has to happen from the data warehouse. There’s a balance, right? It can be, like, a customer data hub or an employee data hub.
Boris Jabes 47:56
Yeah, yeah. And I think all of our products support lots of different scenarios, right? But I think if, as an industry, we end up with a variety of sources and a variety of destinations, and no central cleaning and deduplication and unification in the core somewhere, we’re going to make great companies that make lots of money, but we will not actually have moved the industry forward. And this is, to me, where we need to land in the end, right? Remember what I said at the beginning: the goal is to have the best data, data you can trust, right, in the tools that you want to use. I think you should be able to use any tool you want, but the data you can trust is key. And if you don’t have some amount of centralization somewhere in the company, then I don’t know how to make that happen. To me, you get trust through some centralization and some federation, right? That’s why our product is called Census, by the way. That was exactly the intent.
Tridivesh Sarangi 49:07
Boris, completely agreed on the data you can trust. From a business analytics standpoint, that may be the data warehouse. But, for example, an MDM (Master Data Management) provider or a customer data product, right, may be the system of truth, not the data warehouse. Same goes for
Boris Jabes 49:29
I hear you, I hear you, I hear you. I think Tejas is smiling because he had the same reaction: every SaaS product I’ve ever interacted with, in some form, says something on its website about being the system of record for X. Take your pick. I think Drift once said, "We’re the system of record for chats," or something like that. I was like, what? I swear, I think it said something like that. And so, Tridi, I’m totally on board with using a source that your company has fully bought into, like, this is the truth, right? Then it’s great. But in my experience, the reason people tend to gravitate to the warehouse, and why we made a pretty hard decision early on to bias towards these kinds of platforms, not to the exclusion of others, but as our primary bias, is that they have infinite storage and infinite join capability, right? And, like Tejas said, you can use the ecosystem for that: you’re not tied to a single vendor, and it supports open source, right? So I think that if you can get that out of something else, then great, we’ll support that as a source too, right? But that, to me, is the important part: that you can join all the data that matters somewhere.
Tejas Manohar 50:45
I agree with that fully. But I would also add that I think the even more important part is that you have data at rest somewhere in an organization that your business teams simply aren’t using. I think data that you can trust is definitely a huge cornerstone of these things, but the biggest thing is that before solutions like Hightouch, before data activation, before reverse ETL, people just weren’t using the data at all, right? There’s such a wealth of data.
Boris Jabes 51:13
The central premise, right, Tejas, I think we agree in how we approach this: you’re not connecting Zendesk to Salesforce directly, for sure. Neither are we. And Tridi, I think there’s tons of data in Zendesk that can go into Salesforce, and it does, and I think that’s great, but it potentially keeps you away from coalescing on something that is more trustworthy.
Eric Dodds 51:41
I think that’s obvious for some use cases. Correct. Agreed. Of course, that goes without saying. Yeah, we were actually talking about that, like a pipeline that doesn’t make sense for someone to build. I mean, just as an example: okay, you have leads in Salesforce, and you want to sync those to Google Ads, because there’s certain data, right? No one wants to manage that pipeline. Great, Google and Salesforce built it, so you can just reverse ETL the data points in there, and then great, right? So there are point-to-point connections where it’s like, this is awesome, because no one has to manage this. These two enterprise companies built an integration, and this is awesome. Great, connect the tools, and the data teams and the actual operational teams don’t have to deal with it. And yeah, it is very convenient.
Boris Jabes 52:32
And if every app on Earth was perfectly connected to every other app on Earth, sure.
Eric Dodds 52:37
Right, but we’re talking about Salesforce and Google Ads, right?
Boris Jabes 52:41
But even if we only focus on Salesforce and Google Ads, that integration has all sorts of limitations. It currently syncs, I think, the last 90 days. They all have limitations, right? And to Tejas’s point from way earlier: do you think the staff and senior staff engineers at Salesforce are working on that problem?
Eric Dodds 53:03
I don’t think so, right? But it is totally a cost-benefit call for the data teams working inside a company, where it’s like, great, we’re just gonna offload that, right? We can accept the limitations.
Tejas Manohar 53:13
I mean, our goal as software vendors in data integration should be to make it as easy to do that on top of the data warehouse as it is in the Salesforce UI. I really think that’s possible.
Boris Jabes 53:23
Absolutely. Totally agree.
Kostas Pardalis 53:24
Yeah. And there’s also a matter of expressivity, to be honest. You can move data from something like Zendesk to Salesforce, right? You can use Zapier,
Boris Jabes 53:36
you can do it with Zendesk natively, you don’t even need it.
Kostas Pardalis 53:39
Yeah, yeah, exactly. But the whole point of working with data is how we can take whatever data points we have, where, let’s say, an interesting amount of implicit information is stored, and make it explicit, so we can push it and use it somehow. And to do that, you need a processing environment, right? And this processing environment, humanity so far has decided, is going to be a database system. They’re built for this, right?
Boris Jabes 54:13
So rational concerns. Yep.
Kostas Pardalis 54:15
So unless the things that we have to do are super trivial, like, okay, someone signed up, let’s send this somewhere, okay, fine. But anything more than that, anything that requires some kind of business logic to be built and executed on top of the data in order to derive something, needs to happen somewhere. It’s not the pipeline that will do that, right?
Eric Dodds 54:39
It’s a great point. I guess when it is point-to-point, that context is decided for you, right?
Boris Jabes 54:46
Well, it’s context, but also, I think we all know the history here, right? Once upon a time, you had to put the logic into the pipe because of literal computing constraints. We actually had limited ability to move all the data out. So luckily, that’s a genuine shift technologically, right? Now we no longer have to pre-compute on the fly, or as we move the data, right? That is one thing where we can clearly show a before and after: compute cost went sufficiently down that we could just store everything and then compute after. But you’re right, Kostas, eventually you’re going to need to compute in some form. People might not realize they’re computing. I’ve found that people who use Excel don’t realize that they’re programmers, when in reality, Excel is the world’s most popular functional language, by far. You know, for all the Haskell developers out there: actually, it’s Excel.
Tridivesh Sarangi 55:36
Eric, to your point, I just want to be very clear, and I’ll get to the point. Take the Salesforce integration example that was brought up. Let’s say I go to some system and register as a user, as a lead, right? Reverse ETL is not a catch-all for everything. The lead getting to an SDR in real time is a very different flow. It requires some integration automation, which may not even touch any data warehouse, right? It needs to happen in real time, because somebody needs to respond to me in less than five minutes, right?
Tejas Manohar 56:18
We actually do that with a replica of our production database.
Tridivesh Sarangi 56:23
I’m just saying it’s a different nature of problem. The other part is, once you’ve collected all these leads, say you wanted to do a reactivation campaign, right? That lead data would be in your warehouse, and you say: let’s reach out to these leads who were interested at some point in time, but where it never went anywhere. And you need to move that data into your, like, Marketo kind of thing. That’s where reverse ETL comes in. So there’s a place for both. Then again, the tool of choice will depend on what the user is trying to solve for. But there’s a place for both, in the sense that the integration needs to happen regardless of whether the warehouse has the most reliable source of customer data or not.
Eric Dodds 57:07
Yeah. It’s a great point. Okay, we’re close to the buzzer here. We didn’t get to talk about a number of things that I would have loved to, but we need to do Q&A and wrap it up. So we’ll just do a couple of quick questions. The first one, which is super interesting, and I’ll read some context into the question: there was discussion around the change in technology, right? Products like Personas were built before the warehouse had come of age, as it were. So the question is about how the technical ability of roles has changed. A marketer 10 years ago was far less technical, or most of them were far less technical, than today, right? And even salespeople, with their appetite for different, interesting types of data that help them do their job. How does that influence the way you’re building your product? Not only has the tech changed, but so have the users: marketers are very data-centric, and salespeople are becoming more data-centric. How is that influencing the way that you’re building your products?
Boris Jabes 58:19
It’s a tremendous responsibility. I mean, we get to see and foster, basically, an upgrade in skill. And I think a lot about how you don’t learn computer science just by learning how to write a git commit, right? There’s theory related to that. I’ve always framed Census to our users as not a data pipeline, but more of a data deployment tool, right? Where I’m trying to teach you certain aspects of software engineering without calling it that. And so, to your point, marketers and people on data teams are all becoming dramatically more savvy. dbt has led the way in terms of teaching people how to check in their SQL models. People think that’s no big deal, but it’s actually huge, right? And we’re at the infancy of that. Of those 5 million LinkedIn SQL people, it’s probably a teeny, teeny tiny fraction who know about version control, right? So I think it’s super exciting, and it’s kind of a responsibility. I feel like we’re teachers as well, engaging with them in these ways. So there’s a lot of integration. We long ago integrated with the Airflows and Prefects of the world, right, so that we can help marketers, or analysts, or data teams who want to plug into their modern infrastructure, right? So, yeah, I think it totally informs how we think about our tool.
Tejas Manohar 59:46
All right, well, Tridi,
Eric Dodds 59:48
I was gonna say Tridi, and then Tejas, and then we’ll do one more question to end it out.
Tridivesh Sarangi 59:53
Yeah, I’ll just add to what Boris said. It’s a tremendous responsibility. For me, from a product standpoint, it is always recognizing the fact that people are changing. Their skills and their needs are not static, right? They’re changing, and they want to do more. At the same time, it’s taking the technology barriers, the skills, the friction to learn out of the way, and making them successful faster, right? So that’s what we focus on. And then the second part is not putting them in a box: if they want to do more with the product, give them the ability to do more. So it’s a balance. It’s a very hard balance, but it’s an important one: empowering more people to do things while providing the right controls, so they do it responsibly.
Tejas Manohar 1:00:46
Yeah, totally agree with everything that’s been said. I think there’s kind of a balancing act between two trains of thought in our product organization, and we’re pushing on both axes, and they balance each other out. One of them is, how do we empower more people in an organization to actually perform reverse ETL? So, you know, a decade ago, engineers were the only ones building the scripts to move data from the data warehouse, or people trained in something like MuleSoft, or something super technical, to move data from the data warehouse into all these different systems. Now, you know, we’re allowing data analysts to do it. Next, we’re gonna want marketing ops to do it. Next, we might even want some marketers on the team, depending on their technical level, to be able to do so. On that train of thought, we ship new features like Audiences that allow marketers to come in and build segments on top of the data warehouse and sync those out to different tools, basically performing reverse ETL without necessarily knowing all the ins and outs of SQL. On the other hand, the other train of thought that we’re pushing in our product organization, which balances out the first one of empowering business users, is really this philosophy of taking all the principles and all the tribal knowledge that software engineers have, and the processes they have, such as version control, observability and visibility, staging environments, pushing to staging before production. Our goal at Hightouch is to think: okay, if the best software engineering team ever was to build, you know, a script or a platform for moving data from the data warehouse into something like Salesforce, what would they build?
And they’d have all those things as a part of their 12 principles of deployment, or whatever it is. So how do we make all those aspects of a really strong data pipeline available to the less technical users, whether it’s a data analyst, marketing ops, or a marketer, without them having to know all about it? And I think if you look at our application, product features like Git Sync are a really powerful step in that direction. It’s the ability to just use the Hightouch product as usual, and then everything you’re doing, all the configuration, can be bidirectionally synced with a GitHub repo. You don’t have to understand it all to start. But now you can start seeing the commits you’re making, and when you need to make a bulk change, you know, you can do that in code as well. But you can also just use the application as is.
Kostas Pardalis 1:02:54
So, all right, last question, guys. And hopefully I can make you promise that we will do this again in the future, because we have more stuff to chat about and we need more time. But I’d like to hear from all of you one thing that you are anticipating to come in this category that makes you really, really excited. So let’s start with Boris.
Boris Jabes 1:03:18
That’s coming, that I’m really excited about? I mean, there’s so much. I’m assuming we’ll talk more about what’s happening in our ecosystem rather than just in our products. But I think the warehouses keep getting better, right? And that just enables more possibility. So, you know, if you think back to when we started the company, we liked the warehouse because it had effectively near-infinite storage, and we could use it as both a source and a destination. We could actually write our information into the warehouse so that we could, you know, do diffs. I think incremental syncs are a table-stakes thing; that shouldn’t be some fancy feature. But we can do so much more, given the capabilities of the warehouse, right? And again, when you’ve got separated workloads and infinite storage, there’s just so much you can do in terms of being able to create more observability and more kinds of transforms. There are new SQL functions that still come out, right, that are really fun for people. I hope to teach people about certain approximation functions that are actually kind of neat, but that’s a story for another day. And of course, they’re all getting more real time, more centralized, more merged across, you know, the lake, the warehouse, the real-time systems. I don’t think they’re ever going to perfectly intersect, right? But the beauty is that most of business, like most of us, save for a very small set of things, can really handle what I would call our version of real-world real time, which is not computer real time. It’s, you know, seconds, not microseconds, and I think the warehouses are really getting there. And that, I think, will unlock so many scenarios. You even gave that example, Kostas, right? You said there are some things where it’s like, oh, they signed up, right?
But there are other things where, you know, you need to do some computation. You should not have to make that trade-off. Anything that happens, if you want to be able to compute on it, and you want to be able to, you know, operationalize it, you should be able to. And that’s why I think we’re just in a fun era of warehousing and, let’s call it, data storage that’s continually getting better every year. And so I think it’s just such a fun time to be in this space. Yeah. And more accessible, actually. Yeah, accessibility is a great way to think about it. Yep, yep.
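Editor's note: the "diffs" Boris mentions can be illustrated with a small sketch. The idea is to keep the previously synced snapshot and send only what changed since then. This is a minimal illustration and not how Census implements it; the function name `diff_snapshots`, the row shape, and the sample data are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_snapshots(
    previous: Dict[str, Row], current: Dict[str, Row]
) -> Tuple[List[Row], List[Row], List[str]]:
    """Compare two snapshots keyed by primary key.

    Returns (added rows, changed rows, removed keys), so a sync only
    has to push the differences instead of the full table.
    """
    added = [row for key, row in current.items() if key not in previous]
    changed = [
        row
        for key, row in current.items()
        if key in previous and previous[key] != row
    ]
    removed = [key for key in previous if key not in current]
    return added, changed, removed


# Hypothetical data: the snapshot stored after the last sync, and today's query result.
previous = {
    "u1": {"id": "u1", "plan": "free"},
    "u2": {"id": "u2", "plan": "pro"},
}
current = {
    "u1": {"id": "u1", "plan": "pro"},
    "u3": {"id": "u3", "plan": "free"},
}

added, changed, removed = diff_snapshots(previous, current)
```

In a real system the previous snapshot would live in the warehouse itself, which is the point Boris makes about using the warehouse as both a source and a destination.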
Kostas Pardalis 1:05:43
That’s great. Tridi, your turn. And I’d love to hear also from your perspective, because you’re coming from the more enterprise space. So what do you see there?
Tridivesh Sarangi 1:05:54
ETL became a thing not because people romanticized ETL, or because ETL was such a cool technology. What drove the rise of ETL, and now what’s trending towards ELT, is a rabid appetite for consuming data to make data-driven decisions, the business intelligence side of things. So the exciting thing with the reverse ETL trend, and what could propel it to what ETL has been for the last 30-40 years, is the trend that we see in enterprises, which is going from big data, which drove ETL, to big ops. Everything that we are talking about, you know, reverse ETL, is just a way to move data from one place to another. But at the end of it, GTM teams are trying to convert leads faster, right? Launch campaigns, more effective campaigns, right? Product teams are trying to drive growth with product-led growth and are using it to drive better experiences. Customer experience teams are doing the same thing, using data, right? So big ops is the next big thing, and reverse ETL will play a big role in that. I’ll leave it at that. That’s a trend that we see in enterprises.
Kostas Pardalis 1:07:05
Yeah, that’s very interesting. That’s a very interesting term, big ops. Sounds great. Tejas, your turn.
Tejas Manohar 1:07:14
Cool. Yeah. I mean, honestly, on the technical front, I think we’re thinking alike here: I’m really excited for the data warehouses to just get better and better. I think streaming data warehouses are something we’ve always been excited about. There are some players like Materialize that are, you know, building the ability to give us a SQL database with a view that’s defined in SQL. And as the data comes in, it’s incrementally processed, so that a system like Hightouch, for example, can subscribe to that and automatically know what cohort or what audience the user is in, based on a simple formula. And while that’s not an innovation in Hightouch, it unlocks, you know, massive potential for Hightouch for use cases like on-site personalization in real time that, you know, it’s harder to use Hightouch for today, as Tridi mentioned. But on the reverse ETL product front, I think what I’m really excited about is more of the design aspect of things. I actually think a big bottleneck to making reverse ETL and data activation the big thing, and allowing all companies to use it to actually drive more value from their data, is just making it easier to use. So I can’t wait until every business user in an organization feels like, just like how they can open a link to a Tableau dashboard that someone hands them and see the graph on it, when they have a problem like, I wish I had this data point in this tool, or I wish I could grab users that meet this criteria, they can walk into a data activation tool that they have never used before in the organization. It’s, I guess, connected to the resources that exist around the company, quickly pulls metadata from all those different systems, and helps guide that user through actually solving their business-level use case.
And I think a lot of the innovation in the space outside of the technical front will actually just be on the design of the products, making them really accessible and separating technical concerns from business concerns, so that business people who identify a problem can just solve that problem. And that’s where we spend a lot of our headspace, honestly. I think it’s a function of marketing, since reverse ETL is not the most accessible term yet to everyone, as well as product and partnerships. And that’s what I’m most excited about.
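Editor's note: the incremental processing Tejas describes can be sketched in a few lines. Rather than re-running a query over all data, a running aggregate is updated per event, and a change is emitted only when a user enters or leaves a cohort. This is a toy illustration, not the Materialize or Hightouch API; the class `IncrementalCohort`, the spend-threshold rule, and the events are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple


class IncrementalCohort:
    """Maintains 'users whose total spend >= threshold' one event at a time."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.total_spend: Dict[str, float] = defaultdict(float)
        self.members: Set[str] = set()

    def apply(self, user_id: str, amount: float) -> Optional[Tuple[str, str]]:
        """Apply one event; return a membership change if one occurred."""
        self.total_spend[user_id] += amount
        in_cohort = self.total_spend[user_id] >= self.threshold
        if in_cohort and user_id not in self.members:
            self.members.add(user_id)
            return ("entered", user_id)
        if not in_cohort and user_id in self.members:
            self.members.discard(user_id)
            return ("exited", user_id)
        return None  # membership unchanged, nothing to sync downstream


cohort = IncrementalCohort(threshold=100.0)
events = [("u1", 60.0), ("u2", 120.0), ("u1", 50.0), ("u2", -40.0)]  # -40.0 is a refund
changes = [cohort.apply(user_id, amount) for user_id, amount in events]
```

A downstream tool would subscribe to the non-`None` changes and act on them right away, which is what makes use cases like real-time personalization possible.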
Kostas Pardalis 1:09:18
Yeah, 100%. And I think, I mean, we don’t have time today, but there are a couple of things that we didn’t manage to touch on and discuss today. And I think one of the most interesting is the users behind all this: who actually benefits from reverse ETL? What are the personas? What do you see there? And Boris, you mentioned these 5 million people out there who know SQL, but they are not technical, right? What’s the journey to enable these people to do more with less technology? Yeah, yeah. Yeah.
Boris Jabes 1:09:59
I think, you know, they don’t just hire us for moving bits, right? They’re hiring us as a piece of software, but they’re trying to increase their impact, do more with their data. Like, that’s exactly it. Yeah.
Tejas Manohar 1:10:14
calling Business Girl Ops is
Tridivesh Sarangi 1:10:16
Just one more thought. A couple of things just mentioned triggered some thoughts on this. First, one thing that was very important is maybe bringing it back to how it fits into your overall data strategy, or how you’re thinking about data in the company. And the second part was about the ability to publish and subscribe to events, right? And that’s a trend we see in enterprises, which is this event-driven architecture, right? It’s not just connecting your data warehouse, cloud or whatnot, to business applications, but the ability to stream events to the bus and then consume them across various subscribers. That decoupled architecture is a big trend as well. And, again, the maturity level varies from enterprise to enterprise, but that’s definitely the trend.
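Editor's note: the publish/subscribe pattern Tridivesh describes can be illustrated with a minimal in-process sketch. The producer publishes an event to a topic without knowing who consumes it, which is the decoupling he refers to. This is not Workato's product or any particular message broker; the `EventBus` class, topic name, and handlers are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventBus:
    """A tiny in-memory bus: topics map to lists of subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


bus = EventBus()
crm_updates: List[Event] = []
email_queue: List[str] = []

# Two independent consumers of the same event; neither knows about the other.
bus.subscribe("user.signed_up", crm_updates.append)
bus.subscribe("user.signed_up", lambda event: email_queue.append(event["email"]))

delivered = bus.publish("user.signed_up", {"id": "u1", "email": "a@example.com"})
```

Adding a third consumer requires only another `subscribe` call, with no change to the code that publishes the event.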
Kostas Pardalis 1:11:10
Yep, yep. Makes a lot of sense. Someone tried to say something, and I think I interrupted.
Eric Dodds 1:11:17
I was gonna say, and I think it’s telling that we’re way over time. But, you know, I’ll try to summarize at least one of my primary takeaways. What’s so interesting to me is that reverse ETL is almost a misnomer, in that it sounds like it’s just moving the data, right? Like, a pipeline sort of describes a flow of data. And what we’re talking about here is far, far deeper than that. You know, it impacts the organization as a whole, and I think it reflects changes in the industry, as represented by both technology and, you know, the changing skill sets of people. And so this has been a true treat to hear about how everyone’s thinking about that. So thank you, again, thank you for going long. And let’s do this again, and dig into users. And then of course, we didn’t get to synthetic events, my favorite topic when it comes to reverse ETL. So we’ll do it again in another couple months. Thank you, everyone.
Tridivesh Sarangi 1:12:14
Thank you. Thank you.
Tejas Manohar 1:12:16
Thanks for having us. Yeah, slot one thing, you guys,
Eric Dodds 1:12:19
We get to talk to some super smart people, Kostas, which is honestly maybe one of the highlights of my week, just being able to ask good questions, or at least what I think are good questions, or at least my own curiosities, to these brilliant people building stuff. I think what was so interesting was, you know, if you went back and listened to that conversation, and you didn’t have the context for it, you might not necessarily think it was solely centered on reverse ETL, right? We actually didn’t talk about the actual technical flow of data from, like, a row in a warehouse table to, you know, a field in some sort of downstream tool. And I mentioned this at the end: reverse ETL is a strange term in that way, right? Because the way that they’re thinking about this problem is so much more comprehensive than, you know, just a basic pipeline that’s moving data from A to B. So, yeah, it made me think even more about, you know, your point that the name for this maybe is not a great name. What did you take away?
Kostas Pardalis 1:13:22
Yeah, it’s not a great name. My main takeaway is that we need to spend more time with these folks discussing not just reverse ETL, but the whole transformation that data infrastructure is going through right now. Like, for example, for sure, one of the most exciting things they talked about was the latest developments in data warehousing, right? And what does this mean? Or what does it mean to have so many people out there who know, or say that they know, SQL, but they are not technical, right? And we still have so many people who are technically doing functional programming through Excel sheets, but they are still not using all these amazing technologies that we are talking about. Yeah. So the potential out there is obviously huge, and we are still very, very early. And what I would like to add to what you said about speaking with very smart people is that what I find extremely fascinating is that they’re not just smart people, they’re also highly motivated people. That’s what makes things even more interesting, because we have people who are trying to change the way that we are working with data. And it’s very early, so I find it an amazing opportunity to take a glimpse into the future when we can get all these people together. I hope we will be able to do it more often in the future.
Eric Dodds 1:15:02
I agree. All right. Well, thanks for joining us. Lots of great recordings coming up, so make sure to subscribe, and we will catch you on the next Data Stack Show. We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at rudderstack.com.