This week on The Data Stack Show, Eric and Kostas talk with Jim Walker, the VP of product marketing at Cockroach Labs, about distributed systems, competing against the speed of light, and making data easy.
Highlights from this week’s episode include:
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
Eric Dodds 00:06
The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution. Thanks for joining the show today. Welcome back to The Data Stack Show. We have a really exciting guest from a company that Kostas and I have talked about a ton, which is Cockroach Labs. And Jim Walker, who is from their product marketing team is going to join us on the show today. My burning question, Kostas, is and this is not going to be a surprise to people who have been following along for a while … Jim was a developer before he got into marketing, and so we’ve had several people on the show who have sort of crossed lines between sort of marketing and engineering and technical roles, and so of course, I want to ask him, what lessons he’s brought from engineering into marketing. What are you going to ask him?
Kostas Pardalis 00:55
Oh, I have plenty of questions. CockroachDB is a very interesting piece of technology out there. They have done amazing innovation; it’s one of these companies that are really on the borderline between doing research and, at the same time, productizing it. And so there are many questions around database systems, distributed systems, and of course, what the vision of the product and the company is. So yeah, I’m super excited to chat with Jim today.
Eric Dodds 01:24
Great, let’s dive in. Jim, thank you so much for joining us. Kostas may have mentioned this in your chats before the show, but he and I actually talk about Cockroach Labs a lot just because we admire so many things that that company is doing. And so it’s a real privilege to have you on the show. Thanks for joining us.
Jim Walker 01:44
Well, thanks a lot. I’m fortunate and privileged to be an employee with a good group of people. So I hope I represent them well here. I’m happy that that’s what you guys think about us. It’s a fun place.
Eric Dodds 01:55
It seems like it. Well, why don’t we start with … I think a lot of our listeners are probably familiar with what the company does, but we’d love to get to know you a little bit better. And you have a background as a developer, but now work in marketing. And that’s absolutely something I’m gonna ask you a lot about, but would just love to hear a little bit about your personal history. And then for those listeners who might not be familiar with Cockroach, can you just tell us a little bit about what the company does and what you provide?
Jim Walker 02:23
Sure, yeah, that’s a lot, so I’ll try to be somewhat brief so we can get into some technical stuff and some other concepts. But I mean, y’all, I started as a developer; I mean, I coded at the age of 11. It was, you know, the early 80s; I had a Commodore 64. I was just always into computers. I always had a kind of creative side of my life as well. But I kind of landed in electrical engineering, computer science in undergrad, graduated, I loved it, and I ended up being a programmer, and I coded for seven years, professionally, in a language called Smalltalk, which I will argue is still the most elegant and beautiful language ever created. And I was in C++ and C, and, you know, I was a developer and I was working as a consultant, and every time a salesperson opened up their mouth, it added, like, two months of scope to the project, and it just frustrated me every time.
Jim Walker 03:14
And so, you know, I ended up being the person that they would put in front of people to actually explain what was going on. And I loved it, and I love kind of taking what, you know, developers and what we were building and explaining that to people. When they got that “aha,” when they understood, it was just, it was the juice, man, I loved it. And so, you know, I naturally kind of gravitated towards product marketing, because well, as a developer, I mean, I loved it, but I was a hack. I mean, I was good. But like, you know, I was managing teams and stuff. So it was just a real natural fit for me to kind of move into product marketing explicitly, because I always feel that it’s my job to be a translator of deep technical concepts into English so that people can actually understand these things. And I love it. And, you know, when something clicks, and something works really well, there’s nothing like it. Now I’ve been kind of in startups just, you know, gosh, I think this is my 10th or something like this. All of them, except for one, have been successful. I’ve been in security. You know, I was doing Master Data Management, which must be familiar to you guys, around customer information. I was at a company called Initiate. I’ve been in open source for a long time; I was at a company called Talend doing data integration. I started their MDM project there and moved them into big data. I was at Hortonworks very early and helped define the Hadoop space. From there I was at another little marketing company, and then I landed at CoreOS, which was a real special company that really kind of innovated in some different ways and really built kind of a foundation of a lot of things that are happening, I think, right now in infrastructure, and that was a joy. And I landed at Cockroach really about two and a half years ago.
Jim Walker 04:51
Cockroach Labs, we’re the creators of CockroachDB. This is a brand new approach to building a database. You know, we architected the database from the ground up to be distributed. It’s basically taking all this distributed systems and distributed thinking stuff, applying it to the database so that we have a database that’s kind of prepped and ready for modern applications as you know, we move quickly into this kind of next generation of distributed systems and cloud based applications, whatnot. There’s a lot there.
Jim Walker 05:20
I mean, it will, I’m sure we’ll talk a little bit more about Cockroach. But that’s kind of a quick overview. Is that sufficient, Eric?
Eric Dodds 05:27
That was sufficient and efficient, amazing, awesome hearing about your background. And one quick note, the image for Smalltalk from the Smalltalk book on the Smalltalk Wikipedia page is awesome, and would make a great poster. So go check that out everyone who’s listening. That’s pretty great. Very cool. Well, Kostas, you have tons of questions on the technical side and I’m really bad about stealing the mic at the beginning of the program. So I’m gonna hand it over to you and let you dig in.
Kostas Pardalis 06:01
Thank you, Eric. Thank you so much. So Jim, do you want to give us a quick overview of CockroachDB as a product and as a technology?
Jim Walker 06:10
The founders of this company, you know, Spencer Kimball, Peter Mattis, Ben Darnell, you know, all three of them spent a considerable amount, in fact, they all landed at Google at the same time. In fact, I think all of their employee numbers are around 300. And, you know, they spent a lot of time there, you know, Ben was a big part of the Reader team. Spencer and Peter actually met in college, I think in the 90s, and they actually built out something called GIMP. A lot of people are familiar with it, but, you know, open source image manipulation tool. They’re the founders of that, and, gosh, that thing’s still going on. So these guys have been together for a long time. But you know, it’s interesting to see what’s happening across the board, in everything right now as kind of the world coalesces around a lot of the innovation that happened at Google in the 2000s and 2010s. And, you know, led by Jeff Dean and Sanjay Ghemawat and, you know, Eric Brewer and some of the kind of leading minds over there, but you know, just look at the things that have come out of that stretch of time. It’s amazing. So, you know, Spencer and Peter kind of front row and Ben as well, front row, looking at all this stuff. And you know, when they eventually left, they were off building a startup, I think they were building a photo sharing startup. And they were frustrated, because they didn’t have a Spanner-like database, right, they, you know, they didn’t have a, you know, comparable version of Bigtable, right? They were kinda able to use those things, but they were frustrated. And so it’s really kind of out of frustration that they ended up starting to build Cockroach Database. And, you know, honestly, it’s, you know, the name is, you know, really kind of after the resilient nature of our database, you can’t kill it. But I think Spencer has a little bit of a dark humor. So I think that’s where the name came from. We love it. 
Honestly, love it or hate it, people remember it, that’s for sure. So they basically took the Spanner white paper, which you can go check out on the Google publication site, and built an open source version of that. They built something that wasn’t going to be dependent on, you know, explicit hardware to do certain things. And they built a database that was massively distributed but built to scale very easily, survive any failure, even a region or even, you know, anything, a Kubernetes cluster for that matter. But most importantly, being able to tie data to a location, which is actually one of these concepts, I think, that is not understood by a lot of people when they start thinking about distributed systems and distributed data. Distributed means, well, you have to take into consideration the physical location of things. And so you know, typically, when you prop up a database, you think about the logical data model, you know, here’s my keys, here’s my referential integrity, I, you know, figure out what’s going on with all my tables and whatnot. With a distributed system, you have to think about the physical nature of the database, the physical model, as well. And I think that’s one of those core concepts. And so being able to tie data to a location is kind of a critical piece of CockroachDB, because, well, it allows us to fight latency issues, you know, put data close to users, it allows you to survive the failure of an entire region, whatnot. But all the while, let’s do this in a database that’s built from the ground up, just, you know, completely new. This isn’t kind of, you know, move and improve; this is brand new from bit one all the way through. And make it SQL, and make it, you know, wire compatible with Postgres so it’s familiar to developers, and they can get running. And so, you know, we feel we’re defining this next generation of database for transactional workloads, and it’s called distributed SQL.
Kostas Pardalis 09:33
That’s super interesting. Two things, actually. One, I found very interesting what you said about the location, and I want to ask you about that to give us a little bit more information around that. But before we go there, I know that one of the most important, let’s say the biggest trouble that people have when they architect and they build distributed systems is time, right? And how you deal with the dimension of time. And I know that there’s a lot of innovation that CockroachDB has done around that, and there are differences compared to Spanner. I mean, you mentioned something about the specialized hardware. Do you want to give us like a little bit more information around that? Because I think it’s super, super interesting.
Jim Walker 10:10
Yeah, it’s way deep in the technical nature of our product. And, you know, if you start thinking about transactions in the database, now, you know, can you do transactions in something like Mongo, or, you know, another database? Well, you’re gonna get to being kind of eventually consistent. And it really comes back to how you use the algorithms that are in front of you to actually execute transactions. And so you know, for Cockroach, we chose to be, you know, to implement serializable isolation by default. And actually, for all transactions. And serializable isolation, it just means that, you know, every transaction is going to happen, kind of, you know, atomically, in order, right, it’s an ACID concept. You know, I think Kyle does a really good job on the jepsen.io website, talking about, you know, all the different levels of isolation that you can have in a database; if anybody’s interested, it’s really cool stuff. But serializable isolation is a big deal. But in order to do that, you know, to have multi-version concurrency control, well, the clock and the time actually become really important, because that is what really, you know, demands, you know, that things are happening in order. Now, in Spanner, in the original Spanner architecture, well, Google had relied on hardware atomic clocks, right. So, you know, if you can just align every server to have the same exact time, well, that’s great. Well, as you know, and everybody knows, I mean, you know, there’s no such thing as, you know, true time on every single server, I mean, we kind of get there with some of these true time services and stuff. But, you know, for us, we wanted to be independent of any sort of hardware or any sort of other service. And so we basically built from the ground up, we said, Look, how can we actually, how can we do this a little bit differently? 
And so, you know, actually probably one of the most popular blog posts that we ever wrote was “Living Without Atomic Clocks.” And, man, that blog post on our website does a really good job of describing this more in depth. But basically, what we said, we said, How can we use software to actually get the same sort of thing? And so you’ve got to start with one, some sense of time. So kind of start with something like NTP, which is, like, the Network Time Protocol, it’s been around for a long time, and then build up some logical drift around that. So that, you know, servers can be off by, I think it was, you know, it’s like 15 milliseconds, or whatever that is, and then use gossip and Raft to actually start to understand where all these nodes are, from a time point of view, and correct them as we need to. And so, you know, you can do this via hardware. But you know, as really clever software engineers, we chose to actually solve the problem so that, you know, it can be wholly owned, and in the binary itself of Cockroach, which comes back to this kind of concept of distributed systems and containerizing, and everything. And like, I don’t want to be dependent on anything that is external, I want a single binary. And so that’s kind of why we chose to do it. Now it allows us to be deployed anywhere, right? And so that’s a big, big value for what we actually did.
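To make the idea of living with a bounded clock offset a little more concrete, here is a minimal sketch in Python. It assumes a simplified model in which every node's clock is within a fixed maximum offset of every other node's; the 15 ms figure is just the number mentioned in the conversation, and the function names are hypothetical, not CockroachDB's API. A reader that sees a value timestamped slightly ahead of its own clock cannot tell whether that write really happened first, so it treats the value as uncertain and retries at a later timestamp.

```python
# Illustrative bound on how far apart two node clocks may be (assumption:
# the "15 milliseconds, or whatever that is" figure from the conversation).
MAX_OFFSET_MS = 15


def classify_read(read_ts_ms, value_ts_ms, max_offset_ms=MAX_OFFSET_MS):
    """Classify a stored value relative to a reader's timestamp.

    "visible":   the value was written at or before the read timestamp.
    "uncertain": the value's timestamp is ahead of the reader's clock, but
                 within the clock-offset bound, so real-time order is unknown.
    "future":    the value is beyond the bound and definitely happened later.
    """
    if value_ts_ms <= read_ts_ms:
        return "visible"
    if value_ts_ms <= read_ts_ms + max_offset_ms:
        return "uncertain"
    return "future"


def resolve_read(read_ts_ms, value_ts_ms, max_offset_ms=MAX_OFFSET_MS):
    """Return the timestamp the read should run at (simplified).

    An uncertain value forces the read to move its timestamp up to the
    value's timestamp so the value becomes visible; otherwise the read
    keeps its original timestamp.
    """
    if classify_read(read_ts_ms, value_ts_ms, max_offset_ms) == "uncertain":
        return value_ts_ms
    return read_ts_ms


if __name__ == "__main__":
    for value_ts in (90, 110, 130):
        print(value_ts, classify_read(100, value_ts), resolve_read(100, value_ts))
```

The trade-off in this kind of design is that a looser clock bound widens the uncertain window and causes more retries, while hardware clocks shrink the window instead.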
Kostas Pardalis 13:01
Yeah, that’s amazing. And I think, if I’m not wrong, and you can correct me if I am, that the time drift actually also exists in Spanner, right? Even with atomic clocks. It’s not like they avoided it completely; it’s just a very, very small drift that they consider. Yeah. Yeah. Yeah, that’s, that’s, that’s amazing. And I’m totally aware. And I think our audience is probably also aware of this great blog post. It’s very popular. Okay, we talked about time, but you also mentioned space, right, like location, actually. And that’s something that, okay, usually, as I said previously, when we’re talking about distributed systems, we tend to focus a little bit more on the consequences of time there. So why is location important? How does this affect CockroachDB and how does it make it special as a product, as a technology?
Jim Walker 13:50
You know, the speed of light’s no joke, man, right? Like maybe one day, we’ll figure out how to beat it. But we ain’t. I don’t think it’s gonna happen in my lifetime, maybe some quantum thing will happen. But you know, it’s funny, internally, you know, people ask us about who is our ultimate competitor. And you know, in our engineering team, our ultimate competition is the speed of light. And we do lots of things within our software to actually, you know, fight it, and to work within the context of it, right. And so, when you’re deploying a database, and you want consistent transactions, serializable isolation is going to guarantee the data is correct, right? We’re talking about, you know, financial transactions across a planet. Look, we’re going to be really good in a single data center; there’s lots of reasons why people would use us in a single data center for a simple application. But when you start to kind of, you know, build something that is the next generation of database for these mission critical workloads, you know, the stuff that’s been wrapped up in mainframes for years and years, you know, having consistent transactions is really important. But having global access to this is also critically important, especially in the modern, you know, world and basically businesses everywhere. But here’s a problem, you know, because the jump from New York to Singapore is going to be what, 500 milliseconds sometimes or maybe 300? And what happens when a transaction takes, you know, two or three hops back and forth, you know, we’re talking about a second or two. It just doesn’t work in certain workloads, you know, the, again, another Google thing, I always forget the guy’s name, Paul Buchheit, or I forget his name. One of the guys, one of the original guys that worked on Gmail, came up with this concept called the 100 millisecond rule. 
And the 100 millisecond rule basically states that anything that happens, you know, sub 100 milliseconds appears to be in real time to the human. Anything over, you can actually notice the lag a little bit. And so for us, it’s like, how do you get all transactions to be under, you know, sub 50 milliseconds? Well, the only way you’re going to do that with data wrapped around the entire world is to make sure that data is located close to that user. Now we use Raft as a distributed consensus algorithm that many are familiar with. Anybody in distributed, if you don’t know Raft, go check it out. I mean, gosh, it’s like, yeah.
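The arithmetic behind that point can be sketched in a few lines of Python. The 300 to 500 ms round trips and the "two or three hops" come from the conversation; the 10 ms nearby round trip is an assumed figure for illustration only, and the function names are hypothetical.

```python
# Threshold from the "100 millisecond rule" described above.
HUMAN_REALTIME_BUDGET_MS = 100


def transaction_latency_ms(round_trip_ms, round_trips):
    """Total network time if a transaction needs several sequential round trips."""
    return round_trip_ms * round_trips


def feels_instant(latency_ms, budget_ms=HUMAN_REALTIME_BUDGET_MS):
    """True when the latency stays under the perceptual budget."""
    return latency_ms < budget_ms


if __name__ == "__main__":
    # Cross-planet figures quoted in the conversation, plus an assumed
    # 10 ms round trip for data that lives near the user.
    for label, rtt in (("far, optimistic", 300), ("far, pessimistic", 500), ("nearby (assumed)", 10)):
        total = transaction_latency_ms(rtt, round_trips=3)
        print(f"{label}: {total} ms, feels instant: {feels_instant(total)}")
```

With three round trips, the far cases come out at 900 ms and 1,500 ms, which is the "second or two" mentioned above, while the nearby case stays at 30 ms.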
Kostas Pardalis 15:58
I think Raft and Paxos are like, the two most commonly referred algorithms around distributed systems.
Jim Walker 16:05
That’s right. And we’re using Raft to kind of place data, you know, these replicas around the world. And, you know, how do we actually, you know, make sure the Raft leader or the leaseholder, as we call it within Cockroach, how do we make sure that that’s close to your users, so that, you know, all transactions to that Raft group are going to happen very close to that user. And so, you know, we went to great lengths, basically, within the way that we store data, using Raft and distributed consensus to make sure that data is going to live close to users, because you know, I want to guarantee, you know, can we tune our database to get, you know, sub 10 millisecond transactions for every single, you know, transaction, no matter where it’s at, on all the tables, yeah. But on the other side of that is, sometimes you don’t, sometimes you just need data to be, you know, accessed all over the world. And so, you know, we chose to do this at the table level, you know, and so for each table, we’re defining how data is actually persisted within, you know, the physical nature of all the nodes within Cockroach. And so, it’s pretty simple to do, it’s a pretty straightforward process. We’re actually going to simplify that a whole lot over the next couple of weeks; we’ll have a release come out that really kind of, you know, breaks it down into some really kind of simple declarative kind of SQL statements that allow you to define, at the table level, how you want data to live, so that it’s easily accessed or quickly accessed, or how you want it to survive.
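Why the location of the Raft leader matters can be shown with a toy latency model in Python. This is a sketch under stated assumptions: the region names and round-trip times are made up, the function names are hypothetical, and the model only captures the general Raft property that a write commits once a majority of replicas (the leader included) has acknowledged it.

```python
# Hypothetical round-trip times between regions, in milliseconds.
RTT_MS = {
    frozenset({"us-east", "us-west"}): 70,
    frozenset({"us-east", "asia"}): 230,
    frozenset({"us-west", "asia"}): 170,
}


def rtt(a, b):
    """Round-trip time between two regions (0 inside the same region)."""
    return 0 if a == b else RTT_MS[frozenset({a, b})]


def quorum_commit_ms(leader, replicas):
    """Time for the leader to collect acknowledgements from a majority.

    The leader counts as one acknowledgement at 0 ms, so the commit time is
    the round trip to the slowest replica still needed to reach a majority.
    """
    acks = sorted(rtt(leader, replica) for replica in replicas)
    majority = len(replicas) // 2 + 1
    return acks[majority - 1]


def user_write_ms(user_region, leader, replicas):
    """User-to-leader round trip plus the leader's quorum commit time."""
    return rtt(user_region, leader) + quorum_commit_ms(leader, replicas)


def best_leader(user_region, replicas):
    """Pick the replica region that gives this user the fastest writes."""
    return min(replicas, key=lambda region: user_write_ms(user_region, region, replicas))


if __name__ == "__main__":
    regions = ["us-east", "us-west", "asia"]
    for user in regions:
        leader = best_leader(user, regions)
        print(user, "->", leader, user_write_ms(user, leader, regions), "ms")
```

In this model a user in Asia sees 170 ms writes when the leader is in Asia but 300 ms when the leader sits in us-east, which is the motivation for choosing placement per table, or placing all replicas of some data near the users who need it.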
Kostas Pardalis 17:22
That’s great. Actually, as you were talking about speed of light and location, and where the data is located, I couldn’t stop thinking that, in the end, it comes down to physics again, right? Like, you have space, you have time, but at the end, you need both of them to solve the problem.
Jim Walker 17:37
That’s right, Kostas, and it’s really difficult, but the team of engineers, I mean, some of the stuff we’ve done is just truly remarkable. One of the things they worked on was this, there’s this feature in Cockroach called parallel commits, and look, I’m gonna say I’m just a marketing guy, I’ll explain the best I can. But we actually have a SIGMOD paper that gets into this pretty well, that’s published and it’s available on our site. But the team basically said, Look, how can I actually forward commit a transaction before, you know, and just say, with five nines probability, that it’s gonna commit on the second node? So if I can actually go through and look at, basically, if I could commit a transaction locally, and then look at the data around that transaction and say, hey, look, I’m going to send the transaction and the picture of the data around that, I’m going to send that to the second node. And if the second node takes it, looks at all the pictures around it and says, Yeah, everything looks the same, just forward commit, and just say it’s done. Like, instead of doing all the transactional steps within each, you know, within that thing, just come back and say, yes, acknowledge it’s going to be, it’s going to be fine. That is awesome, right? Because, you know, it’s the speed of light, you can’t change the photons, but, you know, maybe you could change that package, what’s in there, right. And so, it’s a different way of thinking about things. And that, you know, has made huge, huge gains for us, as we kind of, you know, ratchet down on latencies and continually fight the speed of light here.
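The latency gain being described can be approximated with a small Python model. This is a deliberately simplified sketch, not the actual protocol: it assumes the conventional path replicates the writes and then, as a second sequential step, replicates a commit record, while the parallel path replicates a provisional record at the same time as the writes and acknowledges once everything has landed. The function names and the millisecond figures are hypothetical.

```python
def sequential_commit_ms(write_latencies_ms, commit_record_ms):
    """Writes replicate in parallel with each other, then the commit record
    is replicated as a second, sequential consensus round."""
    return max(write_latencies_ms) + commit_record_ms


def parallel_commit_ms(write_latencies_ms, staging_record_ms):
    """The provisional (staging) record is replicated alongside the writes,
    so the client waits only for the slowest single consensus round."""
    return max(max(write_latencies_ms), staging_record_ms)


if __name__ == "__main__":
    writes = [40, 60]  # hypothetical per-write replication latencies (ms)
    record = 50        # hypothetical latency to replicate the transaction record (ms)
    print("sequential:", sequential_commit_ms(writes, record), "ms")
    print("parallel:  ", parallel_commit_ms(writes, record), "ms")
```

Under these assumptions the commit path drops from roughly two cross-node round trips to one, which is where the latency savings come from.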
Kostas Pardalis 19:02
Yeah, yeah. I think it’s more than obvious that there is some amazing engineering behind CockroachDB. So alright, we’ve talked a little bit about all the amazing stuff that is happening behind the scenes and what the problem is that the Cockroach Database is trying to solve. Who should be using this database? Because, okay, we know that every engineer, I mean, you as an engineer, and me, as an engineer, we know that we always like to play with new toys and exciting technologies, but who should invest in building applications on top of CockroachDB today?
Jim Walker 19:35
You know, I mean, I, you’re asking me: everybody, right? Like, I work here. So you know, well, I mean, look, if we’re going to be wire compatible with Postgres, if it’s basically the same syntax, but you’re gonna get basically all this value of never having to manually shard a database, never having to think about setting up any sort of active/passive resilient system, like, I mean, why wouldn’t you use this for a simple application, right? Like, yeah, sure, you could spin up RDS Postgres and it gets going pretty quick, you know, like, in a single region, it’s pretty cool. But like, literally, like the complexity of actually dealing with some of these kind of day two operations stuff, it’s killer. And so, you know, for us, it’s really any application. Now, that said, I mean, we’ve got some pretty world class customers out there. And, you know, some of them that I can actually talk about, you know, DoorDash is a big customer of ours, Lush, Bose, Comcast, you know, LaunchDarkly, which is a great, you know, dev tool, right, like, so there’s a lot of really great logos. You can go to our website, we have a lot more, but you know, they’re looking at us as kind of the next generation of database. DoorDash is a really good example. Height of the pandemic a year ago, you know, they had a couple of issues with some outages of the database that they had been using, because they had kind of a write bottleneck, right, like, you know, if you’re going to be distributed, then every node can take a read or write, that’s our theory. Well, some other databases have, like, a single write node, but read nodes all over the planet. 
And, you know, that was just causing issues because they had downtime and you know, and then if you fast forward a couple months, where they’re gonna IPO, they can’t have a crisis, they can’t have downtime, like, that’s just gonna have an adverse effect on any sort of, you know, what they were trying to do. And so, you know, midstream, they chose to move to Cockroach and, you know, fast forward a year, and, you know, a lot of their transactional workloads are now either moved or moving to CockroachDB. And, you know, and then they set it up internally as a service for the new developers to use, because you’re right, Kostas, developers do want to play with the latest tech, you know, and we want to take away all of the complexity, and make data easy enough so they can focus on their breakthrough application, and build. And I think that’s what some of these organizations are doing. So, you know, time and time again, we keep seeing companies, you know, set us up as a service internally for all net new workloads. I think we fit net new workloads really well, like, you know, some of the legacy migration stuff–it just wasn’t built for the distributed world, you know, when you start using stored procedures, and you have all these kinds of crazy concepts in there. You kind of have to think differently. And so, you know, we fit really well in these net new workloads. I think anybody who’s thinking about Kubernetes, or deploying anything on Kubernetes, forget about it. There is no other database that was built directly for Kubernetes. I mean, this is a descendant of Spanner, just as Spanner was built kind of for Borg. Right. And so I think that’s where we’re seeing most people kind of turn to us.
Kostas Pardalis 22:26
Yeah, you’re absolutely right. I think I probably didn’t ask the question that well, but the reason that I asked is because, you know, like most people, they have in their mind that when we’re talking about distributed systems, like distributed databases, we are talking about very niche problems out there that only huge companies have to deal with, or it’s about crunching a lot of data and doing analytics. But I think it’s important for all the engineers out there to understand, and for us, as the vendors, to communicate to them, the benefits that they can get by relying on technologies like CockroachDB. Regardless, I mean, of, let’s say, the complexity or the size of the projects that you’re running.
Jim Walker 23:05
And Kostas, it’s such a big piece of this, and so I keep getting into these conversations over the past couple years. And it’s like, shifting to a distributed mindset is not easy. Like it took me a while to figure it out. And I think that’s the thing. And it’s like, it’s not just operations, it’s not just infrastructure, but the developer has to think differently. And I think that’s where we’re, you know, I think we’re going to be in a different world four or five years from now, when this is kind of the de facto way of actually building you know, but we’re in this transition mode. And I think that’s where people you know, think like, if maybe it isn’t for me. Well, it is, and this is the future. This is what’s happening, y’all.
Eric Dodds 23:41
So, question for you, because I totally agree. And I’ll give a quick story here as background, as kind of a lead into the question. I was listening to a podcast with the guy who started Spotify, and he was talking about the early days, and they were trying to replicate the experience of having music downloaded directly on your hard drive. And so his goal was sort of, can I create an experience that is better than, you know, sort of going on to Napster and downloading a bunch of songs directly on your hard drive, and he ran into the 100 millisecond problem, and actually had to introduce more latency in the experience and sort of make it seem like it was taking a little bit longer to give people the impression that it was sort of, you know, being downloaded. But I was thinking about that relative to what you’re saying. And back then, I mean, they started in, gosh, 2005 or 2006. And there were really significant technical limitations relative to what’s available now. But I think we’re moving into a phase where the expectation of the consumer, you know, sort of in a DTC model, and even more and more in B2B, is that there is no latency, right? It’s just everything is instantly available. So, all of that to say the question is, how much are you seeing, when you talk to people who are looking at migrating or starting to think in a distributed mindset, how much of this is that pull coming from the consumer just demanding a better experience because they’re starting to get it with the services they use most often? And so smaller companies are having to replicate that or get as close as they can.
Jim Walker 25:14
Yeah, that’s a really great question, Eric. And so I think of this as kind of three pieces. That consumer experience is a big deal. It’s not a big deal for every application, but when it is, it’s a big deal. And so it really comes down to the workload. I think the other thing that is a big kind of weight here is, you know, we started, we figured out big data for analytics, we never really figured out big data for transactions. And this kind of, like, it’s an accepted concept, but like, we still need it, it’s almost like transactions were ignored. And everybody wants it. And so basically, there’s this big push there. But I gotta tell you, the big reason that people are turning to this and other distributed systems is because the cloud is awesome. And yeah, I get CapEx, OpEx gains, and, you know, it’s everywhere, and I got, like, these great services, but like, for the core kind of concepts of cloud around scale, and resilience, and kind of, you know, exposure everywhere, all across the whole planet, we’re still limited in many ways, you know, for the general purpose, you know, user of the services, right. And for us, it’s like, well, if that infrastructure is changing, for us the equation is like, where does infrastructure end and your application begin? Right? And for me, I always thought that the database is part of the application, right? That’s just the way I was raised. The first thing I ever built was on FoxPro, it was in the database itself, right? You know, like, way back. But that’s just not the case anymore; the database is infrastructure. And so the last piece of infrastructure that has to move towards this distribution, as we accelerate in this world, is the database, man. And I think that’s the piece that, you know, being truly distributed is a key thing. So I think that that consumer experience is a big deal. It’s not a big deal everywhere. 
But I gotta tell you, I think, increasingly, that expectation of instant access, and it’s got a, it’s got to be as good as Instagram or Facebook or whatever I’m using as a sure thing like, my mom, my mom’s happy with that. Right. So we just got to be the same.
Eric Dodds 27:24
Yeah. And why do you think transactions were … you said … it was interesting, you said, it kind of seems like they got ignored. Why do you think that happened?
Jim Walker 27:34
Cuz it’s difficult, Eric. It’s really, like, this is not simple stuff to solve, man, like, look. So I was at Hortonworks. And, you know, the team, again, an amazing group of engineers. I mean, you know, Owen O’Malley was, like, troubleshooting the Mars rover when it was down. I was talking to the guy one day, and I’m like, dude, you fixed the Mars rover? He’s like, No, I just fixed the scheduler. And it’s like, okay, dude, like, you know, like, some of the stuff that was going on. But like, I remember Alan Gates was working on Hive. And you know, we had, you know, how do we provide transactions? There was, like, the Impala thing going on at Cloudera; Hive LLAP was that approach. I think we’ve kind of retrofitted transactions into the NoSQL databases, it just, you can’t take existing concepts … like so why did Google build Spanner when they already had Bigtable? Right, like, well, because it’s a fundamentally different problem. And to solve it, it takes a rework of the entire stack, like, you know, from storage, the way that data gets written to disk, through the transaction model, and being a distributed transaction model, distributed transaction execution engine, all the way up to the way that the language works, and how these things happen. And so it’s a complete rework. And, you know, Stonebraker, who, you know, Michael Stonebraker, if people aren’t familiar with him, he’s kind of one of the godfathers of all databases. I mean, you know, started Postgres. By the way, y’all, if you don’t know who Stonebraker is, Stonebraker says it takes seven to eight years for a database to fully gestate and be, you know, really kind of valuable for large scale kind of operations and whatnot. Well, if it takes seven, eight years to build a database, and you guys know, and I hope your audience knows, building distributed systems is also difficult. So let’s put those two things together. 
You know, and it’s not, you know, if it was easy to build a quarter of a database, everybody would be doing it, but it’s the corner case. It’s the real weird, odd things that happen. And for databases, those are difficult things to solve. And so I think that’s why it’s a really, really difficult problem.
Eric Dodds 29:35
So if I had to summarize that, it was less about the advanced optimization of existing systems and more about rethinking the fundamental architecture of how it actually works.
Jim Walker 29:49
That’s right. Absolutely. Thank you. You just took my three minutes and broke it down into 10 seconds. That’s exactly right, though.
Kostas Pardalis 29:58
Eric, I think you did a very good thing moving the conversation to the side of the end user. But I want to steer the discussion a little bit back to the developers and discuss a little bit more about the experience that the developer has with CockroachDB. So you mentioned that you are Postgres compatible, but what should an engineer expect when interacting with, using, and integrating CockroachDB today?
Jim Walker 30:24
So great question. So, you know, we’re wire compatible with Postgres, right? We’re gonna speak SQL syntax. So if people understand SQL, they’re gonna get us. If they’re using ORMs, you know, we’ve built out, you know, a lot of ORM integration. So if people are doing that sort of stuff, first of all, so, number one, it’s pretty similar and familiar to that experience. But there’s concepts that are different when you’re dealing with distributed data, and kind of distributed systems, you know, if you’re going to have a serializable isolation database, well, you know, as a developer, by the way, I never thought about isolation levels. I was like, whatever, in fact, it was all new to me. You know, like, again, I was a hack, like, try … catch blocks, you guys want what? What do you want me to do? Like, just let me deal with logic. And so you know, some transactions are going to conflict. And you know, implementing best practices around try … catch is a big deal. Right. So that’s one thing. I think another big thing is actually when you start to think about how data is stored in Cockroach, you know, we get a lot of conversations with customers about unique IDs. And a lot of times you will see tables that just increment values for unique IDs. And that’s actually an anti-pattern for distributed systems, because we’re using that unique ID to actually, you know, to distribute the data across the cluster. So you don’t want a hotspot, you don’t want all records in one range, right? And so, there’s like, another layer down, which it’s a little bit deeper, but you know, using UUIDs to do that is actually a big deal. But I think one of the things that’s most important for the developer, once you start thinking about distributed data, is how you construct your transactions. And Shawn at DoorDash, I was on a webinar with him a couple weeks ago, it was a really great example.
It’s like really crystal clear, you know, if you’re going to insert 10,000 records into a table, right, like, okay, yeah, Postgres insert, here’s the records, it’s optimized, man, it’s gonna fly through that, right? It’s gonna just append that data, update the indexes, you’re good, right? Well, in a distributed system, you don’t really want to do that, because you’re going to overload kind of one node, right? Like you just basically overload it and it’s trying to communicate with all the other nodes. Wouldn’t you want to execute that as, say, 10 transactions, each of them with 1000 inserts? Right? So you get the parallelism of basically, you know, multiple endpoints all working on this, right? Because any endpoint in Cockroach can, you know, receive and process reads and writes. Right. And so a little bit of this comes back to what we were talking about before this, like, distributed systems require a different mindset. And I think that’s the stuff that is interesting to me, and fascinating to see in the developer community, how, you know, people are starting to come around to that, like, it’s a different way of thinking when you code and interact with things on the back end, you gotta start thinking about location and that sort of stuff if that makes sense.
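The three practices Jim describes here (retrying transactions that conflict, using UUIDs rather than incrementing IDs, and splitting a 10,000-row insert into 10 transactions of 1,000 rows) can be sketched roughly as follows. This is a minimal illustration, not CockroachDB’s client code: `SerializationFailure`, `run_with_retry`, `make_rows`, `chunked`, and `insert_in_batches` are hypothetical names, and `insert_batch` is an assumed callback standing in for whatever your driver uses to run one transaction. In a real client the error to catch would be the driver’s exception carrying SQLSTATE 40001.

```python
import random
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error with SQLSTATE 40001."""


def run_with_retry(txn_fn, max_attempts=5, base_delay=0.01):
    """Run txn_fn, retrying with jittered exponential backoff on conflicts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the conflict to the caller
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())


def make_rows(count):
    """Build rows keyed by random UUIDs instead of an incrementing counter,
    so consecutive inserts do not all land in the same key range."""
    return [(str(uuid.uuid4()), f"item-{i}") for i in range(count)]


def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_in_batches(rows, insert_batch, batch_size=1000, workers=4):
    """Send each batch as its own retried transaction, several at a time.

    `insert_batch` is an assumed callback that writes one batch in a single
    transaction and returns the number of rows written.
    """
    batches = list(chunked(rows, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = pool.map(
            lambda batch: run_with_retry(lambda: insert_batch(batch)),
            batches,
        )
        return sum(counts)
```

With 10,000 rows and the default `batch_size`, `insert_in_batches` issues 10 transactions of 1,000 rows each, and a conflict in one batch only retries that batch rather than the whole load.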
Kostas Pardalis 33:05
Yeah, absolutely. And I really love that you keep talking about this change in mindset, because I truly believe it’s a very important thing that’s happening, and we should continuously learn about it. And based on my experience, and my exposure to distributed systems, one of the biggest revelations that I got from distributed systems is about designing systems with in mind that things will go wrong. Things going wrong is not the exception, right? It will happen, right? And that’s a big part of trying to build distributed systems: what are all these edge cases? What are the limits? And what can we do to secure our data, our transactions, and the behavior of our systems when we are dealing with all these problems? And I think that one of the bad things that happened because of the introduction of the cloud, and unfortunately, this is also partly because of marketing, is that the cloud was also evangelized as a solution that takes all the hard problems away, right? Like, I can have my servers there, I don’t have to worry if the hard drive dies, right, my file system and servers will be still running. But again, this is not the case. And actually, I think that there are many failures that are happening on cloud. And as you’re dealing a lot with resilience, and you also see large scale deployments from your customers, how often do you think this is happening, and how much of a problem is it for an engineer to keep in mind that failures will happen? It doesn’t matter if we are on Google Compute Cloud, or like on AWS, we have to build all the components of our systems around the concept that something might go wrong, and we have to be ready for that.
Jim Walker 34:43
So first of all Kostas, you’re going to go and attack the marketing guy? Really, you’re just going to blame it on marketing, buddy? Come on.
Eric Dodds 34:53
I actually went the reverse way. I came from marketing and we always blame marketing, Jim.
Jim Walker 35:01
I know. Look, some marketing organizations do, you know, they’re gonna go a little too far. And I think there’s definitely some of that, Kostas. I know exactly what you’re talking about, like they’re delivering or they’re selling a promise with something that’s just not a reality, or if it is, it’s really difficult to attain, right? And you’re right, like this distributed thinking and distributed mindset requires you to basically build for resilience. That’s the concept, like, that’s the thing, like, there’s no such thing as disaster recovery, because disaster should have no impact, right? Like, and so how do you design for that? This is what I mean by you have to re-architect, you have to rethink everything that we ever thought about before as kind of architects of systems, you architect for resilience in the system itself. And I think this is, again, one of these kind of core concepts that came out of the Google team over the past, you know, 15-20 years. And I think, you know, there’s a lot of research, there’s a lot of technologies, a lot of software engineering in this, you know, I mean, understanding Raft and Paxos, understanding things like MVCC, you know, some of the core kind of concepts that are out there, I think are, you know, part and parcel of how to do this. Luckily enough a lot of this stuff is open source, which is just awesome, right? Like and so you want to go get a PhD in how to actually implement Raft, go check out etcd Raft, right? Like, go check out the implementation. There’s some amazing people working on that, including parts of our team. But I think there’s a lot of examples for people to actually go out and figure out how to do that. Because you know what, you’re right Kostas, everything fails, everything fails. And if you don’t understand your own, you know, your own mortality. Everything fails, y’all.
So you know, and regions do go out. And you know what, backhoes hit cables every day, and Google has failures of regions go out and Gmail goes down. These things happen. It’s about basically, dealing with it. Like the concept of an SRE is brilliant, you know, talking about RPO and RTO. And understanding what those things are. As a developer, I found it to be extremely important to get because I think you’ll start looking at it in a different way.
Kostas Pardalis 37:14
Yeah, absolutely. I totally agree. And, as you mentioned SREs, what should an SRE expect when working with CockroachDB?
Jim Walker 37:23
I guess they could sit around and eat bonbons and let the thing run all day and work on other things. You know, it’s funny, like, you know, I typically think of SREs a lot of times in the context of Kubernetes. Right, I’ve been kind of in the Kubernetes community for a while, and I just love being a part of it. And you know, from that point of view, well, you got a database that’s already fit for this kind of world, you know, it’s aligned with their objectives, right, it is built for easy scale, you know, spin up a node, point it at the cluster, great, all the data balances, you don’t have to really deal with those sorts of things. And we can do rolling upgrades, you know, there’s online schema changes, right, the whole nature of a distributed system allows you to do some really cool things. So it’s kind of a low touch database for the SRE, in many different ways. But it’s aligned with the way that they’re moving forward with adoption of orchestration systems. You know, this is something that was built for Kubernetes or Nomad, for that matter. And so you know, for us, our conversations with the SRE typically run that way. Now we employ, oh, gosh, I know we have a small little army of SREs who are managing and dealing with Cockroach Cloud right now, our own managed service. And, you know, they’ll tell you, I’m sure they’ll laugh at this part of the conversation. It’s like, What are you talking about, man? It’s, uh, yeah, there’s a lot of work we do, they do. And it’s built to be automated. And I think that’s the whole key there. Right, that’s what that concept is all about.
Kostas Pardalis 38:42
Yeah, and the last thing that an engineering team needs is to have the SREs unhappy. So I think it’s quite important to keep them happy.
Jim Walker 38:49
It’s kind of like … remember when there was like, unhappy DBA … Oh, I’m sorry, they were just unhappy all the time, mostly.
Kostas Pardalis 38:54
Yeah I know what you’re talking about. Alright. We’re getting close to the end of this amazing and very exciting conversation that we have. I have two more questions. One is about open source and databases. And I think anyone who has worked with databases, especially in the past couple of years, we see that pretty much having an open source version of the database is mandatory. It’s up there. How important is open source for building a database system in your experience?
Jim Walker 39:26
Yeah, I mean, it’s a great question. Let me ask you a question back Kostas, it’s like, what’s important for you with something being open source?
Kostas Pardalis 39:39
That’s a very good question. And I think it has many dimensions, but I would say that one of the most important things around open source is support, like I feel that like the project is alive and there are people there to take care of it, especially as the project is complex, right? So for me as someone who would try to architect or engineer a system, that’s important, especially for the backbone of my system, which is databases, right?
Jim Walker 40:08
That’s right. And I agree, it’s community, right. And it’s about building people who are all kind of like, into it, and using this and seeing the code base, you know, move on, right. And so I always think about the, there’s like, there’s code, and then there’s like, the community side of things. And so code’s got to be open source, right? Community’s got to be there. The funny thing is, when, you know, I think we get into these weird conversations about the commercialization of open source, and we confuse the business model with the open source project. Because ultimately, like, look, man, I’ve been in open source for a long time. And the beauty of open source to me is consumption, like it’s free, I could go use it. And I have all this community of people to support, right? And like, that’s that, you know, and so, the problem is, over the past three years, what’s changed is consumption, like, how do you consume software today, y’all? Like you go and spin up a service in some public cloud provider. And so the consumption factor has changed. Now, what we’ve done is we’ve taken free beer away from open source, basically, that’s it, we always talk about free beer or free puppy, right, in the context of open source. And so how do we get that back? You know, how do we get it so that everybody can use that tool, but still consume it as a service? And we’re hell-bent on making sure that we do that. We build up a community of people that are around us, you know, we changed our license about a year and a half ago, two years ago, to the BSL, the Business Source License. And basically what it says is, look, what we want to protect ourselves from is a large cloud, a public cloud provider, taking our codebase and going and making a bunch of money off it, right, it’s like Elastic, okay, and Mongo the same, like, right down the board.
We’ve all changed our licenses, because database technology is a little bit different than other open source technologies. It’s complex. Remember, I talked about this seven, eight years to get it to a point where it’s even kind of like, you know, I mean, you guys Postgres has been around since ’96 it was official, I mean, it really started like, ’88, something like that, like, these things they aren’t simple. And so you know, we want to build a business, we want to build something that’s going to be there in the right database for all consumers, not just every single developer, but every large enterprise too. And for us, you know, it’s a balance, it’s a delicate balance of doing those things, and doing it the right way. And I think if you build a good, honest, humble, and and kind of, you know, open community, and and, and are authentic, you can do that. And that’s what we’re about. So I always think of open source as code community and consumption. And then there’s this weird thing about license that everybody gets wrapped around the axle on.
Kostas Pardalis 42:51
I think that’s one of the best descriptions I have heard about open source and how it interacts with a business. Cool. So Jim, last question from me, and then I’ll hand over to Eric: what’s next for CockroachDB? What’s the product roadmap that you have? And what exciting new things are you going to deliver in the next couple of months?
Jim Walker 43:12
Yeah, you know, the ultimate vision of this company is, you know, it’s funny when I first got here it’s like, make data easy. Well, we actually do want to make data easy, and I kind of really wrestled with it when I first got to Cockroach and I first met Spencer, about four years ago, actually. But we’re doing things that take these complex concepts and make it really simple, like deploying a database across multiple regions. Like, you know, where does data get located? You don’t need a PhD, like, okay, you can do this in say, Cassandra, but you need a PhD. How do you make it so dead simple, it’s declarative, right? Like, it’s simple SQL statements, and the database just takes care of this complexity. You know, we spend a lot of time doing those things, taking the complex distributed concepts and making them simple. But we also understand that consumption is a big deal. And so you know, our ultimate vision is that CockroachDB is a SQL API in the cloud, you know, I want to make data available to every single developer on the planet, no matter where they deploy their application. And I want them to just communicate via SQL, via you know, some REST interface or whatever it is, into the cloud and let us deal with scale, let us deal with resilience, you know, let us deal with locating data so that you’re going to be guaranteed say, you know, so I don’t want to put a number out there, but it’s sub 50 millisecond, you know, access to data, no matter where your users are on the planet, right?
Jim Walker 44:35
And so, you know, for us, it’s how you do that, and well, you deliver it through, you know, the whole kind of move towards serverless. So, how do you build this truly serverless database, you know, make it multi tenant, you know, be able to spin up and spin down dormant clusters, so we don’t get killed on cost, right, like make it consumption based, you know, all the security controls that have to be in place. And so, you know, for Cockroach, we’re pushing really far at that, you know, we’ve launched a beta version of this, Cockroach Cloud Free. The free beta is available on our website, people can start to play with it, you know, it’s limited to about five gig of storage and you know, it’s a single region for now. But, you know, it’s where we’re building and focusing a lot of our future because we really do believe consumption is via, you know, a, the cloud, b, more importantly, I think people just want an API, they just want all that complexity to melt into the background. I don’t want to have to think about scale. Just give me a bill. Right, like, and truly delivering on that promise, I think that’s where we’re headed. And I’ve never been more excited about a company because I think this vision is right. And we’ll see how it plays out over the next couple of years.
Kostas Pardalis 45:41
Yeah, absolutely. And we will be watching closely. Alright, so, Eric, it’s your turn.
Eric Dodds 45:48
The burning question that I mentioned at the beginning. And this is just more out of curiosity, we love to, we love to give our audience a little bit of insight just into the people behind you know, these companies and technologies and stuff. And I’m interested to know, coming from a programming background and now working in marketing, what principles have you brought from programming into marketing? In your role? And how has that background helped you frame the way you think about it?
Jim Walker 46:18
Oh, my God, my team’s gonna laugh: structure. Structure and framework. Always give people a framework to consume, break it down into three things, give it structure, because otherwise, you’re just all over the place. And so I’ll, I’ll often start questions like this and say, well, there’s three reasons, Eric, one, two, and I don’t even know what the third one is, by the, you know, while I’m talking about the second one, so I just I, you know, just having frameworks for people to actually understand things is just really critical. And I think in all aspects of our life, I mean, you know, if I’m going to write a paper, well, there’s a heavy outline done before we do that, so that we’re all in agreement, you know, we’re 30% done, right? That way, we’re all directionally correct. And having those concepts definitely apply in marketing, for sure. Because, I mean, ultimately, you know, what’s your, you know, God, we used to do PRDs. And, you know, you know, these deeper technical kinds of concepts, conceptual diagrams. So we were all aligned, right. And so that, that, that core concept has been fundamental, a game changer for me in my career in product marketing, for sure.
Eric Dodds 47:23
Very cool. And one last question. And this is, you know, we don’t know a ton about, we didn’t discuss a lot about how the, how the org is structured, at Cockroach, but one thing that I think would be helpful for our listeners, especially with your unique background, is a lot of our listeners, you know, are engineers working with data in some way or in some engineering capacity. And a lot of them interact with marketing teams in various ways. And those relationships are all over the place, we’ve had interesting, just interesting discussions with people about how they interact with marketing. And would just love your thoughts on what does a really good relationship there look like, in terms of sort of the people working with data from an engineering standpoint within a company, and then how the relationship with marketing works?
Jim Walker 48:14
I’m very fortunate to work in a company that works the way that Cockroach Labs works, you know, we there’s, there’s a level of respect across all the functions in this company that I find to be truly unique. In all the places I’ve been, and I use the term respect, because it’s actually pretty important. I don’t think a lot of organizations actually understand what’s beneath the iceberg when it comes to marketing and how complex it is, and how difficult it is. People think it’s a website, why are you writing it like that? Why are you doing this? Like, there’s so much that goes into it? We work as hard as anybody else in the organization. And honestly, I think, don’t get upset at marketing because they’re doing something wrong, help them get it right.
Jim Walker 49:01
I use this word with our sales and marketing teams all the time, I use the word authentic, all the time. Authenticity, you got to be authentic, like, don’t go into a company and not understand what an AZ is, you know, you don’t say it’s Arizona, it’s an availability zone. Do you know what that means? You know, you don’t have to physically look at data centers. Help them understand what it is. Make sure they get these concepts right. Because the more authentic marketers can be, the better off everybody’s going to be, the better off we’re going to be able to translate and sell what you’re trying to do. So don’t be against them. Help them. I think that is one of those things, and I think that’s where the best relationships we have across our marketing team are. I mean, look, we’re selling to developers, you know, I love talking to my development team. Software engineers sometimes are a little bit, you know, out there, but you know, we learn a lot from them too, and so that respectful communication back and forth and, you know, having the patience to, look man, you know, you may think something’s wrong in the website. There’s a whole lot of other stuff going on y’all, right? But if something’s wrong, call it out too. And do it respectfully. I think that’s the thing. It’s not a simple job.
Eric Dodds 50:09
Love it. And, and I now feel fully guilty for earlier advocating the idea that we can blame everything on marketing.
Jim Walker 50:17
It’s okay, but you know what, call them out when they’re wrong too, like, Hey, guys, but this is why you’re wrong. Like, don’t just say you’re wrong; say, this is why you’re wrong. And this is the effect that it’s gonna have. Right? Give it a reason, right? And so call it out. Gosh, by all means, you know, we’re here to make it better. We are also after a goal too, right. And that’s critical.
Eric Dodds 50:37
Love it. The concept of respect in that relationship, and really all relationships is huge. And I think that’ll be really helpful for our listeners. Jim, it’s been a really wonderful show, is CockroachLabs.com the best place to check out all things Cockroach?
Jim Walker 50:53
Yeah, absolutely. You know, the free tier … of course we’re hiring, we’re always looking, we’re growing like crazy right now. Y’all like this is just a lot of fun. So yeah, everything’s there at CockroachLabs.com.
Eric Dodds 51:03
Very cool. Well, we’ll check back in with you in another six months or so and have you back on the show. And thank you again for your time and insights.
Jim Walker 51:11
Well, thanks for having me, guys. I really appreciate it.
Eric Dodds 51:13
Wow, another super interesting show. I think my big takeaway was hearing Jim talk about the sort of lagging migration of transactions to the distributed architecture. And that was just really interesting to hear about how difficult that problem is, and how sort of optimizing existing systems wasn’t going to work in order to deliver the experience that, you know, ultimately people are demanding. And I just thought that was a really thought provoking answer to that question. How about you Kostas?
Kostas Pardalis 51:50
Yeah, absolutely. Eric, like, at the end, distributed systems are hard. They’re hard to build. And most importantly, it’s hard to reason about them. Like at the end, you can end up in situations where you’re traveling in time, right, it can be completely mind bending. So there is a reason that it took a while to see all these technologies becoming more and more approachable out there. I think, though, that probably the most important outcome from our conversation with Jim today was about the need for engineers to change their perception and start thinking more in terms of distributed systems and computing. And that this is something that’s going to become more and more important in the future. Not necessarily in the sense that everyone has to understand how Raft or Paxos works, but more about understanding the differences and the challenges and also the advantages of using distributed systems and how these affect your product, your architecture, and the overall way of thinking in engineering terms. I think that’s super important. And it’s what makes marketing in this company really important. And I think talking with Jim today is a testament that marketing can really be an educational tool to help all these engineers out there figure out the right things to understand and the right concepts from distributed systems to use in their everyday work.
Eric Dodds 53:24
Yeah, I agree. It was interesting. I almost asked the question that we asked a lot of our guests, which is, what are some other ways that people are solving this problem today? And the more I thought about that question, I ended up not asking it because we were talking about a shift in the way that you think about architecting a system and so I just appreciated his perspective on the mindset shift that’s required. Well, thank you again for joining us on The Data Stack Show. Be sure to hit subscribe on your favorite podcast provider, so you can get notified of new shows every week. And we have a great lineup in the next couple of weeks. You’ll want to be sure to grab those episodes and until next time, we’ll catch you later.
Eric Dodds 54:08
The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution, learn more at RudderStack.com.