This week on The Data Stack Show, Eric and Kostas chatted with Joran Greef, Founder & CEO of TigerBeetle. During the episode, Joran discusses his journey from accounting to coding, why double-entry accounting is important for databases, safety issues in financial software, the need for low latency and high throughput, and more.
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 00:03
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com. Welcome to The Data Stack Show. Today we are talking with Joran from TigerBeetle, which is an absolutely fascinating technology. Kostas, I have so many questions. It’s a database that is built for extremely fast double-entry accounting, which we’ve been talking about a lot recently. And so to have such a specific use case is super exciting. And this sounds so simple, but I want a refresher on double-entry accounting, because it’s been a long time since I’ve taken accounting in school. And I think revisiting the fundamental principles will help me, and hopefully our listeners, understand why TigerBeetle needed to be built to solve specific problems around that, as opposed to using, you know, any number of existing database technologies. So that’s what I want to ask.
Kostas Pardalis 01:27
Yeah. Again, first of all, I’m very excited that we have this conversation with Joran about TigerBeetle, a database system that has managed to create a lot of interesting noise lately, not only because of the problem and the use case that they are going after, which we are going to talk about, but also because of the very unique approach that they have to technology and how obsessed they are with things like safety and performance. And people will hear from Joran, and they will see, a very unique perspective on approaching problems from a very engineering perspective: talking about type systems, for example, when describing a double-entry system. So it’s going to be a very interesting conversation, and we’re going to talk about a lot of things, so people should definitely check this one out.
Eric Dodds 02:49
Yeah. Are you going to have enough time to ask all your questions about distributed systems, though?
Kostas Pardalis 02:55
Well, definitely. Hopefully, we will have him back.
Eric Dodds 03:00
Okay, well, let’s stop wasting time. Joran, welcome to The Data Stack Show. We’re so excited to chat.
Joran Greef 03:06
Oh, thanks, Eric, Kostas. Such a huge privilege to be with you.
Eric Dodds 03:12
So excited. Well, give us your background. You actually didn’t start with data, which is really interesting. So take us way back to the beginning. Well, I guess it was data in a form, but...
Joran Greef 03:21
Yeah, so I started with double-entry data on paper: general ledgers, T-accounts. You weren’t allowed to use a pencil; it had to be pen, it had to be ink. And I remember writing my university exams, with my major in accounting, financial accounting. And I always wanted to get into startups and get into business. And I understood from people that, you know, accounting was a great way to see the world, to travel the world of business and see all kinds of industries and sectors. It’s the way that any kind of business can be represented. It’s the schema for business. So I got excited about accounting. And yeah, so I’m kind of old-school data, if the computer is off.
Eric Dodds 04:14
Yeah, I love it. The schema for business. What a wonderful concept, you know, for the financial component. Okay. So from accounting, what was your entry into startups? And especially software, right, because, I mean, obviously, you’re building some unbelievable software. So, how did you make the jump?
Joran Greef 04:37
So, yeah, I’ve always been coding.
Since I was a boy, you know, I went to the library to pick up books on BASIC, and then kind of shelved that for a little while, and around university days I got back into coding as well. And what I loved about coding was just that it’s like you’re doing engineering, but you could build things. It’s like the movie Inception: you can build these incredible creations and you don’t have to pay for raw materials. So if you don’t have a lot of money, you can build something incredible. There is no limit to what you can build. My father is an architect in the real world, and I kind of saw that with software you could be an architect in the invisible world. And I’ve always loved, you know, things that are invisible: music, stories, coding. Yeah, so I was doing accounting, always wanting to do a startup, and coding kept pulling me back. And yeah, so I guess in my final year at university I had two majors: it was accounting, and coding on the side.
Eric Dodds 05:55
Very cool. And let’s start talking about TigerBeetle a little bit. First of all, tell us what TigerBeetle is. And then rewind to the beginning and tell us where it came from, because it’s, you know, sort of a very specific tool in many ways.
Joran Greef 06:13
Eric Dodds 10:33
Yeah, absolutely. One question on the speed side of things. I’m just interested, you know, in terms of your motivation to build things that run really fast, right, or write code that runs as fast as possible: how much of that was just your own personal desire to see how fast you could make it, versus, you know, tackling problems that required speed?
Joran Greef 11:07
I think it was both. Because, you know, I started to realize, when I was coding in Ruby, that if you had something that took a second, that really impacted people. You know, they had to wait. I also had an interesting learning experience at the time because I was in Cape Town, and maybe 10 years ago latencies here were not great. So if you could really think about speed, you could have a much better user experience, you know, if you were writing a single-page web app and that had to run in the Winelands somewhere, where they were effectively still on dial-up. So you had that firsthand experience that nobody wants software to go slower; they want it to go faster and faster. And you can actually do things that make a real difference to people. But I guess I also just like to go fast.
Eric Dodds 12:02
Yeah, yeah, I love it. I mean, it’s always fascinating to me, you know, where discoveries are made. Because many times discoveries are made in response to a problem, and it’s a novel response to a problem. But there is also achieving something for the sake of achieving it, because you enjoy it. I think it’s really interesting when discoveries are made that way as well. Well, I would love to dig into the technical details of TigerBeetle, and I’m gonna let Kostas do that. And I think for our listeners, one thing that would be really helpful is, you know, maybe just a quick summary of what double-entry accounting is, and why that is important for a database. Just for our listeners who, like me, maybe took an accounting class a really long time ago. I think understanding the basic nature of double-entry accounting will give us all a really good foundation for understanding some of the technical details and decisions.
Joran Greef 13:05
Okay, sure. So I’ll do my best, if I remember my lessons correctly. You know, I don’t know what I don’t know. But so, double-entry accounting: I think maybe this would be the developer’s guide to double-entry accounting.
Eric Dodds 13:22
That’s exactly what we want.
Joran Greef 13:25
And I would say, maybe the way to think of it is like Newton’s third law: for every action, there’s an equal and opposite reaction. Or the laws of thermodynamics: energy cannot be created or destroyed, but merely changes from one form of energy to another. I hope I got those laws right. But basically, that’s what double entry is. Money cannot be created or destroyed; it merely moves around from one party to another. That’s very important, because if money can be created or destroyed, well, there’s a problem, because, you know, money is being lost. And that’s illegal. So if you look at the system, money is moving between entities. It shouldn’t just disappear or fall through the cracks. And then, yeah, every action has an equal and opposite reaction. So if I give you $100, you know, I should lose that, and you should receive it. It needs to go somewhere. And there should be a paper trail, an audit log, that it did or didn’t. So basically, double entry is a way of counting, and not only cents, because you can use it to count anything. It doesn’t have to be money. But what’s interesting is, you know, as developers we might take a 64-bit integer counter and just count: you’ve got a counter and I’ve got a counter. This isn’t double entry, by the way, but this is maybe how we might do it as developers. So if I send you money, your counter goes up, and my counter goes down. That isn’t double entry, because you don’t really have an audit log of why your counter went up and why mine went down. And if you make mistakes, it’s possible that someone might make your counter go up twice, but mine only goes down once, or three times. And then there’s an error in the system. And how do you detect that money has been counted twice? You’ve lost this principle of an equal but opposite action. So really, I think double entry is kind of like a vector.
It’s not just a count, not just speed. It’s a vector: money is moving, but it’s also moving from somewhere, to somewhere. So it’s that relation: this is an entity that has received money, which in accounting terms you would say has been debited or credited, and this is an entity that has transferred it away, which could also be a debit or a credit. And while we’re at it, this is something developers often want to simplify in accounting: hey, well, let’s just use a negative balance or a positive balance, why do we need debits and credits? And the answer is that double entry is almost like a type system. So you have different types of accounts. An account would be a person or an entity: you, Eric, and I will each have an account. So I can debit my account for x units and credit yours. Now you’ve got equal and opposite, you’ve got double entry: an entry in one account, an entry in another. You’ve got that vector. But I think the other thing that we need to understand with double entry is that it’s a type system. So you get different types of accounts, because in the real world you’ve got different types of transactions, different types of money. So for example, you can have a loan or a debt, and that is called a liability. In accounting, it’s a liability account; it’s a type of account. You can also have an asset, which is like cash or a bank account, or someone owes you money, that’s an asset. Or it could be a Tesla that you have on your books, or it could be something intangible like a brand. Those are all different kinds of assets, which is another type of account, if we see accounting as a type system. So you’ve got a liability, you’ve got an asset, and those are sort of mirror images of each other. You know, if the bank owes me, then I have an asset, and the bank has a liability.
And then another type of account is equity, which is, you know, if you own a piece of a business. And that equity balances those two out: you’ve got assets minus liabilities, and that’s equity, basically what the business is worth. But then you have to ask, well, how do assets increase? How do liabilities increase? And the answer is, well, there’s income and there’s expense. Those are also two different types of accounts. So now we’ve got five: income, expense, assets, liabilities, and equity. And here is the way it works with this type system (I’m coming in to land). It’s just the way it is, right? If you have an asset account, a positive balance will be a debit balance. So think of it as a T-account. An account has two counters that only increase. Basically, each side of the T-account is an append-only log; it’s immutable. You never change anything, because that would violate the law of, you know, money being conserved. But you can always add transactions to either side, the debit or the credit side of an account. And if you sum up the debit column, and you sum up the credits, and subtract them, you’re going to end up with a net debit balance or a net credit balance. And an asset account, as part of the type system, is always increasing with a debit balance. Liabilities increase with a credit balance. You can then work out what equity accounts do. And then income accounts and expense accounts are kind of similar: expense accounts increase with a debit, and income accounts with a credit. And then that’s the type system. So that’s why it’s kind of bad practice to just use negative numbers, because in accounting you never subtract, you always add. That way you don’t lose information. So yeah, I don’t know if that helps anybody out there.
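For readers who think in code, the account "type system" described above could be sketched roughly as follows. This is a minimal illustration in Python, not TigerBeetle's implementation; the names `AccountType`, `TAccount`, `DEBIT_NORMAL`, and `post` are hypothetical and chosen only for this example. Each side of a T-account is an append-only list, and the account type decides whether a net debit or a net credit counts as a positive balance.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccountType(Enum):
    ASSET = "asset"
    LIABILITY = "liability"
    EQUITY = "equity"
    INCOME = "income"
    EXPENSE = "expense"


# Hypothetical helper set: account types whose balance grows on the debit side.
# Liabilities, equity, and income grow on the credit side instead.
DEBIT_NORMAL = {AccountType.ASSET, AccountType.EXPENSE}


@dataclass
class TAccount:
    name: str
    kind: AccountType
    debits: list = field(default_factory=list)   # append-only debit column
    credits: list = field(default_factory=list)  # append-only credit column

    def balance(self) -> int:
        # Sum each column and subtract; the account type picks the sign.
        net_debit = sum(self.debits) - sum(self.credits)
        return net_debit if self.kind in DEBIT_NORMAL else -net_debit


def post(debit_account: TAccount, credit_account: TAccount, amount: int) -> None:
    """Record one journal entry: an equal debit and credit, never a subtraction."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    debit_account.debits.append(amount)
    credit_account.credits.append(amount)


cash = TAccount("cash", AccountType.ASSET)
loan = TAccount("bank loan", AccountType.LIABILITY)
post(cash, loan, 100)  # borrow 100: the asset is debited, the liability credited
post(loan, cash, 30)   # repay 30: a new entry is added, nothing is erased
print(cash.balance(), loan.balance())
```

Note that the repayment does not overwrite the original entry; both entries stay in the log, which is the "always add, never subtract" point made above.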
Eric Dodds 20:35
That was an incredible double-entry accounting 101 for developers. I have to say, Joran, that was amazing. I actually wish I could go back and take my accounting class at university. Okay, I lied about how many more questions I had. So one more question, and this will be a great lead-in to the technical stuff. That was very helpful, I think, in building a foundation for why we would need specific functionality in a database, or a system that has those types of requirements and, you know, additional related requirements. Why does it need to be fast?
Joran Greef 21:13
Yeah, so that’s a great question. I mean, it’s not just Top Gun: Maverick and “I feel the need for speed” on the T-shirt.
Kostas Pardalis 21:23
Although it’s fun, yeah. But
Joran Greef 21:26
I think it needs to be fast because we kind of want to get back to this world where, well, I think what’s happened is hardware got so fast, and software thought, well, we’ll be okay, hardware is getting fast. Now we’re in a world where software is really, really slow. Hardware is so fast and software is so slow that we’ve kind of gone backwards, you know, I don’t know, like the moon landing. And we think, well, we used to be able to do this stuff, why can’t we anymore? Why is everything so slow? You know, as engineers, I guess I just feel like, well, if this is one corner of the world where we work, which is a database to track financial transactions, well, at least it should be fast, because we don’t want people to wait. But more concretely, the reason why it should be fast is that there’s a big problem out there: the world is becoming more transactional. There are more financial transactions, or business events, and one business event leads to many double-entry journal entries. So you can have a business that only does 10 ride-sharing business events per second, and that can translate into maybe 100, or maybe 1,000, double-entry journal entries a second. And, you know, the Visa network will do a few thousand business events, but if you work out the journal entries for that, and all the fees, and all the partners and everybody else, it adds up. But that’s the old world of fintech. In the new world, on the internet, the scale of things is just increasing; the world’s becoming more transactional. We saw it in the cloud. You know, at first you used to buy servers, then you paid for them monthly, then you paid for them hourly, then you paid per minute, now you pay per second, now you pay per function, per millisecond. It’s becoming more and more transactional.
So, there’s a real problem in that the existing databases that people use to track this can’t keep up. We’re reaching a point where there’s just too much row-lock contention. You know, every time you have to debit one account and credit another, often it’s a small number of accounts, like your fee account, that becomes a hot account, and then row locks serialize everything, and it’s a real problem. Everybody we chat to in fintech has this problem. So if you can go fast, you can actually just solve that problem for people, you know, the Black Friday issue. And also, it means that, you know, they might be paying a few thousand dollars a month for cloud hardware, and if you get fast, you can cut that. Obviously, that’s not always important. But for some people, it’s very important, because that translates into cheaper processing fees for payments. So there’s work going on in some parts of the world to lift, you know, billions of people above the poverty line. And one way to do that is just to give them access to fintech that has cheaper payment processing fees, and that is where a database like TigerBeetle comes in. That was actually where TigerBeetle came out of: we were analyzing an open-source payments switch that had these problems, and they needed more cost efficiency. But I guess the final reason is that, you know, for example, in India, if I understand correctly, there’s a famous payments switch that does, I think, on the order of 10,000 transactions a second. And as I understand it, the only way to keep that system running is, I think, Redis, in memory, so account balances are all in volatile memory. And that’s fine, because they’ve designed a system where, if they lose the Redis node, they can, you know, restore the data from banking partners. But that’s where you go and say, well, what if we could do 100 times more performance?
Because if you can do that, well, maybe we can start to use stable storage and disks, and then they don’t have that problem anymore. So your extra performance buys you a better operator experience. It’s just nice. You know, you don’t have to worry about Black Friday. You can trade performance for safety and that operator experience.
Kostas Pardalis 26:07
I have a question about performance. And it’s a question that I get many times from engineers, actually, when I have a conversation with them about performance and I say something like, oh, I tried this database and it was really fast, right? And then they’re like, oh, are you talking about throughput or latency? So my question is this. I see the numbers that TigerBeetle has put out, and they are amazing, right? They are huge numbers. But what is more important in these financial transaction systems? Is it latency that matters the most, or is it throughput that matters the most? And is there a trade-off between them, or can you have both?
Joran Greef 26:59
So I think that’s what makes a double-entry or ledger database so hard, because both are important. You need low latency and you need high throughput. You need high throughput because the world is becoming more transactional; you know, there’s just more and more volume, and that unlocks the use cases. But you also need low latency, because often these databases are tracking business-critical events. And remember, again, for one business event there might be 10 to 20 journal entries. So if one journal entry takes 100 milliseconds, because of contention and row locks, now that business event is taking on the order of one second or two seconds. And then that’s a problem, because now, you know, ordering a cab is going to take, I mean, one second. It shouldn’t. You know, we need to get back; it needs to be fast. So there’s real business pressure on latency as well. There’s business pressure on throughput and on latency. Also, again, in fintech, for example, they often deal with these nightly batches that arrive. If you don’t have enough ingest throughput, then sometimes, when the data gets delivered at midnight, in the morning, when they’re supposed to be open for trading, they still haven’t finished the nightly import. And that’s why you really want fast ingestion, because it can save you like that. But you’ve kind of got nowhere to hide: you need low latency and high throughput.
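The arithmetic behind those figures can be written out as a small Python sketch. The function names are hypothetical, and the latency calculation assumes the journal entries for one business event are executed one after another (for example, because they contend on the same hot account), which is a simplifying assumption for illustration rather than a measurement of any system.

```python
def journal_entries_per_second(business_events_per_second: int, entries_per_event: int) -> int:
    # Fan-out: each business event produces several double-entry journal entries.
    return business_events_per_second * entries_per_event


def business_event_latency_ms(entries_per_event: int, latency_per_entry_ms: float) -> float:
    # Assumption: entries are serialized, so their latencies add up.
    return entries_per_event * latency_per_entry_ms


# 10 ride-sharing events per second at 100 journal entries each
print(journal_entries_per_second(10, 100))

# 10 to 20 journal entries per event at 100 ms each
print(business_event_latency_ms(10, 100), business_event_latency_ms(20, 100))
```

With the numbers quoted in the conversation, 10 events per second fan out to 1,000 journal entries per second, and a single event lands at roughly one to two seconds of latency.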
Kostas Pardalis 28:41
Okay, we’ll get more into how this can be achieved later on, because I think there are probably many different things that need to happen in order to improve both of them. But before we go there, let’s talk about TigerBeetle and other database systems. So TigerBeetle is almost laser-focused on one use case, right? It’s almost taking, let’s say, something that someone would build with a schema over a relational database, and putting all the logic around that schema inside the database internals itself, right? So we have a purpose-built database. Why is this needed? Why do we need something so laser-focused on the use case itself? Why can’t we, let’s say, just keep scaling Postgres or, I don’t know, some other database system like ClickHouse, to do the work that we do with TigerBeetle?
Joran Greef 29:51
Yeah, thanks, Kostas. Great question. So, I think firstly, the domain is so valuable. It is a really valuable form of data. So it’s the kind of domain where you don’t want to use Postgres, you don’t want to use Redis, because you can’t afford a single-node system. You need durability, you need high availability, you need replication. But the replication really needs to be part of the database and shouldn’t be an add-on, you know. And the other thing is, I mean, you want open source. Obviously, you can get high availability in the cloud. But why can’t we have this as open source for our databases? Why can’t we have strict serializability? Why can’t we have automated leader election? These kinds of things should be in an open-source database, I think, and for the domain, you kind of need that. You know, listening to Martin Thompson just convinced me that these days, for a ledger, you need to just be always on, mission critical. And that’s kind of part of providing a great developer experience: that you can give people a single binary, they can spin up a cluster, and the thing just runs. And that was what we wanted to do. So that was why we didn’t do a Postgres extension. We didn’t do Redis. I mean, those would have been options, you know. But I guess what we realized is that here’s an interesting domain. So we had a real problem, looking at a real payments switch. I was working for
Coil. Coil was
a startup in San Francisco, with a lot of payments experience. We had seen these systems, you know, again and again, and we kind of just saw, well, everybody is reinventing a ledger database. And eventually it just got to the point where we said, well, let’s go do something about it. But the key thing is that everybody is taking SQL, and then 10,000 lines of code, and eventually they’ve got a ledger database, but they don’t know it. And we thought, well, let’s build it properly. And then there’s the performance side of things. You know, you can take another database that isn’t meant for the domain, and you can make it into a double-entry system. The problem is, you start with the raw materials, throughput and latency, but when you look at what you’ve actually ended up with, you don’t get that in your finished product. You get much less. You get about 1,000 transactions a second, you know, and your latency is maybe lumpy, and you’ve got, you know, data risk. So we thought, well, because we’ve actually got such a simple, tightly focused domain in double entry, let’s go deep on the technology and deliver a great experience in a single binary.
Kostas Pardalis 32:51
And why do you think that’s the case? Because, okay, double entry is not a new concept, obviously. As you said, when you were at college, you had to use pen and ink for everything, right? It’s been around for a while. And the truth is that in tech, finance and accounting have been a driving force for innovation, right? So why is 2022 the time that we can afford such a specialized piece of software, one that is not just a database, but a database for a very specific data model? And why didn’t that happen before?
Joran Greef 33:39
Yeah, so I love that Henry Ford quote, you know, you can have any car so long as it’s painted black. And I think for a long time we’ve been in the situation where you can use any database, and there can be as many databases in the world as you like, so long as they are Postgres or MySQL. And I think it’s because the world could only afford that many databases, because they’re so hard to build, or they used to be. You know, it takes years and years of work that many people have put into those systems to get them to where they are today. They’re incredible systems, and they took 30 years. But I kind of believe that we’re at a point, since 2018, where there have been five big things that have changed. And that means that it’s not just TigerBeetle, right? I think we’re going to see an explosion of new purpose-built databases, because databases are like car engines. If you have a great engine, you just go really fast. You know, if you want a great solution, if you want to simplify the application layer, just have a great database that is good for the domain. So for example, if you want to store lots of user photos, have a great blob store like S3. That’s the right tool for the job; don’t use Postgres for your blobs. If you want to create a queueing system, use Kafka, or Redpanda even better, because that’s the right tool for the job. And then when we looked at double entry, well, we were still in the old world, where we were asking, well, where’s the double-entry database? I mean, there are lots of ledgers, but there’s nothing that’s like a database that is really high performance. Because that’s what databases are, you know,
they give you
all the invariants of the domain, protected by the database. You get the invariants enforced, you know, like data consistency, isolation, transactions, all these great things. The database solves the hard problems for you, and it gives you performance. So we looked at ledgers, and that didn’t seem to be there yet. But I do think, you know, and we can go into these five reasons, that the world is ready for more kinds of databases, because they’re just a great way to solve lots of problems.
Kostas Pardalis 36:12
And so today, or before, what are people using for implementing a double-entry system, or a ledger?
Joran Greef 36:26
So from what we saw, and we looked around a lot, it’s typically a SQL database, and then they wrap 10,000 lines of ledger code around it to create a double-entry system. The reason is that it’s a very deceptive problem. And what we hear from people is that you don’t get it right the first time. So it sounds simple, but to do it well is really hard, because you have to solve latency and throughput, you have to solve strict serializability, you have to solve consensus. There are so many hard problems. And on top of that, you kind of end up with, again, low throughput. Typically it’s SQL, or people will reach for Redis, thinking that taking shortcuts will save them, or they’re using cloud databases and they’re then paying a fortune. And what it really means for new things, like digital fintech use cases, is interesting. For example, at Coil we were focused on the open Interledger Protocol, which is like Ethernet, but for the payments world: a means to connect all the payments networks, the old networks, the new networks, banks, mobile networks, so that everybody can send money like you can send an email or a data packet. Those kinds of applications really need a high-performance database. You can’t build these future applications on the old system of SQL plus 10,000 lines of code, because then you just hit that problem of row locks and contention again. You need a different design to unlock the use cases.
Kostas Pardalis 38:18
Yeah, it makes a lot of sense. And okay, I think everyone is more or less aware of what the primitives are for interacting with a SQL database: the relational model, the algebra, tables, the types, blah, blah, blah, all that stuff. So in a system like TigerBeetle, what are the primitives that the user is interacting with?
Joran Greef 38:40
Yeah, so I think, like Andy Pavlo or, you know, Justin Jaffray would say: be very careful if you go to your database and the query language is not SQL. Be very careful here. So we were very careful. And we figured, well, if you ever wanted a query language or database that wasn’t SQL, double entry is a good schema to have, because, you know, it’s tried and tested. It’s 500 years old or more; some say it’s more than 1,000 years old. It’ll probably be around after SQL; we’ll still have double entry. You know, that makes you sleep well at night, literally. So yeah, what are the primitives? You know, it’s very simple. You have accounts, or these T-accounts, with debit and credit balances, and you have transfers between accounts, or journal entries. So you have two data types: accounts, and transfers between accounts, where a transfer is like a vector: debit this account, credit this account, this is the amount, this is the time. So TigerBeetle isn’t SQL. It’s double-entry accounting: accounts and transfers. It just gives you these nice primitives out of the box, and it’s kind of what you want. Because with financial data, you don’t want to mix it up with your SQL data. You don’t want to put it in a general-purpose database, because often you have very different compliance concerns for financial data. So you want separation of concerns. It’s the same reason you want S3 object storage. It’s different data, with different performance characteristics. Retention periods are all different. So that’s the interface for TigerBeetle.
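To make the "two data types" idea concrete, here is a hedged Python sketch of an accounts-and-transfers interface. It is an illustration of the concept only and is not TigerBeetle's actual API or wire format; the class names, field names, and validation rules are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    id: int
    debits_posted: int = 0   # running total of the debit side
    credits_posted: int = 0  # running total of the credit side


@dataclass(frozen=True)
class Transfer:
    # The "vector": debit this account, credit that account, amount, time.
    id: int
    debit_account_id: int
    credit_account_id: int
    amount: int
    timestamp: int


class Ledger:
    """Hypothetical in-memory ledger exposing only accounts and transfers."""

    def __init__(self) -> None:
        self.accounts: dict[int, Account] = {}
        self.transfers: list[Transfer] = []  # append-only log

    def create_account(self, account_id: int) -> Account:
        if account_id in self.accounts:
            raise ValueError("account already exists")
        self.accounts[account_id] = Account(account_id)
        return self.accounts[account_id]

    def create_transfer(self, t: Transfer) -> None:
        # Validate everything first so a rejected transfer changes nothing.
        if t.amount <= 0:
            raise ValueError("amount must be positive")
        if t.debit_account_id == t.credit_account_id:
            raise ValueError("debit and credit accounts must differ")
        if t.debit_account_id not in self.accounts or t.credit_account_id not in self.accounts:
            raise ValueError("unknown account")
        self.accounts[t.debit_account_id].debits_posted += t.amount
        self.accounts[t.credit_account_id].credits_posted += t.amount
        self.transfers.append(t)


example = Ledger()
example.create_account(1)
example.create_account(2)
example.create_transfer(Transfer(id=1, debit_account_id=1, credit_account_id=2, amount=100, timestamp=1))
print(example.accounts[1], example.accounts[2])
```

Because every transfer touches exactly one debit side and one credit side for the same amount, the total of all `debits_posted` always equals the total of all `credits_posted`, which is the invariant the conversation keeps returning to.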
Kostas Pardalis 40:29
Yeah, it makes a lot of sense. Okay, so let’s talk a little bit more about the technology now. You mentioned some things that you’re very interested in personally, and those are the foundational parts, let’s say, of the TigerBeetle design. One of them is safety. And when I was going through the documentation of TigerBeetle, I saw that you’ve done a lot of work on ensuring that you take care of storage faults. And I always thought about storage faults like, you know, we have the operating system for that stuff, like the file system, so what are you talking about? I thought that whatever is committed to disk is something that I can trust. In distributed systems, we talk more about the network and the splits that might happen there; that’s the most common topic of discussion when it comes to fault tolerance and availability. So tell me more about storage faults: how important are they, and how common are they?
Joran Greef 41:38
So this is kind of coming out of just my love of storage systems before TigerBeetle. And what was interesting is that there’s a lot of discussion about how we can have new databases today. And the reason is kind of, well, we have to, because the existing databases are so tried and tested that, on the one hand, okay, they’re pretty reliable. But on the other hand, we know exactly where they’re broken, where they have latent correctness bugs. So if you’re building a new database today, there’s a lot of research on the issues, you know, where you can lose data. In Postgres, there was fsyncgate in 2018. It took, I think, at least two years for Postgres to fix that, to switch to direct I/O, you know, because the Linux kernel page cache is not trustworthy. If you ask the page cache for data, it can actually misrepresent what’s on disk, you know. And the reason is because disks are just faulty. You know, they’re the real world. And the storage fault research out there shows that disks do fail. And I’ve got ZFS to thank for opening my eyes to this. But yeah, just the way that they cared so much that a file system shouldn’t allow data to be corrupted: you know, cosmic rays can flip a single bit, and there are so many reasons that a disk can fail. I just recently read about Hacker News: they lost, I think, two SSDs simultaneously. Yeah. And it happens. I’ve been running a MySQL database that got corrupted because of a disk fault. I’ve replaced several disks in RAID arrays where the disks went into read-only mode because of sector failures. So you get all kinds of storage faults, you know. A disk can corrupt data on the read or write path, or it can just have a firmware bug that makes it read from the wrong sector. It’s very rare. But the thing is, with probability theory, the more of these clusters you operate, the higher the risk: if you operate 10 clusters, your risk has gone up by a factor of 10. Yep.
And I mean, even for a single disk, in a thirty-two-month period, I think it’s on the order of [inaudible] percent chance of a latent sector error. A latent sector error can wreak havoc, you know, with whatever is running on top of it. So with TigerBeetle we thought, well, there’s a lot of storage fault research. A lot of that was coming from the University of Wisconsin-Madison, from Remzi and Andrea Arpaci-Dusseau. They also wrote the OSTEP book, which a lot of developers love. But basically, there are lots of good reasons, you know, why databases need to start changing. On the distributed side, with Paxos or Raft, what was interesting too is that if you want to write a correct consensus protocol, you actually have to think of the disk. So there is this famous quote from Leslie Lamport where he kind of said, you know, that Viewstamped Replication was conjecture. He said, you know, if you don’t have a formal proof, it’s just conjecture, right? That was the word he used. But what’s interesting is that with Paxos, if you look at the formal proofs, the fault model is the network fault model. And yet Paxos assumes stable storage. And that raises the question: well, what is stable storage, you know? Where’s the disk fault model? Because for Paxos to be correct, it relies on stable storage; otherwise, it’s not correct. And that’s what I always loved about Viewstamped Replication, especially the 2012 revision: they just had it naturally, you know, the intuition that a consensus protocol should be able to levitate, like a hologram; it should be able to run only in memory if you wanted it to. And I love that because it shows, you know, that disks do fail in the real world. And if you want to do a real consensus implementation, you have to think about checksums. And it’s a much harder problem: you have to think about what you do if the disk is faulty.
If that happens, you know, all our Paxos and Raft formal proofs just break down once you introduce that storage fault model. So again, Wisconsin-Madison had this great paper called Protocol-Aware Recovery for Consensus-Based Storage. And they said, well, even the way we design distributed databases must change. Because up until now, you had the global consensus protocol, Paxos or Raft, and it assumed, you know, that you had stable storage, and you had the local storage engine, but the two were never integrated. So they couldn’t query each other, which is kind of like running a database on top of ZFS. But actually, if you want to do this correctly, for very high availability, the consensus protocol needs to be able to ask the storage engine: did you have a storage fault? Because if you can tell me, I can maybe help you out, because I can use distributed recovery. Otherwise, your cluster could get lost much sooner. So if you want to build very highly available databases, you need to start looking at new consensus protocols beyond Paxos or Raft, or if you use them, you at least want to integrate them with your local storage so that they work together. And that paper showed how to do it. So that was also 2018.
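The idea of a storage layer that can report a fault, so that the cluster can repair it from another replica, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the `Replica` class, the `read_with_repair` function, and the use of SHA-256 are invented for this example, and the sketch deliberately ignores the consensus-safety questions that the protocol-aware recovery work addresses.

```python
import hashlib
from typing import Optional


def checksum(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


class Replica:
    """Hypothetical replica that stores each block next to its checksum."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[int, tuple[bytes, bytes]] = {}  # address -> (checksum, data)

    def write(self, address: int, data: bytes) -> None:
        self.blocks[address] = (checksum(data), data)

    def read(self, address: int) -> Optional[bytes]:
        """Return the data, or None if the block is missing or fails its checksum."""
        entry = self.blocks.get(address)
        if entry is None:
            return None
        stored_checksum, data = entry
        if checksum(data) != stored_checksum:
            return None  # storage fault detected instead of returning bad data
        return data

    def corrupt(self, address: int) -> None:
        """Simulate a storage fault by flipping one bit of a non-empty block."""
        stored_checksum, data = self.blocks[address]
        damaged = bytes([data[0] ^ 0x01]) + data[1:]
        self.blocks[address] = (stored_checksum, damaged)


def read_with_repair(local: Replica, peers: list[Replica], address: int) -> bytes:
    """Read locally; on a detected fault, fetch a good copy from a peer and repair."""
    data = local.read(address)
    if data is not None:
        return data
    for peer in peers:
        data = peer.read(address)
        if data is not None:
            local.write(address, data)  # repair the local copy
            return data
    raise IOError(f"block {address} is unrecoverable on all replicas")
```

The design point is that `read` never silently returns damaged bytes: it reports the fault, which gives the layer above a chance to recover the block from a healthy replica rather than treating the local disk as stable storage.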
Kostas Pardalis 47:22
Yeah, yeah, I think we probably need a full episode just to talk about storage. Yeah. It’s a whole rabbit hole. Maybe we should do that. But we’re running out of time, and I have one last question before I give it back to Eric. You mentioned Viewstamped Replication, and Paxos and Raft are like the two protocols that most people mention for consensus when it comes to distributed systems. Why did it take so long for Viewstamped Replication to start being implemented, and for people to start talking more about it and using it? And why did you decide to go with that instead of Paxos or Raft?
Joran Greef 48:13
Yeah. So it comes back to when I was following the work that Dropbox were doing on Magic Pocket, their S3 solution, which was amazing. And James Cowling was one of the engineers on that. And basically, Heidi Howard was speaking about Viewstamped Replication having better latency when it has to elect a new leader, in comparison to Raft. And the reason is that Raft and Paxos have a random leader election algorithm. So, you know, it’s a selfish algorithm: if you think the leader is down, you propose yourself as a candidate. And Viewstamped Replication, the 2012 edition, is sublime, because it doesn’t do that. It’s the unselfish protocol, a let’s-work-together-as-a-team leader election protocol. So what happens there instead is that if you think the leader is down, then, imagining all the replicas in a ring, you just vote for the next replica. And this way, you can predict who the next leader will be, whereas in Paxos or Raft you don’t have that foreknowledge. So you can do things with Viewstamped Replication. And yeah, it’s important for people to also understand that Raft is Viewstamped Replication. It was influenced by Brian Oki’s thesis. Brian Oki sort of pioneered consensus a year ahead of Paxos, in ’88. And you’ll see in the Raft paper that they mention, you know, that it’s most similar to VSR, but it essentially is VSR. The only differences are, you know, the paper’s presentation, that it uses RPCs, and that it has a random election. But with Viewstamped Replication you get better latency, because the cluster is working together as a team. If the leader is down, they are almost always going to switch over to the next one in the ring. Whereas with Raft you have the problem of dueling leaders. Because everybody’s putting themselves forward, you could get this thing called dueling leaders, where you maybe have a split vote, and then you have to redo the whole leader election, which is not great in production, because now you’re adding, you know, tens to hundreds of milliseconds.
And Raft will mitigate that with, you know, randomized timeout padding, whereas with Viewstamped Replication you don’t need any of that. So basically, the long and the short of it is: Raft is Viewstamped Replication. VSR is actually a better name for it, because it is the original name. It’s true to computer science history. And the newer view change, or leader election, algorithm in Viewstamped Replication is more advanced than what is in Raft or Paxos, I think. That’s why we picked it. And the surprise is that, you know, James Cowling, who was doing all the storage work, is one of the authors of the 2012 VSR paper. So Heidi Howard spoke about it, Martin Thompson was speaking about it, and Marc Brooker was asking the question: why is nobody using Viewstamped Replication? You know? So we thought, well, okay, we’ll use it.
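The predictable, ring-style leader choice described here can be sketched in Python. In the 2012 Viewstamped Replication paper the primary is determined by the view number, round-robin over the replicas; the function names and the `fail_over` helper below are assumptions for illustration, and the sketch omits the quorum of messages that a real view change requires.

```python
def primary_for_view(view: int, replica_count: int) -> int:
    """Round-robin: the primary is simply the view number modulo the cluster size."""
    if replica_count <= 0:
        raise ValueError("replica_count must be positive")
    return view % replica_count


def fail_over(view: int, replica_count: int, down: set[int]) -> int:
    """Advance the view until its primary is a replica believed to be up.

    Illustrative only: a real view change needs agreement from a quorum of
    replicas, which this sketch leaves out.
    """
    if len(down) >= replica_count:
        raise ValueError("no replica is available to become primary")
    view += 1
    while primary_for_view(view, replica_count) in down:
        view += 1
    return view
```

With three replicas, view 0 has replica 0 as primary; if replica 0 is suspected down, every replica already knows that view 1 belongs to replica 1, so there are no competing candidates and no split vote to retry.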
Kostas Pardalis 51:35
Okay, there’s a lot more that we can talk about. But I want to respect the time here, and also give the microphone to Eric. We’ll record at least one more episode. There’s plenty of wisdom there that I think everyone would be interested to hear from you. So Eric, all yours.
Eric Dodds 51:59
Yeah, absolutely. Well, we’re close to the buzzer here. So just one more question. There are so many fascinating underlying principles in what you’re doing at TigerBeetle that, you know, are interesting outside of the world of double-entry accounting. If you weren’t working on TigerBeetle, what problem would you solve?
Joran Greef 52:25
I don’t know. I think, yeah, I just love that databases have, you know, so many cool things in them. They cover the whole range; it kind of took me a while to figure that out. You get, you know, speed, you get storage, you get security, which is just another way of looking at safety. All the cool computer science things, and you can think of them from a mechanical sympathy point of view. What I think is also special is, you know, that there have been quite big changes. I always wanted to do systems, you know, and just write a single binary that runs everywhere without the JVM. And now you’ve got these great systems languages: there’s Rust, there’s Zig. And they’re incredible. Now you can do that. They’re much safer than C. On memory safety issues, they have different approaches, but they move that into a lesser order of magnitude of concern, you know. So, great, safer languages. It’s a great time for databases. I always wanted to do I/O, you know, and everybody was struggling with how to do async I/O on Linux; it was such a pain. And now you’ve got io_uring, which is perfect for databases, you know. Thanks, Jens Axboe. So safer languages, better I/O. And finally, there’s a new way you can build these databases much quicker, because now we’ve got deterministic simulation testing, which FoundationDB, and James and CJ and them at Dropbox, pioneered, you know, where you can now run these databases in a simulation and tick time deterministically. It’s kind of like Jepsen, but on steroids, because Jepsen is not deterministic. So you can’t replay. If you find a bug, you know, maybe it takes you two years of waiting to find it, and then you can’t replay it. With these new deterministic simulation techniques, if your database is designed for this, then you can actually tick time in a while-true loop.
And you can simulate years of testing in just a day, you know. And then it gives you the confidence that, hey, let’s build new purpose-built databases, because we’ve got safer languages, we’ve got better I/O, we’ve got storage fault research; all of that is behind us. It’s there. And we’ve got simulation testing, which is kind of like Fred Brooks’s silver bullet. You know, you can actually really speed up your development velocity and build these things much, much quicker. It gives you confidence. So yes, I don’t know what else I would be doing if it wasn’t TigerBeetle. Maybe I’d be working on Redpanda; that’s another pretty cool database. But yeah, I’m pretty happy with the awesome team that we have. And we kind of just have that ZFS spirit that we love, you know: let’s just build something safe and make a contribution. Be efficient, not wasteful. It’s just a lucky time to be doing new things, I think.
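A small Python sketch can show what "ticking time deterministically" means in practice. Everything here is an assumption made for illustration (the `Simulator` class, the message model, the drop probability); it is not how FoundationDB, Dropbox, or TigerBeetle implement their simulators. The two properties it demonstrates are that time is a counter advanced in a loop rather than read from the wall clock, and that all randomness comes from one seeded generator, so a failing run can be replayed exactly from its seed.

```python
import random


class Simulator:
    """Hypothetical network simulator driven by virtual time and a seeded PRNG."""

    def __init__(self, seed: int, drop_probability: float = 0.2) -> None:
        self.rng = random.Random(seed)  # the only source of randomness
        self.ticks = 0                  # virtual time; the wall clock is never read
        self.drop_probability = drop_probability
        self.in_flight: list[tuple[int, int]] = []  # (deliver_at_tick, message_id)
        self.trace: list[str] = []

    def send(self, message_id: int) -> None:
        # Inject faults: some messages are dropped, the rest are delayed.
        if self.rng.random() < self.drop_probability:
            self.trace.append(f"{self.ticks}: drop {message_id}")
            return
        delay = self.rng.randint(1, 5)
        self.in_flight.append((self.ticks + delay, message_id))

    def tick(self) -> None:
        # Advance virtual time by one step and deliver everything that is due.
        self.ticks += 1
        due = sorted(m for m in self.in_flight if m[0] <= self.ticks)
        self.in_flight = [m for m in self.in_flight if m[0] > self.ticks]
        for _, message_id in due:
            self.trace.append(f"{self.ticks}: deliver {message_id}")


def run(seed: int, ticks: int = 1000) -> list[str]:
    """Run the simulation loop and return the full event trace."""
    sim = Simulator(seed)
    while sim.ticks < ticks:
        sim.send(message_id=sim.ticks)
        sim.tick()
    return sim.trace
```

Calling `run` twice with the same seed yields the same trace, which is what makes a bug found in simulation replayable; because no real time passes between ticks, long stretches of simulated time can be covered quickly.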
Eric Dodds 55:44
What a wonderful answer, and great to hear from someone who loves what they do. So Joran, thank you so much for joining us on the show. It’s been delightful. And we absolutely want to have you back on, because there are so many topics that we didn’t have time to get into, or go deeply enough into. So thank you.
Joran Greef 56:03
Oh, thanks so much, Eric and Kostas. It’s been a real joy, such a pleasure.
Eric Dodds 56:10
I have two major takeaways, Kostas. The first one is what an unbelievably concise explanation of double-entry accounting for developers. I mean, even the analogies are so good. And I just really appreciated that. And I think it speaks to their team’s ability to take, you know, concepts that can be very complex and have a ton of breadth, and distill them down into really, you know, easy-to-grasp concepts. But the other thing that I really liked was that he just really seems to love solving these problems, and figuring out how to make things fast, and how to make things safe. And that was just really interesting, I think. The other thing that stuck out to me was, you know, recently we talked with the founder of TileDB, who sort of went and worked at a low level on the storage layer of databases. And Joran said something that really stuck out to me, which was that he learned that he needed to work with the grain of the hardware, which I thought was really fascinating and reminded me of our conversation with Stavros from TileDB. You know, they’re not equal, but there are sort of similar learnings there. So those are my big takeaways. What a fascinating conversation; I feel like we could have gone for two or three hours.
Kostas Pardalis 57:42
Yeah. For me, one of the things that I will keep is that there seems to be a rise of purpose-built database systems out there, which I think is very interesting. And, okay, we see something like that happening in the financial sector, which kind of makes sense, because it’s a very lucrative sector, right? If you solve a problem really well, you’re going to be rewarded for that. And I’m not talking about, you know, making money for Wall Street, right? It was very interesting to hear from Joran how he sees financial transactions as core to actually helping people, you know, rise above the poverty level. Right? So it’s an industry where you can deliver value, and this value is very important, right? Hopefully, we’re going to see something similar in other industries, too. We’ll see more of, let’s say, these kinds of domain-specific database systems, which leads to another observation: building databases like this has become easier. And that’s great. More people can go and build very complicated systems, validate them, and get them to market much, much faster, which is amazing. I hope we will have him again in the future. It’s just the beginning for TigerBeetle. And I’m pretty sure, from a technical perspective, they’re going to do great, so we’ll talk to him again in the future.
Eric Dodds 59:44
I agree. Well, thanks for listening in. Subscribe if you haven’t, tell a friend, and we will catch you on the next one. We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.