This week on The Data Stack Show, Eric and Kostas chat with Matt Butcher, the CEO of Fermyon Technologies. During the episode, Matt discusses his career journey from philosophy to software development, and his work with various programming languages and cloud ecosystems. He talks about his experiences with Google’s infrastructure and container technology, and his various stops along the way before becoming CEO at Fermyon. The discussion also covers the use of WebAssembly for cloud computing, its advantages over virtual machines and containers, its potential for revolutionizing application development, and more.
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 00:05
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com. We are here with Matt Butcher from Fermyon. Matt, welcome to The Data Stack Show. We’re thrilled to have you as a guest.
Matt Butcher 00:32
Yeah, thanks for having me. I’m looking forward to this.
Eric Dodds 00:36
Oh, man, so much to talk about. So give us a quick background on sort of where you came from, and what you’re doing at Fermyon.
Matt Butcher 00:46
Yep. So if we were to rewind to my high school career, at that time I would have told you that when I grew up, I wanted to be a philosopher. So when I started college, that’s what I was setting out to do. But I had sort of gotten a job on the side doing some computer stuff. And philosophy degrees are expensive, especially when you’re going to do a bachelor’s and a master’s and then a PhD. And so I ended up kind of paying my way through by writing software and doing stuff like that. And at some point, I realized that software was a lot more fun than philosophy and kind of switched career tracks, of course, after I’d incurred lots of debt, and really went from there. I first got interested in content management systems and did a lot of work in the Drupal ecosystem. By that point, I’d learned Java and PHP and languages like that. Then I got to working at HP Cloud, originally to do their documentation. And as soon as I got a taste of cloud technologies and what was going to be possible, you know, I kind of had one of those moments where I saw a glimpse of the future. And I was like, I want to be part of that. And that really shifted my career. And I’ve gone on from there, you know, through Microsoft, through Google, and on into starting up Fermyon a couple of weeks ago, I mean, a couple of years ago. It’s felt like a fast couple of days, this last couple of years.
Eric Dodds 02:03
Very cool. Give us just a quick overview of what Fermyon is.
Matt Butcher 02:09
Yeah, we set out to build what we saw as the next wave of cloud computing. And we thought that the foundation of that was going to be a technology developed for the browser, but one that we thought was better applied in the cloud, on the server side. And that was WebAssembly. So we’ve been kind of doing the thing that we do best, and that’s, you know, building an open source tool and toolkit that developers can use to get started, and then building a hosted cloud platform and a server-side, Kubernetes-style application platform where people can run these things in their own cloud.
Kostas Pardalis 02:41
That’s amazing, Matt. I’d love to get more into this, because WebAssembly has been around for a while now. We’ve heard many things about it and many different use cases, and in some cases it has also been used as part of products and stuff like that. But we still get this feeling that we’re not there yet. With WebAssembly, there’s a lot of promise, and we’re still looking to see how it gets delivered, right? So that’s one of the things that I definitely want to go through during our conversation. And I’m sure you’re going to help us both understand what’s going on with the ecosystem today. But what about you? What are a couple of things that you’re looking forward to talking about during our recording?
Matt Butcher 03:36
Yeah, I mean, you just hit one of my favorite topics, which is WebAssembly. You know, it has shown promise in a lot of different areas. But until, I don’t know, maybe a couple of weeks ago, some of the most exciting pieces of WebAssembly were not yet accessible to the general developer. They were all very R&D and a little rough around the edges. And now, with the component model landing and being supported, suddenly we’ve got a whole bunch of new and interesting things that we can build with WebAssembly. And to me, the future that opens up out of WebAssembly and the component model is just so exciting. There are so many interesting things we’ll be able to do, from, you know, true polyglot programming to being able to overlay security models and things like that in ways we’ve never been able to do before. So I’m looking forward to talking about this. I think it’s gonna be a lot of fun.
Kostas Pardalis 04:23
Yeah, that makes sense. Let’s go and do it.
Eric Dodds 04:27
Now, welcome to The Data Stack Show.
Matt Butcher 04:29
Yeah, thanks for having me.
Eric Dodds 04:32
We love covering new subjects that we haven’t covered before. And I guess, I mean, gosh, are we over 160 episodes now? And I don’t think we’ve talked about some really key topics, like WebAssembly. So there’s a lot to cover. But we’re going to start at the very beginning, as we always do. So give us your background. How did you get into the world of data and engineering? And then give us an overview of what you’re doing today.
Matt Butcher 05:01
Sure. You know, when I was young, I wanted to be a philosopher. And part of the reason behind that was that I was very much interested in systems, like this very elaborate system that seems to be governed by scientific laws that we’re still just discovering. And, you know, it’s a world that’s simultaneously mysterious and yet predictable enough that we can survive pretty well for about 80-some years on average. And even when I was in high school, I was really enamored with that. So when I went off to college, I went off intending to study philosophy. But along the way, I happened to get a job doing some IT stuff, and then software development, and then early web development, and I used that as a way to pay my way through school. And as I got going, you know, I advanced my philosophy career up until I got a PhD. I wrote the dissertation, I taught for a little while, but all the while I was still doing software development on the side, you know, with Java and Perl and stuff like that, then moving on into Python and JavaScript as Node.js got popular, and on into newer languages. And at some point, I had one of those mornings where I woke up and went, one of these two careers is really lighting me up every morning, and the other one is moving really slowly. And I thought, it’s time to make a choice. And I kind of said, okay, I’m going to reserve philosophy for the weekend passion projects, and I’m going to go all in on software development. So I really got going in content management systems. At that point, Drupal was kind of the new hotness. I really liked it. I knew PHP, and so I could do a lot of development there. And I spent years just building websites with Drupal, working at various places. And at one point, I was offered a job at Hewlett Packard. HP was just getting into the cloud space.
And cloud was just at that point starting to get really popular, right? I’m going to tell it like it is, right? Here’s HP, you know, one of the tried and true, original Silicon Valley powerhouse technology companies, watching a bookstore take over the cloud world and going, wait, Amazon can’t win this battle. We’re HP. And so they started a group called HP Cloud. And I joined that group to do the CMS systems that were going to share documentation and all the marketing pages and all of that. And I was building that in Drupal.
Eric Dodds 07:32
Sorry to interrupt, just a little bit. Can you just give us a timeline here? When is HP realizing this? This is just to sort of orient us, because we live in the days of crime, or maybe post crime, when one day means three days, right?
Matt Butcher 07:49
This must have been, like, what, 2011? I want to say somewhere around there. It’d be 2010, 2011. Time, man, really starts to blur together. But yeah, that would have been the timeframe. Pre sort of cloud warehouse. Gotcha. Okay. Yeah. And OpenStack had just sort of come on the scene, right? So up until that point, it was sort of like Amazon had built their thing, which is entirely proprietary, and Microsoft had built their thing, which is entirely proprietary. And then out of, like, Rackspace and NASA, you know, an unlikely alliance, comes OpenStack, which promises first compute, and then, you know, object storage and other forms of storage come after that, and networking. And it was a fun, fun time to be involved in that ecosystem, because every morning you’d wake up and brand new features had dropped. There were so many developers working on it, it was all open source, it was happening very quickly, and we were maturing very rapidly. It was a really fun time to be in the cloud ecosystem. We were all just kind of starting to understand exactly how big this thing was going to be. You know, it’s all pre-containers, so Docker hadn’t come around yet. It was a very heady time, right? And at HP, I mean, there was this kind of vision that this was a new area, and we could just build something that would be unrivaled, right? And catch up and then pass everybody. And we had this firm vision for where we were going, which was very exciting. So when I was doing the website development for HP, I asked, you know, can I switch teams? Can I start working more on the compute side of things, the core OpenStack side of things? And I gradually sort of finagled my way over from documentation, and running this big Drupal site and writing a lot of PHP, to then writing some JavaScript to do Node.js bindings into this kind of thing. And then I worked my way over into the platform as a service and ran the platform as a service team.
And it was just kind of fun. You know those sprinklers where you hear the clicks as they switch? That’s how I felt like my career was going. I was just clicking through a bunch of different roles until I got to the one I wanted, which was leading the platform as a service team there. And it was so much fun. But along the way, it sort of set in that there were some internal hiccups. The VP that I worked for, whom I absolutely loved, had departed HP, and we kind of lost our vision. It was starting to look unclear where we were going and how we were going to get there. And I hit this point where I was just sort of depressed. And I guess I must have been moping around the house a lot, because my wife was like, maybe you should look at a different job.
Eric Dodds 10:35
Wise Man to listen to her.
Matt Butcher 10:39
Yeah, I did. And actually, she said, maybe you should look at a different job. I’ve been job hunting for you. Here’s my list of several.
Eric Dodds 10:49
What a woman.
Matt Butcher 10:50
I know, right? She was amazing. And not only that, but she picked the job that, as soon as I saw it, I’m like, oh, I want to do this. She had found a consumer IoT startup in Boulder. We lived in Chicago at the time, and she found one in Boulder that was looking for a head of cloud, somebody to really help them take this thing from an early POC into a product, which was exactly the kind of work that I thought would be a really rejuvenating experience after sort of feeling ground down and worn out. And it was in Boulder, which was closer to family and also closer to the mountains. And so I flew out here, interviewed, took the job, moved the family out here, and started working on this IoT backend, a very awesome cloud system. I met some amazing engineers, and we worked really hard on this kind of virtual machine driven platform that was a backend for IoT. A lot of fun. And we were having so much fun that we attracted the attention of Google, who acquired us. So I went and spent some time inside of Google, working inside of the Nest team there. And that was a really eye-opening experience, because Google’s infrastructure is just so much bigger than anything I’d seen before, even compared to what we had at HP. And they were using these kind of early containers, and I had been dabbling with containers on the side. And when I saw the way they were doing Borg, I thought, oh, this is just mind-boggling and awesome at the same time.
Eric Dodds 12:17
So that was sort of like, you know, Docker is now a thing. But then you sort of got to go into the heart of the beast and see how Google’s doing it, right?
Matt Butcher 12:33
Yeah, because I think Google had been using LXC containers, which was sort of one of the early analogues to Docker. Docker had just kind of come on the scene. They were building some interesting but not quite production-ready containers at that point. But Google was on the opposite side. They had this big, giant container ecosystem that the user wasn’t really exposed to directly. You couldn’t upload a container there. You would write App Engine software, and it would be deployed in containers. And it was there for the boss. Yeah, exactly. Yeah, exactly. You know, no peeks behind this curtain. But the awesome thing, and the thing that you did get to see if you looked behind the curtain, was that they had this orchestrator called Borg that knew how to take all of these containers and shuffle them around and put them on the most reasonable compute platform. So Borg will come into the story a little bit later, but that was my first peek at Borg. I stayed at Google for a while, and then I got this hankering to go back to startup life. And in particular, I wanted to do the container thing, because now that I understood it, I was really excited about it. And I wanted to do more of the PaaS, the platform as a service, Heroku-style thing again, you know, run the infrastructure behind something like that, like what I was doing at HP. And so I found another startup in Boulder called Deis. And Deis was building an open source Heroku competitor based on containers, and they were looking for somebody to do sort of the architectural work behind this. And I’m like, this is the perfect job for me. So I joined Deis. And about, I don’t know, maybe six or so months into working at Deis, Google did something that really surprised me. They dropped an open source equivalent, or version, of Borg called Kubernetes. And it was like 1.0, 1.1. It was held together by toothpicks and marshmallows. But I saw this and I’m like, oh, yes. This is it. It’s open source.
Now we can build all kinds of things on top of it. I was an architect there, so the CTO and I convinced the rest of Deis that we should replatform our PaaS on top of Kubernetes. And that, you know, was another little sprinkler kind of click in my career, because what I didn’t realize was that Kubernetes was on the cusp of really exploding, and we were starting to build key pieces of Kubernetes. So we built the package manager for Kubernetes. We were building a whole bunch of other projects for Kubernetes.
Eric Dodds 15:01
So you were building, like, Helm and other stuff, and you were doing that inside of Deis?
Matt Butcher 15:05
At Deis, yeah. We built Helm as part of what we thought was going to be, you know, the long-term Deis offering.
Eric Dodds 15:11
You realize now holy, okay, yeah.
Matt Butcher 15:13
I mean, okay, so. So, Helm came out of a hackathon project. We did this all-hands meeting. I’ll tell you this story really quickly, because it’s kind of funny. Sorry, let’s stop diverting here. Sorry. Right. Yeah, nothing is linear with me. So, you know, Gabe, the CTO, and I had basically said, okay, we think the right move is to switch over to Kubernetes and sort of replatform on Kubernetes. And Gabe said, you know, we’re doing this all-hands meeting, and I really want you to come up with some things we can do to get people going on Kubernetes. And so we decided we’d do a hackathon, and decided we’d do a session on, you know, what Kubernetes is, and we lined this all up. And so at the hackathon, the idea was that we kind of challenged people: hey, build something fun and cool that’s sort of in this new cloud ecosystem. And the winning team would get a $75 Amazon gift card, and the average team was three people in size. On mine, Jack, Rimas, and I were the three who worked on this. And so we sat down and did some brainstorming. I was telling them, you know, we’re trying to figure out how to install our new Deis PaaS on top of Kubernetes. And we ended up talking about npm and package management, and we decided we’d build a package manager for Kubernetes. It was called k8s place, K-8-S place. And it was coffee shop themed. So, you know, we had this whole idea that k8s place is this nice coffee shop where you go and you get little shots of Kubernetes installs and stuff. And we just skipped the team dinner. We worked all night, we worked the next day, and built this little demo of k8s place, the package manager for Kubernetes. And we demoed it the next day, and we won the $75 gift card for Amazon. I blew my 25 on coffee. So the offsite ended, and we all went back to our homes. And the next day, I got a call from the CEO and CTO of Deis. And they’re going, so, you know that package manager thing?
Eric Dodds 17:14
Medicine, like Yeah, that’s a really
Matt Butcher 17:15
good idea. A package manager for Kubernetes, that’s an idea that’s got some momentum behind it. We should do that. I think you should start building that as your full-time job, and we’ll give you a team. You know, you can pick a couple of people to be on the team and get started building that. I mean, this is what we all dream of when we do these hackathon projects. It’s like, hey, if I could invent my own day job, I would do this. And here I was, basically getting, you know, carte blanche to do my little idea. And it was fantastic. But they said, there’s just one thing. And I said, yeah, what’s that? They said, we really hate the coffee shop theme. So, of all the things to be devoted to, the name was not one of them. I’d rather build the software I want to build. Yeah. So Jack and I, Jack Francis, one of the other people on the hackathon team, sat down with a nautical dictionary and started reading it out loud to each other, trying to come up with a name. And that’s where we came up with Helm. That’s where we came up with calling the packages charts. It was all just sitting there reading this little
Eric Dodds 18:12
dictionary. Story. Yeah. That’s
Matt Butcher 18:16
right. So the next time you get an opportunity to do a hackathon, do it.
Eric Dodds 18:21
Totally. Okay. Well, I mean, that was an amazing detour. But take us from that point. How did you get to Fermyon? And tell us what Fermyon does. For sure. Yeah. So you
Matt Butcher 18:37
know, Helm and the other things we were doing in Kubernetes land attracted the attention of Microsoft. You know, Brendan Burns, who created Kubernetes, left Google and went to Microsoft and started building a team. And part of that effort was them acquiring Deis and rolling us into the Azure part of Microsoft. I had a fantastic job there. My job was that I got an open source team, and my mandate was, you know, find what’s missing in the container and virtual machine ecosystem, and build it, and open source it, and contribute it up to the CNCF, the Cloud Native Computing Foundation, the governing group for Kubernetes and the like. And we had a lot of fun. But one of the coolest things about a job like that is that you’re always out there asking questions of people, you know, customers, other teams inside of Microsoft, and so on. What are your big problems? What can you not do? Where are you struggling? Why? What are the roadblocks that are preventing you from migrating workloads to Kubernetes? Questions like that. And then you get these challenges back, and you just try to build solutions to them. For some of them it’s fairly straightforward, and you build solutions like OAM or Brigade that just kind of answer people’s questions. But some of the problems were really vexing, and we could not figure out good solutions. One of them was that we really wanted to be able to scale workloads to zero. So, you know, when you’re dealing with huge amounts of compute, during peak time you might be consuming, like, nine different virtual machines. And during low times, you might be consuming none, right? You might have no traffic in the middle of the night. So you should really be able to scale your workload from zero up to being able to handle tens of thousands of requests, and as close to instantly as possible.
But scaling is bound to the problem that when requests come in, you either have to be able to start up really fast, or you somehow have to anticipate ahead of time when the requests are going to come in and scale up before the traffic starts to go up. If we were all good at predicting the future, you know, the stock market would be no fun, and neither would gambling. So we took the approach that we needed to come up with a faster way to do startup. Another problem that we ran into around the same time was that a lot of developers were telling us that building Docker containers was cool, except they had to know ahead of time what the operating system and architecture of the target environment was going to be. And then oftentimes they had to do really ugly cross-compilation steps if it was different from theirs. So if I’m writing code on Windows, running on an Intel architecture, and I’m deploying to Linux on an ARM architecture, my deployment life is going to be kind of hard. And so we were looking for what Java promised at the beginning, a compile once, run anywhere style of thing for cloud workloads. So those are a couple of examples of things we were working on that we just couldn’t figure out. And so at one point, we started saying, well, we can’t do this with virtual machines. And we also can’t do this with containers. And we’ve been trying this for months, if not years. Maybe we should open ourselves up to the possibility that there’s a third kind of cloud compute that nobody has started using yet. What would the characteristics be? Well, it would have to have a cold start time that was like 10 milliseconds or under, so that we could rapidly scale up when load came in, and we could rapidly scale down without worry when load left.
We’d want it to be cross-platform and cross-architecture. And of course, it has to have a really good security sandbox model, because that’s essentially what a cloud runtime has to guarantee for you: that you, as the operator, can run untrusted code from anybody else who’s willing to, you know, pay the subscription fees, and you can do it without risk to yourself or risk that they can attack other tenants in your environment. And so we approached the problem this way, and then began looking at potential technologies that could solve it. And that’s kind of what led us up to, first of all, discovering WebAssembly, which was originally a browser technology. And then, second of all, going, wait, we’ve got an idea here, and we have pretty much a team of amazing experts in this field. Maybe we should do the startup thing. And so, you know, a couple of years ago, we started Fermyon Technologies with the idea that we could build this next wave of cloud computing using WebAssembly as the platform.
Eric Dodds 23:03
Oh, love it. Okay, let’s start with a couple of definitions. I know Kostas is chomping at the bit with a bunch of questions, but let’s just do a couple of definitions before I hand the mic off. WebAssembly. I actually don’t think we’ve talked about this on the show before. So this is like a first sort of definition, which is exciting. So
Matt Butcher 23:29
Yeah, but it’s exciting.
Eric Dodds 23:33
To me, yeah. Yeah, no pressure, this is just a conversation. What is WebAssembly? And why is it important? Yeah, I will give
Matt Butcher 23:43
the most boring definition of it, and then out of that kind of unpack why it’s actually pretty exciting. The most boring definition is that WebAssembly is a binary format that you can compile different languages to. So, you know, if you’re compiling natively on Linux, you’re compiling to the ELF format, right? And you’ve got separate compilation targets for, well, probably every operating system out there, or at least the big three or four. I suppose others probably borrow. So we’re going, okay, that compilation process is part of what introduces the cross-platform, cross-architecture problem that we had seen. But if you could find a binary format that could run on any architecture and any operating system, and it had the right security sandbox, then, you know, those were two of the really big checkboxes on the list. WebAssembly happens to also have a couple of other virtues. I should back up and say what WebAssembly was originally designed for, because once we understand that, we start to see why this story is so interesting. WebAssembly was originally designed to run in a web browser. If you go back to 2015, when Luke Wagner and a group of people at Mozilla started it, the stated goal was to build a platform-neutral binary format that can run inside the web browser, and that different languages can compile to, so that in the browser we can run other languages side by side with JavaScript. So you can imagine some of these use cases, right? I’ve got this crufty C library that’s been around since before I was born. I don’t want to have to rewrite this in JavaScript. But I also know that it does something important. Wouldn’t it be cool if I could compile it to something that I could run in the browser and make function calls from JavaScript into this C library? Yep. Those are the kinds of cases that were in the original scope of WebAssembly. Yeah. Figma.
In fact, if you’ve ever used Figma, and some of the other ones, Adobe, I think, also, they use WebAssembly in browsers to be able to write code in C++, compile it to WebAssembly, and then use JavaScript to kind of call it. And that’s how they get such great performance on all their vector drawing, because some of that’s going through C++, not through JavaScript. Fascinating. So that was sort
Eric Dodds 26:04
of like a transformative experience, if you transition between the web app and the desktop app, which are, you know, obviously under the hood, like, yeah, absolutely the same thing. It is crazy. Anyone who’s used design software knows this. I’m not a designer, but, you know, Brooks knows that I will get into some design files, much to my design team’s. But the thing that I noticed the most, and that is absolutely unbelievable, is that it is a seamless experience, and it is so fast, even when it’s, yeah, dealing with some pretty large files.
Matt Butcher 26:46
Yeah, and some pretty complex on-the-fly calculations too, because you can drag and resize things very quickly and not have any kind of lag like we used to see in sort of the olden days of the web. Sure. But when you think about how something like those Figma libraries would have to run in a browser, particularly if you’re thinking sort of generically about this and not about one particular application, there are about four features that you would really want. The first one is a sandbox. You want a very strict security sandbox, because again, you know, the browser is running binary code that it has not inspected inside of an environment. So not only do you have to be able to protect the system from getting rooted by, you know, gnarly binaries that you downloaded, but you also have to protect the JavaScript sandbox, because that’s an attack vector. So the sandbox that you have to design for WebAssembly ends up having to be very good and very reliable. And that, of course, is one of the checkboxes for the cloud. We want that same level of reliability. Another one is that we are notoriously impatient when it comes to waiting for web pages to load on the internet, right? We want them to be snappy. Some of the research suggested 100 milliseconds. One piece of research I read said that within 10 milliseconds, people’s attention actually starts to dwindle, which is remarkable, because that’s way before we’re aware of our attention starting to drift. But that’s how impatient we as human beings are. So the WebAssembly sandbox had to be very fast.
Eric Dodds 28:16
Maybe that’s more a reflection on society than technology. But we’ll save that, because I want you and Kostas to discuss some of this as a philosopher, since that’s your training. There you go. There you go. We’ll save that for later.
Matt Butcher 28:35
yeah. So we got two more on WebAssembly. It has to be cross platform and cross architecture too, because you know, we want to be able to run, you can’t have it where figma works on one operating system. And, you know, I opened my Macbook and one and it’s like, sorry, this processor is not supported. That’d be a horrible experience. So the binary format also had to be cross platform. And then the last one was really the most audacious of all of them. And that was that the format was designed so that any language could in theory be compiled to it. And that’s pretty wild. Because essentially, what the precondition for success of web assembly was is that they would be able to rally enough language communities that we would actually get WebAssembly support in languages from C and C++ to, you know, rust and zig and go to Java and dotnet, and Python and Ruby, and you know, and all of that. Right. And it’s remarkable, we bought into that we bought into that early in fermion. But we were also identified as our first major risk. If that didn’t really take off, then we would be in trouble. And I think Costas, were talking a little when we went, you know, we when we chatted beforehand about how WebAssembly has, you know, sort of seemed to have fits and starts as it’s gotten going. And one of those has been, you know, early buzz was not fulfilled when there weren’t enough languages when you could really only write in C and rust. It wasn’t terribly compelling in in the last year in a calendar year 20 In 23, we have gone through language after language adding support dotnet as piloted support for all of the dotnet languages, Python and Ruby have added support, Dart and Kotlin are coming along, you know, and languages like rust and C++ and zig and all of those continue to mature Swift is moving along, it’s like, whoa, the most ambitious part of all of WebAssembly is actually happening this year. And that’s been really exciting. 
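To make the "platform-neutral binary format" idea a little more concrete, here is a minimal Python sketch that reads the top-level layout of a WebAssembly module. It relies on two facts from the WebAssembly binary format: every module starts with the four magic bytes `\0asm` followed by a little-endian 32-bit version, and the rest is a sequence of sections, each with a one-byte id and a LEB128-encoded size. The `ADD_MODULE` bytes are a hand-assembled illustration (a single exported `add` function), and the helper names are made up for this example. Nothing here comes from Fermyon's tooling.

```python
import struct

WASM_MAGIC = b"\x00asm"

# Hand-assembled illustrative module: exports add(i32, i32) -> i32.
ADD_MODULE = bytes([
    0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,                    # magic + version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,              # type section
    0x03, 0x02, 0x01, 0x00,                                            # function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,              # export section ("add")
    0x0A, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,  # code section
])

SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "data count",
}


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(data):
    """Return (version, [(section name, payload size), ...]) for a module."""
    if data[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly module")
    (version,) = struct.unpack_from("<I", data, 4)
    sections = []
    pos = 8
    while pos < len(data):
        section_id = data[pos]
        size, pos = read_uleb128(data, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, "unknown"), size))
        pos += size  # skip the payload; only the layout matters here
    return version, sections


version, sections = list_sections(ADD_MODULE)
print(version, sections)
```

The same bytes describe the same module whether they are loaded on an Intel laptop or an ARM server. The runtime, not the file, does the translation to native code, which is the property the conversation keeps coming back to.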
So you can kind of see there were four little attributes there. They were designed for the browser, but all four of those ended up being really important in satisfying the conditions we were looking for in a cloud runtime. And in particular, and we did kind of skip over this, the workloads that I was most interested in, when we were looking for this third wave of cloud compute, were what we would call serverless, or FaaS, Functions as a Service: the kind where we want to do a discrete step, start it up, bring it to completion, and shut it down as fast as possible. So the simplest way we can think about this is, hey, a user makes an HTTP request, we answer the request, send back a response, and shut back down right away, and then we’re not running any long-running processes. So that whole scale-to-zero thing just sort of automatically falls out, right? When load is coming in, we might have 10,000 WebAssembly functions firing off, answering all these requests. When, you know, 2am rolls around and everybody’s asleep, we can scale down, you know, there’s nothing running. And essentially, we’re not paying a compute bill. So that was kind of one of the workloads we had really targeted as being perfect for this third kind of cloud computing, which WebAssembly then turned out to be a pretty good example of.
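The request-in, response-out, shut-down pattern described here can be sketched in a few lines. This is only an illustration in plain Python, not Spin or WebAssembly code: `Request`, `Response`, `handle`, and `ScaleToZeroHost` are hypothetical names invented for the sketch, and the "host" simply counts instances to show that nothing stays alive between requests.

```python
from dataclasses import dataclass


@dataclass
class Request:
    path: str


@dataclass
class Response:
    status: int
    body: str


def handle(request: Request) -> Response:
    """Stateless handler: it keeps nothing between calls."""
    return Response(status=200, body=f"hello from {request.path}")


class ScaleToZeroHost:
    """Hypothetical host that starts one instance per request and keeps none alive."""

    def __init__(self) -> None:
        self.running = 0  # instances alive right now
        self.served = 0   # requests completed so far

    def dispatch(self, request: Request) -> Response:
        self.running += 1           # start an instance when the request arrives
        try:
            return handle(request)  # run the discrete step to completion
        finally:
            self.running -= 1       # shut it down as soon as the response exists
            self.served += 1


if __name__ == "__main__":
    host = ScaleToZeroHost()
    for path in ("/cart", "/checkout"):
        print(host.dispatch(Request(path)).body)
    print("instances still running:", host.running)
```

Because the handler holds no state, the host is free to run zero instances when no traffic arrives, which is the scale-to-zero property mentioned above.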
Kostas Pardalis 31:35
Yeah, that’s great. Quick question here. Because, okay, I think one of the things that makes WebAssembly a little bit confusing to people out there who haven’t been active in WebAssembly itself is that there are so many different use cases, right? For someone like that, who listens to this, all of a sudden we have the security, we have the serverless model, we have the polyglot part, and we have the web, right? But let me ask you the following question. So one of the ways that you position it, as part of compute in general, is next to containers, right, and as fixing some of the problems that containers traditionally had. And we have a couple of different primitives here. We have containers. Before that we had virtual machines. Now we have WebAssembly. And we also have micro VMs, right? So with things like Firecracker, for example, which gives you the opportunity to solve some of the problems like the cold start problem, right, fast systems and all that stuff. What is the difference between them? And how do they fit, let’s say, in the infrastructure wars? Do they, let’s say, compete or complement each other?
Matt Butcher 33:12
I think that right there at the end is a fascinating part of this whole thing, right? We’re building this big cloud world. And every time we introduce a new technology, it kind of competes, and it kind of complements. And I think if we look at virtual machines, right, so what does a virtual machine do? Well, what is it for? A virtual machine runs an entire operating system, from the kernel and the drivers all the way up through the libraries and the utilities and on into your userland code. And you package all that stuff up, and you ship it off to somebody else’s hardware, and you execute it there in the cloud. And so you’re really thinking, like, soup to nuts, the entire operating system. Now, that’s great for a number of workloads. Some of them are, say, when you’re running large-scale databases, things where being able to tune the kernel parameters or the driver parameters is really important. You can use these things and be highly effective. But I, as a developer, and I think many developers out there, regardless of what domain you’re working in, are gonna say, yeah, but they’re no fun to build. They’re actually really hard for a developer to build, because it requires a tremendous amount of operational knowledge to assemble them. And then they’re very hard to maintain. So really, as a primitive, they’ve worked very well for platform engineering and DevOps and teams like that, who are focused on the operation of a system, but they weren’t as popular with developers, and that’s where containers came in. So a container does not have a kernel or low-level drivers, right? A container is just sort of like a little pie-slice version of an operating system.
It has just the part of the file system your application needs, just the supporting files it needs, just the system libraries it needs, and your binary. And it’s great for long-running processes that perhaps don’t need that low-level access to the kernel and don’t really optimize at a low level. So you can think, you know, web servers, microservices, those kinds of things work great in containers. And developers, we like them because they are a lot easier to build. You write a Dockerfile that just plugs your binary inside of one of these images, and it packages it up, and then you can ship, you know, instead of a six-gig or 20-gig virtual machine image, usually you’re talking about maybe 100 megs of slices of operating systems that you’re pushing and moving around. And those are really good for long-running server processes. It was the next class of computing, the serverless one that I was talking about, where really you don’t want anything long-running. You want a process that gets started up when a request comes in, that handles the request, returns a response, and then shuts back down. The typical container takes about a dozen seconds to start; the typical virtual machine takes a couple of minutes to start. So you can’t really effectively start up, handle a request, and shut down when that’s the characteristic of your underlying runtime. So the way this was solved in sort of the serverless v1 world, right, with early Lambda, and with Lambda today, Azure Functions, Google Cloud Functions, things like that, is you essentially pre-warm virtual machines and keep a huge queue of virtual machines around. And then as requests come in, you drop a workload on a pre-warmed virtual machine, execute it, and tear the whole thing down. So it’s inefficient, and it’s actually fairly expensive to operate.
And, you know, seeing how this works behind the hood, and behind the hood in Azure, was one of the reasons why we identified this as an interesting problem to solve. Because anytime we can reduce the amount of energy consumed, and drive down prices, and free up computing resources to do other things, you know, from the perspective of someone like Azure or Google or AWS, this translates directly to not just cost savings, but actually being able to do more with the compute power they have available. So essentially, you can sell faster if you can do this kind of thing. For us as consumers, right, it’s really about the fact that we’re only paying for traffic when the workload is actually happening, right? When there’s traffic coming in, then we’re watching our function start up, run to completion, and shut down. When there’s not traffic coming in, we’re not paying anything. And so it’s compelling, really, on both sides of that story. Micro VMs are another attempt to solve a similar problem here, playing on this idea that maybe you can strip down a virtual machine to the point where it starts up in just, say, you know, several hundred milliseconds. A lot of that is very promising. And for some kinds of workloads, I’m pretty excited about that. And we use it a little bit here and there. But if you compare, a typical AWS Lambda function takes about 200 to 500 milliseconds to cold start, and that’s the amount of time it takes from when the request comes in to when your code starts to execute. It’s all warming, right? That’s fast compared to several seconds for a container. But it’s slow if you’re talking about a user request, right? Google starts to ding you on your page rank if you exceed 100 milliseconds before delivering your first byte. If it takes 200 to 500 milliseconds just to cold start, before you’re even doing your processing, you can’t build the kind of high-performing system that you want for user-facing web applications.
So when we looked at WebAssembly, one of the key things there was, can we get it to start up really fast? And, you know, originally we were at 10 milliseconds. Then when we released Spin 1.0, we were at one millisecond. When we released Spin 2.0 last week, we were down at half a millisecond or less to cold start. That is, the time it takes from when the request comes in to your code being executed was under half a millisecond. And that gives you, the developer, about 100 milliseconds to try and get those first bytes back to Google and score high on page ranking, very high for responsiveness. If you’re doing anything like streaming or things like that, where it really matters, this is a big deal. This is a very big deal.
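The latency argument above is simple arithmetic, and a small sketch makes it concrete. The figures below are the approximate ones quoted in the conversation (100 ms to first byte, 200 to 500 ms for a Lambda cold start, about a dozen seconds for a container, half a millisecond for Spin 2.0); they are not measurements, and `remaining_budget_ms` is a hypothetical helper named for this illustration.

```python
FIRST_BYTE_BUDGET_MS = 100.0  # threshold quoted in the conversation

# Cold-start figures as quoted by the speaker (approximate, not measured here).
COLD_START_MS = {
    "container (about a dozen seconds)": 12_000.0,
    "AWS Lambda, low end": 200.0,
    "AWS Lambda, high end": 500.0,
    "Spin 2.0": 0.5,
}


def remaining_budget_ms(cold_start_ms: float,
                        budget_ms: float = FIRST_BYTE_BUDGET_MS) -> float:
    """Time left for application code after the cold start.

    A negative result means the budget is spent before user code runs.
    """
    return budget_ms - cold_start_ms


if __name__ == "__main__":
    for runtime, cold_start in COLD_START_MS.items():
        left = remaining_budget_ms(cold_start)
        verdict = "fits" if left > 0 else "over budget"
        print(f"{runtime:34s} {left:>10.1f} ms left ({verdict})")
```

With these quoted numbers, only the sub-millisecond cold start leaves nearly the whole 100 ms for the developer's own code.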
Kostas Pardalis 39:08
That’s amazing. Okay, I want you now to put on your philosopher hat and actually give an answer as a philosopher to engineers, right? And the question is, how much abstraction over the hardware is too much abstraction? Because we’ve talked about virtualization at so many different levels, right? And I wonder at what point, and maybe there’s no point, right, maybe abstraction to eternity, ad infinitum, is something we should be doing, right? But I want the answer not from the angle of the engineer here, because as engineers we thrive on abstraction, right? That’s how we operate. We want to abstract so we can build one thing and apply it to many things, so we don’t have to do it again. But from a philosopher’s point of view, right, how would you tell your engineers it’s time to stop abstracting things?
Matt Butcher 40:17
Yeah. So abstraction comes with a cognitive cost. And that’s the most important thing for us all to remember, right? And so if you look at philosophy, the discipline that most deals with trying to understand the structure of the world is called metaphysics, right? And if you rewind history all the way back to the very earliest philosophers, you know, Plato and Aristotle, both of them worked very much in this field of metaphysics. What kinds of things is the world composed of? In fact, Aristotle coined the term metaphysics, because he said it meant what must come before physics: what do we need to understand about the world before we can understand how the pieces of the world are interacting? And he said, you know, what we need to understand is what the actual structure of the world is, what kind of stuff is the world composed of, and how complex are the sets of rules. And what is computer science, if not applied metaphysics, right? Here we have this ability to build systems that are based on the way we think about the architecture of things. What is a shopping cart? And what is an online store? What are the components I need in this? And then we start building the rule systems around them and how they work together. So in a way, your question is perfect, because the history of philosophy can inform exactly what we’re trying to do in computer science. And what you see from Plato onward is metaphysics going through these cycles of getting increasingly complicated, and then getting to the point where it’s out of touch with reality. That is, it’s so impenetrable that it’s hard to even test whether you’re describing reality anymore or not.
And then after that, you start to see it retract again, and you get movements like empiricism, or stoicism, or even skepticism, the idea that all metaphysics is doomed, we might as well just live life as it is and doubt that we actually know anything. All of these movements are kind of reactions against the fact that metaphysics can lead to systems that are so complicated, and so hard to even test whether they are actually describing the world, that they become essentially either useless or vacuous, right? Either there’s nothing we can do with them that’s productive, or they’re so difficult to explain that by the time we’re in that sort of enlightened cogitation about them, we’re not really talking about anything people care about. I think that particular play in philosophy that we’ve seen now over thousands and thousands of years should inform the way that we build systems and software. Because, to your point, what is an abstraction for? It is, well, a programming language, right? The nuts and bolts of what we are doing as software developers is attempting to build a language or languages that help me describe to you what I’m thinking, and help both of us describe to a computer, you know, a pure deductive logical system, how to execute things in a step-by-step way. So we’ve got kind of dueling objectives here. On one side, it’s how do we make sure that we are explaining it at the level of terseness that the computer needs to be able to execute it. And that’s what compilers are for. But it’s also, you know, part of the reason why some of our languages have peculiar concepts, like the borrow checker in Rust, or type systems in languages like Java. But the other thing is, you and I have to be able to communicate effectively on our code, right? If you and I are working together on a code base, if I write code that you don’t understand, I’m making a mistake.
And likewise, if the two of us get together and start building these grand edifices that use all kinds of specialized terminology, we build lots and lots of layers of abstraction, and then Eric comes in and looks at this and is like, I don’t even know where to start, right, this is so complicated, I have no idea, then we’ve failed as software engineers, right? So that’s the framing for the answer. The next question you really have to ask is, well, the thing we can do is we can solve this problem by introducing abstractions and specializations. And that’s what I think has happened, right? We have data stacks that are designed for data processing. We have web stacks that are designed for web developers. You know, we have IoT ones for IoT developers. And we’ve managed to do a reasonable job of carving up our day jobs, such that we can have some divergences in their terminology. We can use terms like Node, and every one of us thinks a different thing when we hear that, because we’ve applied it in different ways. And we can have some success there. And we can actually look at science and see that science has been relatively successful, where it started out as a unified discipline and has since broken out into sub-disciplines like physics and biology and stuff, and then broken out into further sub-disciplines like astrophysics and things like that. And there’s been some success in doing things that way. But each time we do that, we introduce a new level of complexity, which we have to acknowledge when we do it. When I introduce this new level of complexity, I’m essentially saying, either there’s gonna be a new specialization that comes out of this, or I’m gonna end up, you know, making this too complicated for a person. So I don’t know if there’s a strict answer to your question, but at least there’s kind of a framework for thinking about it.
Kostas Pardalis 45:35
No, and I think it’s a great framework to ask my last question, which has to do with going back to the WebAssembly context again. So when I tried it, and it’s been a while, I have to admit, to play around, okay, I experienced a lot of, and I love the words that you used, the cognitive cost that I had to go and figure out what I can do with these things, right? The pitch was great. All these things are like, oh, now maybe I can take Python code, for example, and run it as part of my Rust code, or vice versa. That’s great. I’d love to be able to do that, especially for me, as a person who comes from data infrastructure. And I’ve seen how big of a moat for old systems has been the fact that a lot of code has been written in legacy systems, right, and we cannot just move it easily to a new one. Right? So just moving UDFs from Hive to Spark would be amazing, right? I don’t think people realize how many millions of dollars would be saved by doing that. Right? But when I started playing around, I got lost. And then I gave up, right? And the reason I’m telling my personal story is because I think I’m not the only one out there. And I would like to ask you why this happened with WebAssembly. Why did we have this process where, okay, the promise was really big, people were really excited about getting into that, but it feels almost like we’re still waiting to see the outcomes of that, to see them applied, right? What was missing? And if it’s not missing anymore, what happened?
Matt Butcher 47:40
And I think the answer to what was missing is typical of many systems. And maybe WebAssembly got a little more hype than we thought it would, a little bit faster than we thought it would. But the early tooling for WebAssembly was actually fairly difficult to use. You might follow a set of instructions, and it was like, download this library, put it in this place, download this tool, we’ll tell you later what this tool does, trust us, download and install it, you’ll need it. You know, that kind of thing, where you’re like, okay, step 15, install the WebAssembly compiler, now I can start writing my first piece of code. You know, that was an experience that was non-ideal. And that was the way things were when we started Fermyon. And one of the first things that Fermyon did, we said, okay, our first user story has got to be: as a developer, I can go from blinking cursor to deployed application in two minutes or less. And it was exactly because of the problem you described that we came up with that user story first. The first thing we have to prove is that this is easy, and that the developer doesn’t actually have to understand the WebAssembly bytecode format, or what a runtime does, or which tools are used to assemble a thing this way. They just need to be able to write code kind of in their usual way. And WebAssembly is still getting there; some of the standards are still in flight. So for example, networking is not fully baked yet. So there are some things we know are still going to be a little hiccup for users as they get going. But for the most part, by taking that perspective, you know, we spent most of 2022 and part of 2023 going, we just need to get a developer experience where you can do, you know, just a couple of commands. So for us, it’s Spin. Spin is our open source tool for developing WebAssembly serverless applications.
And so you can do spin new, tell it what language and give it a name, and it’ll scaffold out a project for you. So spin new rust, you know, foo, and then spin build will compile it for you, so you don’t have to know all the compiling commands for each and every language. And then spin up will allow you to test it locally, and spin deploy will allow you to push it somewhere else and run it. And we thought, if we can build an experience that’s that simple, then developers can trust us that we’re not going to just overly burden them with a whole bunch of new things they have to learn. So I think we’ve made good progress on that in 2023. The component model is one of the things we are most excited about. You alluded to it there, and it’s new in Spin 2.0, which came out only, well, the first week of November is when Spin 2.0 came out. And the component model is the first step against the trend you described. So we waste huge amounts of time in this discipline reimplementing the same thing in lots of different places and lots of different languages. And that component model allows WebAssembly binaries to talk to each other. Or more specifically, it allows a binary to say, these are the functions I export, and these are the functions I need to import. And then you can start negotiating how you put these together, right? So essentially, WebAssembly binaries can work like libraries. So I can say, hey, I need to import this thing that provides the YAML parser. I’m going to use it, and I don’t care what language it was written in. So suddenly, we start saying, all right, it doesn’t matter if my libraries are in Python or Rust or JavaScript, I can still use them from my Dart program, or something like that. And that’s the world that we want to get to, because then we can start reusing code instead of having to rewrite code.
And then instead of having nine different YAML parsers, every one with different divergences from the spec, every one with different bugs, we can concentrate on writing one really good one in a language that’s well suited for it, like, say, Rust. And then when it comes to, like, AI libraries, we can use all this stuff in the Python ecosystem, even if I’m writing code in JavaScript or TypeScript. And that, I think, is a step away from complexity. And we just now, like literally within the last few weeks, have gotten past that milestone. So I think from here forward, my hope is that as you start looking at this tooling, as it evolves over the next several months, this stuff is going to get easier and easier. We’re not quite there yet. It’s easy to build your first WebAssembly application; components are still a little bit hard to assemble. So the next thing will be, how do we make it easy to build applications out of components? And then at that point, I think we can start telling a very compelling story that we can build a less wasteful, more fun way of building applications based on, you know, WebAssembly component binaries, instead of lots and lots of different languages and lots of different libraries. Yeah, fascinating.
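The export/import negotiation described in this answer can be pictured with a toy model. The sketch below is plain Python and does not use the real WebAssembly component model or its tooling: `Component`, `link`, and `toy_parse_yaml` are hypothetical names for illustration, and the "YAML parser" only handles flat `key: value` lines. The one idea it tries to show is that linking matches imports to exports by name and never looks at the language a component was written in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Component:
    name: str
    source_language: str  # informational only; linking never inspects it
    exports: Dict[str, Callable] = field(default_factory=dict)
    imports: List[str] = field(default_factory=list)


def link(consumer: Component, providers: List[Component]) -> Dict[str, Callable]:
    """Resolve each import of `consumer` to a matching export from some provider."""
    resolved: Dict[str, Callable] = {}
    for needed in consumer.imports:
        for provider in providers:
            if needed in provider.exports:
                resolved[needed] = provider.exports[needed]
                break
        else:
            # No provider offered this function, so the composition is invalid.
            raise LookupError(f"no component exports {needed!r}")
    return resolved


def toy_parse_yaml(text: str) -> Dict[str, str]:
    """Toy stand-in: handles only flat 'key: value' lines, not real YAML."""
    return dict(line.split(": ", 1) for line in text.splitlines() if line.strip())


# A parser "written in Rust" consumed by an app "written in Dart" (labels only).
yaml_parser = Component("yaml-parser", "Rust", exports={"parse-yaml": toy_parse_yaml})
app = Component("my-app", "Dart", imports=["parse-yaml"])

if __name__ == "__main__":
    linked = link(app, [yaml_parser])
    print(linked["parse-yaml"]("name: foo\nkind: demo"))
```

In this toy version a missing export fails at link time rather than at call time, which mirrors the idea of declaring up front what a binary needs and what it provides.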
Eric Dodds 52:03
I mean, I think that this is a really good story around how consolidation needs to happen at a lower level in the stack, because the requirement of different teams and different jobs, to your point, is that, well, something may need to be written in Rust, right, but something else may need to be written in JavaScript, right? In terms of the runtime, that really needs to be the layer where sort of everything comes together, which is fascinating. Well, yeah, Matt, I do have a personal question for you, which I’ve been, you know, waiting to ask, because I want to hear your answer. So, in high school, you wanted to be a philosophy professor, which is fascinating, for sure, because you were interested in how the world operated. My question is, why did you choose philosophy instead of what we would call the harder sciences, right? As a software developer, I probably would have put my money on you going with, you know, more of like a mathematics degree, or biology or chemistry, because those are concrete ways to describe how the world works. But you know, with philosophy, yeah.
Matt Butcher 53:32
Every philosopher would have been offended by your question, because a philosopher would say, but where do you think science came from? It came from philosophy, right? And I guess part of it to me was, there’s this sort of rudimentary part, right? I wanted to see how far back I could push it. And I didn’t understand a lot of this in high school. And in ways I got lucky that my naivety about things led me into a discipline that really did help me think through this. But, you know, we were talking about the difference between physics and metaphysics, and that was sort of the thing for me, right? Like, I don’t want to know how a mechanism in the world works. I want to know how the world works. And philosophy seemed to be the most promising way to do that. I think a second thing is that we have this notion of wisdom. And wisdom is a hard thing for us to wrap our heads around. But I think the way that you see that in Plato and the dialogues of Socrates, wisdom kind of comes across as, you know, that ability to ask questions and admit that I don’t know the answers and be open to hearing the answers, contrasted with knowledge, which is when you do know the answers, and it’s about applying the answers. There was something about that definition of being wise, you know, as a description of being a continual seeker, right, someone who’s continually asking questions and collecting little tidbits and trying to evolve their own view. That was very enticing to me as a young person, and still, even today, that’s the kind of thing that gets me excited about philosophy as a discipline.
Eric Dodds 55:05
Love it. Man, it’s been so great to have you on the show. We learned so much. Thank you for introducing us to a new topic that we haven’t covered. And thanks for taking a couple of philosophical questions from us.
Matt Butcher 55:19
Yeah, that was a lot of fun. Thanks for doing that. I had a fantastic time.
Eric Dodds 55:24
We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe to your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.