This week on The Data Stack Show, Eric and Kostas chat with Viren Baraiya, the Co-Founder and CTO of orkes.io. During the episode, Viren discusses the evolution of orchestration in the context of AI and large-scale systems. The group discusses the transition from Viren’s work at Netflix to founding orkes, the challenges of integrating AI into applications, and the importance of orchestration in managing these complexities. He also highlights the non-deterministic nature of AI, the need for guardrails, and the potential for AI to change how we interact with technology. The episode also covers the recent move of Netflix’s Conductor project to a community foundation, the future of AI in business and its impact on job creation, and more.
Highlights from this week’s conversation include:
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 00:05
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com. We’re here on The Data Stack Show with Viren Baraiya. And it’s so great to have you back on the show. Amazing that it’s been, I guess we can say, about a year and a half. So thanks for giving us some time here again.
Viren Baraiya 00:40
Nice to be here. And thanks for hosting me.
Eric Dodds 00:42
Absolutely. Well, you’ve covered quite a bit of ground since we last talked, but can you just give us a quick overview of what you’ve been up to for the last year and a half and the company you’ve been building? Yeah, absolutely.
Viren Baraiya 00:55
So I remember the last time when we chatted, you know, we had just kind of come out of stealth mode, and we were still focusing on building the product. We were basically taking Conductor and, you know, building orkes to make it available to enterprises on various clouds. And of course, it almost feels like, you know, a couple of decades since we started this. But yeah, in the last couple of years, you know, we have built out a product that works on all three clouds, built partnerships with other cloud vendors, and have customers on board, which is great and keeping us busy. And fortunately, or unfortunately, they are across pretty much every time zone, so, you know, that’s also keeping us busy all the time. But it has been an exciting journey, building a company from the ground up and, you know, going all the way from zero revenue to some revenue. Absolutely.
Kostas Pardalis 01:52
Yeah, that’s amazing. And last time, we talked a lot about microservices and the orchestration of microservices, right. And orchestration in general comes up more and more often lately. There are, like, a lot of conversations about it when it comes to, let’s say, the application development layer and this fusion of database systems, like transactional systems, together with orchestration. And then of course, there’s AI, right, and anyone who has tried to build something around AI definitely knows that it’s all about how to coordinate all these different models and services to get, like, a consistent result. So orchestration is becoming very hot and, like, an interesting topic, and a broader topic than what we were discussing a year and a half ago. So I’m very excited to talk about that, and also to hear about the evolution of orkes from back then to today, right? Yeah. What about you? What are you excited to talk about today?
Viren Baraiya 03:00
Yeah, I think as you rightly pointed out, right, orchestration has very deep roots. It has been around for a long time. But lately, you know, it has kind of taken on its own form when it comes to AI. And just like everybody else, I like to think that’s one very exciting thing that is happening, and, you know, how that blends with orchestration. I mean, now we hear a lot about AI orchestration, right. That definitely was a thing before, but nobody talked about it. So I think overall, the entire orchestration space is evolving very rapidly. And, you know, where it is going, I think, is certainly an exciting place to be today.
Kostas Pardalis 03:39
Yeah. 100%. So I think we have a lot of very interesting things to talk about. What do you think, Eric?
Eric Dodds 03:46
Let’s jump in,
Kostas Pardalis 03:47
let’s do it.
Eric Dodds 03:48
We’re in. It’s amazing that after, I mean, almost exactly a year and a half, you know, give or take a week or two, you’re back on the show. And just yesterday, before we recorded this, TechCrunch published an article about Netflix “abandoning” the Conductor project, which you helped to build inside of Netflix, and orkes, the company that you’ve been building for the last couple of years, forking it and taking ownership. So I love the timing on that. Can you give us a little bit of the backstory and, you know, tell us about that news?
Viren Baraiya 04:28
Yeah, absolutely. I mean, the title is a little bit baity, but I have been working with Netflix for a while. And the idea was this, right: you know, Conductor has become very popular as an open source project. Today, when you look at the number of companies using it, it’s like, you know, a who’s who of tech companies and large enterprises, right. And supporting them as a community is definitely a full-time job; it becomes a challenging thing. So, you know, we kind of stepped in and started working with Netflix, and the overall goal there was, you know, how do we enable the community to be kind of, you know, partners here, right? How do we get them to be more excited about it, give them more of an ownership stake in the product roadmap and how everything moves. And the only feasible way there is, you know, essentially you create a foundation around it and, you know, get everybody to participate. So I think that’s basically what we finally decided to do. It took us a while, because, you know, you have to go through all the legal and other kinds of things. But the good thing is, it has happened now. So, you know, it is an exciting place; the initial feedback from the community also is very encouraging, because suddenly they feel like, you know, now they have the ability to be part of the community, drive the project, and everything. And overall, I think it is just going to make the project much stronger in terms of, you know, its adoption, its visibility, and, you know, how the community can contribute. So it’s very exciting. It’s super exciting for us.
Eric Dodds 05:56
That’s great. I think maybe what happened with the clickbait title was that, originally, it was “Netflix hands off.” But then, you know, they asked an LLM to write something that would get more clicks, and it changed “hands off” to “abandons.” Well, let’s just, you know, for the listeners who didn’t catch our last episode, which if you didn’t, you should go back and listen to, just give us a breakdown of Conductor and orkes, and just describe what the products do. Yeah, absolutely.
Viren Baraiya 06:27
So Conductor, essentially, is an orchestration engine. And orchestration, of course, is an extremely loaded term, right? It means different things to different people and personas. But at the core of it, you know, Conductor was designed and built to build event-driven applications, right, applications that respond to events happening in the business context. And this could be orchestrating microservices, orchestrating different kinds of events, you know, messaging buses, and things like that. So that’s what Conductor does. And it does it very well, in terms of, you know, handling both business and process complexity, as well as, you know, being able to handle it at a much, much larger scale. And orkes was founded to take Conductor and, you know, provide an enterprise version, realizing that, you know, there is a need and a demand for a product like this. But at the same time, just like, for example, Linux is completely open source, you can just go to kernel.org and build your entire Linux and, you know, get all the GNU projects. But as an enterprise, you probably want to work with a vendor to get everything ready, right. And this is the model that, I would say, over the last few years has been perfected by a number of companies, in terms of how you build an open source project and also monetize it. Yep.
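For listeners who haven't seen one, open-source Conductor workflows are defined as JSON documents listing the tasks to orchestrate. A rough sketch of that shape, expressed here as a Python dict with hypothetical task names (the field names follow the open-source project's format, but check the Conductor docs for the authoritative schema):

```python
# Sketch of a Conductor-style workflow definition as a Python dict.
# Task and workflow names below are hypothetical, for illustration only.
order_workflow = {
    "name": "order_fulfillment",   # workflow identifier
    "version": 1,
    "schemaVersion": 2,
    "tasks": [
        # Each step references a task by name; SIMPLE tasks are executed
        # by external workers that poll the orchestration server for work.
        {"name": "validate_order", "taskReferenceName": "validate", "type": "SIMPLE"},
        {"name": "charge_payment", "taskReferenceName": "charge", "type": "SIMPLE"},
        {"name": "ship_order", "taskReferenceName": "ship", "type": "SIMPLE"},
    ],
}

# The engine walks the task list in order, tracking state between steps.
task_names = [t["name"] for t in order_workflow["tasks"]]
```

The point of the design is that the application's control flow lives in this definition, outside the services themselves, so the engine can retry, resume, and observe each step.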
Eric Dodds 07:48
Yeah. Love it. And give us a quick refresher, because like you said, orchestration is a very loaded term. When we think about the world of data, you’re thinking about pipelines, right. And so with event-driven orchestration, you know, or event-driven applications, you tend to think about pipeline jobs starting or completing or failing. But Conductor includes pipelines, and it really encompasses sort of any microservice. And so just to help us understand, that’s a much, much wider scope of orchestration.
Viren Baraiya 08:27
That is correct, yeah. Like, when you think about data pipelines, you know, data pipelines tend to be a lot more, I would say, coarse-grained, right; you have pipelines running on a daily basis, and every step in the pipeline sometimes runs for hours on end. And then there’s the extreme other end of that, where you have, you know, microservice or event orchestration, where every step completes in milliseconds, and then you are running millions and millions of them every day. And, you know, the audience is very different in terms of who is writing those things, right; on one hand, you have data engineers focusing on data pipelines. And the important thing there is also the dependency management, right? Like, in fact, data pipelines really are event-driven systems, because, you know, you start a pipeline when something happens, right, a file arrives, or some job completes, or whatnot. We don’t really think about it that way, but that’s really the essence of it. On the other hand, it’s very similar as well. And what’s interesting is, somewhere in between comes this process orchestration, which is a lot more human-centric, where you also have human actors inside the process, you know, taking different actions. Very good examples here are, you know, approval processes, you know, with various use cases, right? Loan application approval is a classic example, where somebody has to review it, right? Yeah, sure. Right. So, you know, that’s in terms of what kind of systems you’re building. But then orchestration means different things for, you know, the end-user roles: for data practitioners, of course, it’s data pipelines; for software engineers, it’s more about microservices and events. But when you start to go outside the boundary of just engineering, right, when it comes to a product person, it’s about how my product is built, you know?
And what are the nuts and bolts? What are the optimization opportunities that I have? A very good example, I would say, is, let’s say, a supply chain. As a product manager, if I look at my process and see, you know, how long it takes from somebody placing an order until the order arrives at the doorstep, and it takes three days: what are the steps? How long does each of them take? If I want to cut it down from three days to two days, where should I optimize? Where in the process? For them, it’s a totally different way of thinking about it, right? And if I’m in support, I want to know what’s going on and, you know, how I can fix things. I don’t care about how it’s built and what the use case is and things like that. And then, on the extreme end of that, as we go closer to the metal, you have orchestration of the infrastructure. Kubernetes is an orchestration engine that orchestrates your components. Sure. So I think when people think about it, depending upon what persona and what hat they are wearing, you know, they think about it differently. Conductor, I would say, is very broad. As a matter of fact, when we built orkes, our entire orkes stack runs on Conductor, which means our CI/CD, our deployment, our entire cloud provisioning infrastructure runs on Conductor. So we use it for everything from infrastructure orchestration to process orchestration: we run our customer service on Conductor, we run our stand-up boards on Conductor, we run our AI bots on Conductor, like, basically everything. Wow,
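The product-manager view Viren describes (where do the three days go, and which step is worth optimizing) can be sketched as a toy calculation over per-step durations. Step names and numbers here are made up for illustration; in practice an orchestrator would record these timings per workflow execution:

```python
# Toy view of a fulfillment process: per-step durations in days.
# A PM asking "where do I cut a day?" starts with the biggest step.
steps = {
    "place_order": 0.1,
    "payment_clearing": 0.4,
    "warehouse_pick": 1.2,
    "last_mile_delivery": 1.3,
}

total_days = sum(steps.values())        # end-to-end latency of the process
bottleneck = max(steps, key=steps.get)  # the step with the largest share
```

With the flow expressed as an explicit workflow, these numbers fall out of the engine's execution history instead of having to be instrumented by hand.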
Eric Dodds 11:39
that’s it. What a cool opportunity to dogfood your own product across sort of your entire company infrastructure as well. I want us to dig into the tech aspect of this on the show and just hear about what you’ve been building, right, because it’s been a year and a half. But I’m interested to know, just, you know, on a personal level, you’ve tackled some gigantic engineering projects, obviously, you know, Conductor still has a lasting legacy, and you’re a big part of that. What has the transition been like, going from being an engineering leader inside of these really large organizations with really gnarly engineering problems, to being a founder and, you know, a couple of years ago, starting out with just a couple of people and building? Yeah, I
Viren Baraiya 12:27
think that’s an interesting question. Like, when you think about working at large organizations, right, you know, Netflix, Google, for example, where, you know, as an engineering leader, one thing is, you know, you are part of a really big machine. So there are a few things that you never have to worry about, things like, for example, you know, how much is it going to cost. A good example is, you know, when we ran the predictions engineering at Google, the amount of resources it took was insane, like a data center to run those things, right, because it’s processing data from the internet, which is kind of huge. So, you know, you go from that level of things, right? And when you think about even the numbers, right, either the revenue or the users, you talk about them in hundreds of millions and billions. That’s your kind of denominator, right? Like when you say three, it means 3 billion, not 3 million or 300 million. And then you go from there to being a startup founder, and, you know, you have to now think about everything, right? Like, you know, is this going to cost $200 or $300 to run, and where can I optimize? So, you know, cost is one part. But more importantly, now there is no support system; you are the support system, you are at the end of the chain, right, there is nobody to complain to. Which means, you know, you are not only responsible for engineering decisions, but also business decisions, company strategy. You are an engineering leader, you are a product leader, you are also an HR person, right, at least in the early days. And how you build your product, how you build your team also has a lasting impact on how your company is going to grow. Because, you know, if you invest in the wrong tech or the wrong people, that’s not going to go very well. Right.
So there is definitely that major shift in terms of, you know, how things move. At the same time, like, you know, when you look at big companies, when you’re working for one, there is kind of a cushion, right? What’s the worst that can happen when you take on a project? The project can, you know, take longer to complete, it can fail. It does not necessarily materially impact you. It’s different with your own company: if the company fails, you fail, and there’s a lot more at stake. Right. So the stakes really go up quite a bit, right?
Eric Dodds 14:35
You have employees, you know, who are relying on you.
Viren Baraiya 14:39
And now suddenly, you have to think about how, you know, you have 20 people in your company now, and if you make a wrong decision, you’re going to impact their livelihood and, basically, have an impact on their families. You have to be very thoughtful about, you know, what you do and how you do things. And this has nothing to do with the business; it is purely just people, right? So you have to also think about the people aspect of building a company, running a company. And it’s very different; there are a lot of learning experiences. In the early days, we were also the business development people, we were also doing sales. I had no idea how sales worked. If somebody asked me, “What’s your sales cycle?”, I didn’t know what it meant. I can tell you today what my sales cycle is, right. I can talk to and interview sales leaders. Yeah, you know, you end up learning a lot more. So I think that’s an interesting journey as well.
Eric Dodds 15:34
Yeah, you’re obviously the CTO, and so your job is deeply technical. But in terms of the non-technical aspects of your job as a founder or leader, you know, wearing multiple hats, which aspect of your job do you like the most from a non-technical standpoint?
Viren Baraiya 15:49
I would say, in terms of non-technical things, you know, like, we are an enterprise SaaS company, right. So learning and, you know, understanding how to sell to enterprises has been a real eye-opener, and, like, understanding those kinds of processes, you know, how companies operate. Yeah, it has been a very fascinating thing. Because I have always worked in large companies where, like, you know, the purchasing decisions are always made by the purchasing team; you never directly deal with them, you are on the consumer side. But now you’re on the other side. And, like, you know, understanding those nuances is very interesting. The other thing that I really love is, you know, we are a company founded on the foundations of open source. So, I think, getting out into the community and working with the community, it’s not very technical, but it is also deeply satisfying, because when you see people saying good things about the product, you know, adopting it, they don’t even have to be your paid customers, but, you know, it’s deeply satisfying in that, like, you know, what you did definitely helps people. So yeah, those are the two aspects that I really enjoy.
Eric Dodds 17:04
That’s great. Well, congratulations. I mean, it sounds like it’s been an incredible journey and learning experience. Let’s start to focus a little bit more on the technical side. I know one of the things that you and Kostas are excited to talk about is, you know, AI, and how orchestration and AI fit together. And you have some really interesting thoughts about software development. So as a preamble to that, what I want to ask about is the perspective that you’re bringing to that from your time at Google. At Google, you worked on a product that allowed people to take advantage of Google’s machine learning infrastructure. So, you know, a product that made predictions. And you were working on that, it seems like, right up to the edge of, you know, this massive explosion in the AI craze driven by large language models. What perspective do you bring to AI based on your experience at Google and actually building a product around that? Yeah,
Viren Baraiya 18:16
I would say, you know, when you think about working on AI or machine learning, there are two aspects to it, right? Either you are a deeply technical person, kind of researching models, or working on, you know, foundational frameworks, like TensorFlow, for example, right, or working on the hardware side and building chips. What was interesting about my time at Google was, like, you know, our focus was: how do we democratize AI for the average developer, whose primary job is, like, you know, “I’m building an app, and my app sustains itself through either ads, or, you know, in-app purchases and subscriptions.” And, you know, when you look at companies like Netflix, right, who actually pioneered, you know, how you create a higher level of user engagement through A/B testing and personalization: how do you make the same kind of technology available to an average developer who does not have that kind of resources? And it’s not even possible, because they don’t even have that kind of data available to begin with. Yeah. And the challenge there was, like, you know, let’s say you are building an app, and this is a simple game with maybe, say, 10,000 users who are playing the game. Now, 10,000 data points is probably not sufficient to train a sufficiently accurate model. So, you know, how do you solve that problem? And the way we thought about it was, like, you know, yes, you have 10,000 users. But if you look at the internet as a whole, there are probably four to five billion apps in the world from which you can train a model, and that’s a large enough inventory of data to train a federated model. So basically, that’s the approach we took: there’s all this data coming in, right?
Can we get insights, be informed, and, you know, figure out the user personas, and essentially make it available as a service to developers? So that now, instead of me trying to invest in it myself, if I’m a company like Unity, maybe I can do that, but, you know, for a small-time developer or even an average developer, you don’t have to think about it. I can say, you know, “Hey, this is a user, tell me how likely this person is to make a purchase, or click on an ad, or, you know, stay engaged in my app.” So we actually did two things. The first one was this, you know, telling you the likelihood of what the person would do next. And the way we thought about it was that that was good enough initially to let developers make a decision. Now, if you think about it from a developer’s perspective, what decision are you going to make? You know, most likely you’re going to flip a coin and say, “I’m going to try something.” So, you know, that’s basically a 50% chance in terms of whether you’re going to be right or wrong. So we started working on the second part of that, to say, you know, can we now optimize these things? So we essentially launched, later on, optimization as a service as part of Firebase, which I think is still there in production. You know, you tell us your objective, whether you want to increase engagement, spend, or get them to renew the subscription, and you tell us what kind of experiences you can deliver, A, B, C, D, and we’ll figure out the right experiences to deliver and find the right user bucket. So that was kind of what we did with AI and machine learning, right? And I think the whole thing around democratizing AI has become commonplace now. But those were the days when, you know, Google kind of pioneered some of those things.
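The optimization-as-a-service idea can be sketched as a toy bandit: the developer supplies candidate experiences A, B, C, D plus observed outcomes for an objective, and the service decides which experience to deliver next. This is a minimal epsilon-greedy sketch under those assumptions, not the actual Firebase API:

```python
import random

def choose_experience(stats, epsilon=0.1, rng=random.random):
    """Pick an experience given stats: {experience: (successes, trials)}.

    With probability epsilon, explore a random experience; otherwise
    exploit the one with the highest observed success rate.
    """
    if rng() < epsilon:                       # explore occasionally
        return random.choice(list(stats))

    def rate(exp):                            # observed success rate
        successes, trials = stats[exp]
        return successes / trials if trials else 0.0

    return max(stats, key=rate)               # exploit the best so far

# Hypothetical outcomes for four candidate experiences:
stats = {"A": (12, 100), "B": (30, 100), "C": (18, 100), "D": (9, 100)}
best = choose_experience(stats, epsilon=0.0)  # pure exploitation
```

The real service Viren describes presumably used far richer models and per-user bucketing, but the contract is the same: developers state the objective and the candidates, the service picks.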
Kostas Pardalis 21:55
Yeah, so about a year and a half ago, we were, again, chatting, and we were talking about microservices at the scale of Netflix, and what it means to have all of these, and why they are needed, and why we need software like Conductor to do that. What has changed since then? The reason I’m asking is, and I’d love to hear your personal opinion and experience on this, I think one of the things that happens is that, you know, people work in these very unique environments, right, like Netflix or Google, which have unique problems in terms of scale, but also, as you said, unique resources and unique talent, right? Like, the talent that you find in these companies is rare. But when you go out to the market, and you build a company, and you try to bring, let’s say, all these innovations from these companies to the rest of the market, you start experiencing differences, right? Like, the rest of the world is not a replication of Google and Netflix, right. So my question is about what you have experienced through this one and a half years, like, working with the market out there. What are the differences between an organization like Netflix and Google and the rest of the market? And then the second question is about the product, or the technology: from purely talking about microservice orchestration to what orkes can do today, right? And what’s the link between the two? Because when you build companies, obviously, you react to the market signal, right? That’s why I’m trying to bring these two questions together. Absolutely.
Viren Baraiya 23:54
So I think the first question is a great one. And that was a very insightful thing to learn as a founder as well. You know, my history has been at Netflix, Google, Goldman Sachs, so, you know, very tech-forward companies, and you get to work with a great group of talent, right? And when you build a product at companies like this, you have a certain user persona in mind, right: these are my developers, this is how they work. And then you try to bring that to market, right. So one realization was that, you know, when you look at the developer side, there is what I would like to call the 99% developer, right, and the 1% developer. I think tech companies tend to be more of the 1% developer: they like hard problems, they like to solve big challenges and think about distributed systems and everything. When you look at the rest of the world, let’s say you go to, let’s say, General Electric, right, GE, or some traditional company, the focus there is: I have this feature to be built, I have this product to be launched, and this is my timeline. How can I get there fast, right? There’s less thinking about the rest. Also, you know, not everybody can pay Google or Netflix level salaries, and there isn’t that kind of talent available in the market either, right, which means, you know, you’re also working with very different levels; you know, not principal engineers, but, you know, the rest of them. So one thing that we quickly realized was that, you know, for a product to be successful, it’s very important for the product to appeal to that 99% developer group, who is not interested in solving distributed systems or very difficult, NP-hard problems; they’re interested in solving their current problem, which means usability is paramount.
When we think about usability, like, when you build an app or a site, right, people a lot of times don’t think about developer experience, but, you know, developer experience is paramount. In my personal opinion, I don’t think we are 100% there; we constantly keep on, you know, improving, and we try to work with very junior, sometimes, you know, freshman developers to figure out, like, oh, where can we improve this? But that was the number one thing, right: what matters to them is very different from what we had initially built, right, given their skill sets. And, you know, where you should focus is an interesting insight.
Kostas Pardalis 26:06
Yeah, it makes total sense. And I think that was also, like, my experience. And I actually would add to that: it’s not just a matter of, like, quality of talent. It’s also that, when you’re talking about, like, General Electric, the core competence of the company is not distributed systems; they don’t care about that, they shouldn’t care about that, right? Like, that’s not their thing. The same thing with, like, Bank of America, like all the companies out there that are, like, Fortune 100 type of firms, because they do something else really well. That’s right, for sure, not how to build data centers and distributed systems. Yeah, exactly.
Viren Baraiya 26:43
Like, you know, for example, Bank of America would probably want to spend their efforts and energy on making sure that banking is a first-class thing, right? It’s rock solid, it is best in class, not how do I solve distributed systems. That’s not their core competency. There’s no point investing in that; it doesn’t make any sense from the business perspective. Companies like Google or Netflix, they’re all about tech. Tech is what drives those companies. So it’s a different
Kostas Pardalis 27:06
perspective, yeah, 100%. So, okay, you mentioned, like, developer experience and, like, a different, let’s say, prioritization in terms of, like, what features are more important, or what value is perceived by the user, right? Like, what is the value for the user? What has changed in terms of, let’s say, the use cases? Because when we started, and we were talking back then, again, a year and a half ago, it was about orchestrating microservices. But what are the dominant use cases today?
Viren Baraiya 27:42
I think it has definitely evolved, and in some surprising ways as well. So, you know, microservices were one thing. I think today, when I look at the current set of use cases, the predominant ones are more event-driven kinds of orchestration, you know, service workers. You know, microservices also got a little bit of negative press recently, right, with, you know, the blog posts and everything; not everything there is necessarily in the right context, but, like, you know, you sometimes don’t need a microservice with an HTTP or gRPC endpoint for every problem and every solution. Service workers are actually much more lightweight, and probably better in terms of infra and speed of development and deployment and everything, right. So that’s one area where I’ve seen a lot more usage of Conductor. And also, like, you know, instead of saying that every deployment is one microservice, sometimes your deployment almost looks like a monolith, but then you have different components talking to each other asynchronously, and therefore it is still not a monolith, but rather an event-driven system. Another surprising set of use cases that I’ve seen is how you build user experiences. This came to me as a complete surprise: you know, there are times where you want to drive your user experience based on various different parameters, and you want to make it dynamic. And traditionally, one way to do that is that you encode the entire logic in your UI application or your mobile app. Mobile apps, actually, were one of the trailblazers in that area, because, you know, on the web, UI is very straightforward: you can deploy a new version and everybody gets it; with mobile apps, people have to download it.
So developers are already kind of doing that by using things like remote config, or LaunchDarkly, where they are driving user experiences based on feature flags on the server side. But now I have started to see those things happening with Conductor as well, where you have a UI flow designed in Conductor, and then the UI is driven based on that flow. And then, you know, a product manager is changing the flow based on the experiences they want to drive, and things like that. That was a very surprising use case, but when I think about it, it makes a lot of sense. Yeah.
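The server-driven UI pattern Viren describes can be sketched in a few lines. This is a toy illustration, not Conductor's actual API: the flow definition lives server-side, where a workflow engine or a product manager can edit it, and the client only asks which screen to render next. All names here (`SIGNUP_FLOW`, `next_screen`) are invented for the example.

```python
# The flow lives on the server; changing this dict changes the user
# experience without shipping a new client build.
SIGNUP_FLOW = {
    "start":  {"screen": "email_form", "next": "verify"},
    "verify": {"screen": "otp_form",   "next": "done"},
    "done":   {"screen": "welcome",    "next": None},
}

def next_screen(flow, state):
    """Return the screen the client should render for the current state,
    plus the state to advance to once the user completes it."""
    step = flow[state]
    return step["screen"], step["next"]
```

A client in this model is a thin renderer: it calls `next_screen`, draws whatever screen comes back, and reports completion, so reordering or inserting steps is purely a server-side change.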
Kostas Pardalis 29:58
Okay, so that's super interesting. But who is the user here? We talked about product people, we talked about developers. Obviously, we're talking about application development here, but application development is a complex thing, right? It involves a lot of different roles, different types of developers, from front-end developers to, you know, even DBAs at the end managing the databases. So who is the user? Who gets, let's say, the most value from a system like orkes?
Viren Baraiya 30:30
I think it's the software engineer, right? The developer working on the back end or the front end. In the end, that's the persona that we build the product for. Everybody else gets benefit out of it, but that's not intentional, it's more of a byproduct. In the end, it helps a developer: if I'm using something like Conductor, now I don't have to think about handling error cases, resiliency parameters, and all of those things. I can just think about building stateless screens and orchestrate that separately. So a big part of their work becomes much easier to build and monitor.
Kostas Pardalis 31:06
And do you see more of the front-end developers being, let's say, the owners of that, or is it the back-end developer? And how do they work together? Because the interfaces between developers are always a very interesting topic and a hard problem to solve in general.
Viren Baraiya 31:26
I agree, and I don't think we have a full solution either. Today, predominantly, our users are mostly back-end engineers, the way I see it. There are people that interact with it from front-end apps; it's still a very small percentage, though it seems to be growing. But I would say right now it's mostly back-end engineers. How do they work together? With systems like Conductor, typically what I've seen back-end developers do is build out the API using Conductor, mock out the data, and then the front end can keep working against it. Then they slowly implement the pieces that start to return real data as opposed to mock data. We have at least a couple of customers that have adopted that kind of strategy, and it has worked pretty well for them. Yeah.
Kostas Pardalis 32:12
Well, that makes all the sense. And okay, there is a new wave of, let's say, transactional systems out there. There is this attempt, especially after Heroku went out of the market, to in a way build on the legacy of Heroku. Because Heroku, I think, in the end was just too early in the market, but they had some amazing ideas there, right? I think the legacy of Heroku will live on and will drive a lot of innovation, now that the market is more mature for this kind of product. So we see a lot of conversations about these new types of back-end systems that are kind of a fusion between a database system and an orchestration system. And okay, in some cases some other stuff too, but I'll focus on these two, because I think the main conversation is how you mix workloads together with transaction boundaries, and what this means in terms of managing infrastructure and building applications. So what do you think about that? How do you see it, what's your opinion, and what do you see happening in the end?
Viren Baraiya 33:38
Kostas Pardalis 36:06
It's not from personal experience, to be honest, but more from my overall experience as an engineer: the value of having, say, an external orchestrator. And again, my experience comes more from data infrastructure, where things tend to run much longer, so the possibility of something breaking is higher, right? Having an external orchestrator means that if a fault happens, you have a different system that can take control. If your database fails, let's say, then your orchestrator can execute logic about how to manage the failure. But when you put the processes together, things get a little bit weirder, right? And that's a purely engineering question that I'm asking you, as one of the most experienced engineers I know. With these architectures, is there a way to guarantee that if we put together, let's say, a transactional database with an orchestration system, then when something goes wrong with the database, for whatever reason, the other process that is responsible for the orchestration is going to remain fault tolerant and do what it's supposed to be doing?
Viren Baraiya 37:37
Yeah, I mean, that's exactly the purpose of having an orchestrator, right? And especially the next generation of orchestrators, like Conductor, for example, which are not a single point of failure; they are more distributed, so they give you much higher availability. And you are also de-risking and decoupling yourself from a single database. It helps you in two ways. One is, if your database goes down, you can operate on a cache of data and apply circuit breakers and things like that, so depending upon what kind of user experience you want to drive, you can keep serving. At the same time, especially in a read-only scenario, you can also do things like hedging: send a request to multiple databases and make sure you are able to serve it. And it also opens up an opportunity for separating out your local transactions from the global state. Let's take the example of a payment processing system. When I want to transact, let's say I want to send an ACH through the federal system, I want that system to be transactional. So that particular service could have a local database that maintains the transaction. But globally, you could have other systems too, like sending out an email. If I get two emails, does anybody care, as long as I get the email? So suddenly there is an opportunity to optimize and make systems a lot more decoupled. An orchestrator essentially maintains your overall state, and then you can change it however you wish, depending upon how your process changes.
So not only does it give you resiliency, it also gives you flexibility in terms of how you can do things.
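The read-path resilience trick Viren mentions, hedged requests in particular, can be sketched generically. This is an illustration in plain Python, not Conductor code; the `replicas` callables stand in for real database clients, which is an assumption of the example. The same read goes to every replica concurrently, and the first successful response wins.

```python
import concurrent.futures

def hedged_read(replicas, key, timeout=1.0):
    """Send the same read to every replica concurrently and return the
    first successful answer. Failed replicas are simply ignored, so one
    slow or down database does not fail the request."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(replica, key) for replica in replicas]
        for fut in concurrent.futures.as_completed(futures, timeout=timeout):
            try:
                return fut.result()
            except Exception:
                continue  # this replica failed; wait for another to finish
    raise RuntimeError("all replicas failed")
```

The trade-off is extra load on the databases in exchange for tail-latency and availability wins, which is why Viren scopes it to read-only scenarios.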
Kostas Pardalis 39:30
Yeah, 100%. That makes sense. Cool. So let's move a little bit to AI, because I think if anything has happened with AI, it's that somehow everyone is trying to figure out a new type of orchestrator. Because in the end, what we have here is a system that is not reliable by definition. What used to be the exception with distributed systems, for example, where we had provisions for when something goes wrong, handled by the orchestrator, is now pretty much the opposite. Every time you get a response, I mean, on a semantic level, it's not like an API error, but you need to ensure that in the end you get what you're looking for, which in a way is managing faults, right? It's not that different from an engineering and design perspective. So what I've noticed is that systems like LangChain, all that stuff we see out there, are in the end specialized orchestration systems. What's going on out there? How many different flavors of orchestrators will we have? Because it starts becoming a really hard thing to talk about. It's almost funny: if you talk with a distributed systems person and say orchestrator, it means something almost completely different compared to what an orchestrator means for a data engineer. And if you talk with someone who builds applications with LLMs, again, something completely different. So it's very interesting what's going on there with the definitions. Tell me how you think about that, and what you see actually happening out there.
Viren Baraiya 41:23
Yeah, I think LLMs are interesting, right? Because LLMs are inherently kind of asynchronous, high-latency systems. You can't just make a single blocking call and be done; you are building a system that also chains other things together, right? And that's where orchestration and workflow engines became a lot more important when you want to build applications that leverage LLMs. There is one more problem with LLMs. When you think about building an application the traditional way, it is through APIs, and they are deterministic: if I send a query, I know what I'm expecting. LLMs are non-deterministic, so you also have to handle the non-deterministic aspects. And the moment you put non-determinism in your application flow, you have other things to worry about, which is compliance and security. I've seen this now: companies want to use LLMs, but they are very worried about the aspects of compliance, security, and reputation damage if something goes wrong. So now you also have to put some guardrails on top of it. The guardrails can come in the form of leveraging another LLM to do some sort of adversarial validation of the output, and for very highly sensitive systems, maybe also humans, who can actually review and validate whether it makes sense or not. But all of this requires orchestration. It requires you to build flows that are very flexible and can change. And if you want to run this in production, you also need a distributed system, because everything could be running in different places: your LLM is running on OpenAI, or Azure, or Google, wherever, and your systems are running somewhere else.
And then came an interesting mix with vector databases, where you have this retrieval-augmented generation: now you are also looking up a vector DB within a namespace. How do you protect that? Again, the same set of problems. So I think, as you say, a different class of orchestrators is coming up, focusing purely on prompt chains. But chaining alone doesn't get you an application deployed. You also need to make API calls, look up a database, process data, involve humans, and everything, right? So that's where I think, long term, everything is going to converge into probably one or two orchestrators which can do all of these things and do them very well at scale.
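The guardrail flow Viren describes, a second LLM acting as an adversarial validator with humans reviewing sensitive cases, can be sketched roughly like this. The `generate`, `validate`, and `human_review` callables are placeholders for real LLM API calls and review queues; the names and the three-way verdict are assumptions of this sketch, not any product's actual API.

```python
def guarded_answer(prompt, generate, validate, human_review, max_retries=2):
    """Produce an LLM answer that has passed an adversarial check.

    generate(prompt)          -> draft answer from the primary model
    validate(prompt, draft)   -> "ok" | "retry" | "sensitive" from a second model
    human_review(prompt, draft) -> final answer from a human reviewer
    """
    for _ in range(max_retries + 1):
        draft = generate(prompt)
        verdict = validate(prompt, draft)      # second LLM scores the draft
        if verdict == "ok":
            return draft
        if verdict == "sensitive":
            # human-in-the-loop for output that is risky to ship automatically
            return human_review(prompt, draft)
        # verdict == "retry": non-determinism means a fresh draft may pass
    raise RuntimeError("could not produce a compliant answer")
```

The retry loop is the interesting part: because the generator is non-deterministic, re-running it is a legitimate fault-handling strategy, which is exactly why this looks like orchestration rather than a simple API call.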
Kostas Pardalis 43:54
Yeah, that makes sense. Eric, I have a feeling you might have some more AI-related questions, so I want to give you time for that.
Eric Dodds 44:03
Well, I think we're fairly close to the buzzer here. But, you know, Viren, one of the questions I have: I think you have a very interesting perspective on AI in general. Obviously, there's a ton of hype, a lot of statistics thrown around, a lot of clickbait article titles. You've built products like this inside of companies like Google, with vast, let's just call it unlimited, access to data and compute resources. How can people separate the hype from what is real? I think that even for people who are very technical, it's easy for us to see the immediate practical benefit of, you know, being able to draft a blog article, or even get support on a SQL query, the best way to optimize it, right? That sort of very synchronous feedback on specific problems that would normally take a long time to research through traditional search methods. I think everyone's like, okay, this is great, I don't want to go back to the previous world. But when you talk about, okay, I want to use an LLM to predict the next best action for this user based on all these inputs, well, even though you don't have to build the model that recommends the next best action, which is a huge step forward, that is still phenomenally difficult to get right. And so there's this huge gap. How big is that gap? Or maybe I'm over-interpreting it.
Viren Baraiya 45:54
I mean, definitely there is a gap, in my opinion. And the gap is also narrowing, because there's a lot of investment happening, in terms of research, people, money, and infrastructure going into it, right? The way I see it, if you look at the current state of the world, there's a lot of investment happening in the foundational models. It started with OpenAI, but now you have a plethora of models, some of them completely open source, like Llama 2. So there is one aspect of it, which is essentially democratizing the foundational models. Foundational models are going to become like Postgres: everybody has access to them, and everybody can use them. The question is, how do you use them, and what do you build out of them? So then the question becomes: okay, I have a very powerful model that can do a lot of interesting things, but how do I actually use it to solve my business use case? And I think that's where the gap is today, because, as I said, LLMs are still non-deterministic, and they lack the consistency that we are used to having in a normal system. Bridging that gap is where I think the work needs to happen, so that, as an enterprise, I can say: I can safely incorporate this LLM into my flows and make use of it. And I think the way things are going forward, it is also changing the way we interact with systems. Today, everything has to have a UI, with actions and buttons and everything. A lot of that can move to chat-based interfaces, so assistants are probably going to become a lot more commonplace.
But that also brings up an interesting question in terms of how you actually build those things. This is one area where we are also focusing. Hopefully, by the time this is published, we will have something really exciting there.
Eric Dodds 47:49
Very cool. Very cool. Do you think the companies that are going to thrive in this environment are the ones that help fill that gap?
Viren Baraiya 47:58
I think so. I think so. Because it's almost like a supply chain, right? At the bottom of the supply chain are the hardware manufacturers, then you have the foundational models, and at the end of it is the end user building an application. But in between, there is nobody right now. There's a lot of movement that's going to happen there, and I think that's where the big opportunity is.
Eric Dodds 48:21
Yeah, for sure. Because I think there is this interesting dynamic right now of essentially wrappers on an LLM that are just a UI. And I mean, there will certainly be some companies that make progress there, because there are specific needs, but I also think there are going to be a ton of companies that fail, because the OpenAIs of the world are just going to productize all of that and completely take that business. I mean, Google is notorious for doing that sort of thing. So I'm interested to know, when you think about the enterprise level, and especially as you think about orkes, we're talking about incorporating an LLM into a much larger system. But there are also software providers. We were talking earlier about sending emails and things like that. There are a lot of software providers who are packaging an LLM, essentially building functionality on top of it within their own software, and then reselling that packaged functionality. What are the limitations there? How do you think about incorporating an LLM in a bespoke way at the enterprise level, versus buying it as part of a packaged software suite?
Viren Baraiya 49:44
I think the former is more about how you enhance a product using an LLM. I want to send out an email, and if you notice, Gmail has had that feature for a long time: it autocompletes your sentences very well. So that gives you personal productivity; you enhance your product using an LLM. Copilot is a good example, right? I don't have to necessarily fish for the sample code, I can just autocomplete everything. IntelliJ has been doing that for a while. So that's one area: the application of LLMs in very specific products. Enterprise applications are very custom. They are rapidly changing; they always change. Usually, you build an application, and the lifespan of an application is probably two years at max, and then you redo it again. So you basically need tooling to be able to leverage an LLM to redo the same stuff, or put it together in a different way. So the other class of application is where you leverage an LLM to build those bespoke experiences that you are using internally or selling to another customer. That's the other aspect of it, and to me, that's a lot more interesting, because instead of a very fixed problem, these problems are very dynamic. They are changing, and they are very different from company to company.
Eric Dodds 51:09
Yep. Yeah, that's super interesting. Okay, last question, and this is always a fun one: is there anything that worries you about all of this new technology around AI?
Viren Baraiya 51:25
I don't think so. Wasn't it the same thing when computers came along, and people were saying computers were going to take all our jobs? It created more jobs instead, right? I don't have a specialization in how people think about these things, but I think it's natural. We are always a little bit wary of something that is going to come in and take over jobs and everything. I think developers are getting more productive: I don't have to always go to Stack Overflow to search; I can do it from my IDE. I can probably do a lot more, and I can probably have a lot more quality time for myself. Businesses can move faster, maybe they can do more stuff, so I have more work. So in the end, I think everybody's going to benefit.
Eric Dodds 52:04
I agree. I agree. Well, Viren, this has been such a wonderful time on the show. What a year and a half. Congratulations on all the progress. Congrats on orkes. Congrats on the fork of Conductor and the foundation around that. Just so impressed with everything you're doing, and we'll keep cheering you on from the sidelines.
Viren Baraiya 52:25
Absolutely. Thank you for having me one more time.
Eric Dodds 52:29
We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We'd also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That's E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.