In this episode of The Data Stack Show, Kostas Pardalis and Eric Dodds talk change data capture (CDC) with DeVaris Brown, co-founder and CEO of Meroxa. Their conversation digs into the benefits of utilizing CDC and how Meroxa is using it. Highlights from the conversation include:
- Introduction to DeVaris and Meroxa (3:24)
- Why CDC has more traction today (6:58)
- How CDC is changing the way we build products (12:52)
- Where CDC is playing an important role (21:11)
- The experience that Meroxa delivers (24:42)
- Looking at Meroxa’s sources, technology and data stack (27:28)
- DeVaris’ vision for the company (37:10)
DeVaris Brown and Meroxa
“If data is the new oil, we want to power the refinery.” That’s the slogan that DeVaris Brown applies to Meroxa, the company he co-founded with Ali Hamidi, and it’s fitting because the name is derived from a process for removing impurities from jet fuel.
Launched in 2019, Meroxa is positioned as a company that provides access to a real-time data infrastructure for anyone, not just for mega corporations with unlimited resources.
DeVaris, a former Microsoft engineer who worked on Hotmail (and still gets a kick out of seeing someone with a Hotmail address today), also worked at Zendesk and then was a product manager at Heroku where he built features for developer experience. It was there that he and Ali got the inspiration to launch their own venture, Meroxa.
It’s worth noting that this young company is hiring and has been very intentional about building an inclusive team. DeVaris mentioned the goals of hiring at least 25 percent women, providing work-life balance with an unlimited vacation package, offering a remote-first work setup and paying employees well.
Meroxa’s clientele are data engineers and dataware engineers from smaller organizations. “We’re decidedly focused on the smaller folks so that we can build a community around real-time streaming,” he said. “We saw the power of empowering the one-or-two person team and making that experience super easy to build this infrastructure and not have to think about it.”
DeVaris went on to paint the picture of a hypothetical company debating between spending the salary cost of a few engineers, spending millions of dollars and about a year building up an off-the-shelf solution, or picking up Meroxa and getting access to the data that bigger companies are using, all as fast as you can type a command.
Meroxa is planning on going open source soon with its technology. For them, the IP isn’t in the components themselves; it’s in the “puppet-mastery”: how those components are stitched together and operationalized, scaled and maintained. DeVaris supports open source as a way to democratize access and education.
When asked about the data sources supported by Meroxa, DeVaris pointed to a variety of them: turning API endpoints into a stream, pointing to a URL, Cassandra, Postgres, MySQL, Mongo, Oracle, Kafka streams, S3, GCP, Snowflake, AWS tools and much more.
With co-founder Ali’s expertise in Kafka, DeVaris noted the importance of Kafka to their operation. “Everything talks to Kafka anyway,” he said. “Our duty is to get it into Kafka and out of Kafka. That’s really what we do underneath the hood.”
Change Data Capture
Meroxa utilizes change data capture to analyze data and is part of the shift away from a focus on capturing events, which can take weeks or months to plan out. For DeVaris, CDC as a pattern makes a lot more sense because it gives users much more fidelity and granularity in the data going through the system. It addresses the problem of mixing transactional and event stream data and can turn a database into an event stream. One of the biggest benefits of CDC is seeing changes in real time as opposed to just the end result.
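To make the idea of turning a database into an event stream concrete, here is a minimal Python sketch. It is not Meroxa's implementation: the `TableWithChangeLog` class, the `ChangeEvent` shape and the `orders` data are all hypothetical, and the toy table records changes by wrapping writes, whereas production CDC tools typically read the database's own transaction log. It only illustrates the difference between the end result (the table) and the full sequence of changes (the log).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


@dataclass(frozen=True)
class ChangeEvent:
    op: str                # "insert", "update", or "delete"
    key: str               # primary key of the affected row
    before: Optional[Row]  # row image before the change (None for inserts)
    after: Optional[Row]   # row image after the change (None for deletes)


class TableWithChangeLog:
    """Toy in-memory table (hypothetical) that logs a ChangeEvent per write."""

    def __init__(self) -> None:
        self.rows: Dict[str, Row] = {}
        self.log: List[ChangeEvent] = []

    def upsert(self, key: str, row: Row) -> None:
        before = self.rows.get(key)
        op = "insert" if before is None else "update"
        self.rows[key] = dict(row)
        self.log.append(ChangeEvent(op, key, before, dict(row)))

    def delete(self, key: str) -> None:
        before = self.rows.pop(key)
        self.log.append(ChangeEvent("delete", key, before, None))


def replay(events: List[ChangeEvent]) -> Dict[str, Row]:
    """Rebuild a downstream copy of the table purely from change events."""
    copy: Dict[str, Row] = {}
    for event in events:
        if event.op == "delete":
            copy.pop(event.key, None)
        else:
            copy[event.key] = dict(event.after)
    return copy


orders = TableWithChangeLog()
orders.upsert("o1", {"status": "placed"})
orders.upsert("o1", {"status": "paid"})
orders.upsert("o1", {"status": "shipped"})
orders.upsert("o2", {"status": "placed"})
orders.delete("o2")

# The table holds only the end result; the log holds every intermediate step.
history = [(e.op, e.key) for e in orders.log]
```

Querying `orders.rows` shows only that `o1` has shipped, while `orders.log` also records that it was placed and paid, and that `o2` existed at all. A downstream system that consumes the log can call `replay` to keep its own copy in sync.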
DeVaris provided several use cases for CDC: instant data warehouses that accurately reflect the entire data picture; real-time compliance, dashboards and search indexing for platform engineers; and real-time inventory for e-commerce. In e-commerce, for example, another tool might take five to thirty minutes to provide an updated view, and a lot can happen to the inventory in that amount of time, say on Black Friday or Cyber Monday, so the benefits of real-time data provided by CDC become even more apparent.
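The inventory point can be illustrated with a small Python sketch. The numbers are invented for illustration: the `changes` list, the `tv` SKU and the `stock_at` helper are hypothetical and do not come from the episode. The sketch compares what a consumer sees when it refreshes every five minutes with what it sees when it applies each change almost as soon as it happens.

```python
from typing import Dict, List, Tuple

# (seconds since start, sku, quantity change): hypothetical sales of one SKU
changes: List[Tuple[int, str, int]] = [
    (0, "tv", +10),   # initial stock arrives
    (60, "tv", -4),
    (120, "tv", -5),
    (200, "tv", -1),  # last unit sells here
]


def stock_at(
    changes: List[Tuple[int, str, int]], now: int, refresh_every: int
) -> Dict[str, int]:
    """Stock as seen at time `now` by a consumer that only picks up
    changes at multiples of `refresh_every` seconds."""
    last_refresh = (now // refresh_every) * refresh_every
    stock: Dict[str, int] = {}
    for ts, sku, delta in changes:
        if ts <= last_refresh:
            stock[sku] = stock.get(sku, 0) + delta
    return stock


# At t=250s a five-minute batch view has not refreshed since t=0.
batch_view = stock_at(changes, now=250, refresh_every=300)

# A one-second refresh stands in for a streaming consumer.
streaming_view = stock_at(changes, now=250, refresh_every=1)
```

In this made-up scenario the batch view still reports ten units in stock at the 250-second mark, while the near-real-time view correctly shows the item as sold out.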
DeVaris also walked through an example of a casino in Las Vegas that was missing out on eight or nine figures of revenue because it couldn’t get the data it needed in a usable format in real time. Its promotional offer intended for VIP customers was reaching them in time in fewer than 10% of cases. “End-users and customers are used to highly relatable and highly personalized content and you need to have that real-time capability,” he said. “You need to know, at this moment, what is the most relevant information that I need to make a decision about what to put in front of a customer that they can act on.”
“This information’s already sitting in your database, you’re just not utilizing it because you don’t know the best way to get it out,” he added. “For us, we provide an easy way for you to get that out initially and on-going so that you can use it however you want to, whether it’s search, recommendations, marketing, automation, data warehousing or analytics.”
With CDC, Meroxa works alongside your existing system and provides modern functionality. DeVaris best summed up the conversation this way: “You just point us at a data source and we figure it out.”
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.