Data Council Week (Ep 5): A Primer on Spatial Data With Gabriel Hidalgo of Carto

April 29, 2022

Welcome to a special series of The Data Stack Show from Data Council Austin. This episode, Eric and Kostas chat with Gabriel Hidalgo, a support engineer at Carto. During the episode, Gabriel explains the difference between location, time, and spatial data, what spatial functions are, and how Carto is used by its customers.

Notes:

Highlights from this week’s conversation include:

  • How Gabriel got into data (1:54)
  • What Carto is (5:28)
  • Location data vs spatial data (6:37)
  • Time data vs space data (7:50)
  • System support for spatial data (9:50)
  • Explaining “spatial functions” (14:19)
  • Who uses Carto and why (15:52)
  • What’s coming for Carto (19:15)
  • What Gabriel does at Carto (22:22)
  • The coolest things Carto’s done (23:52)

 

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Transcription:

Eric Dodds 0:05
Welcome to The Data Stack Show. Each week we explore the world of data by talking to the people shaping its future. You’ll learn about new data technology and trends and how data teams and processes are run at top companies. The Data Stack Show is brought to you by RudderStack, the CDP for developers. You can learn more at RudderStack.com.

First of all, Kostas and Brooks, it’s nice to have you back. I was flying blind there for a little bit on the last episode, but the team is back together. And today we are going to talk with Gabriel from Carto, who’s here at Data Council on site and, Kostas, I’m just excited. I don’t know if I have an individual burning question, other than I just want to learn about the world of location data. This is the first guest we’ve had on the podcast from a company that’s dealing with data of this sort. And so I just have tons of questions. How is this world different? How is this world the same? And so I’m just excited to learn. How about you?

Kostas Pardalis 1:04
Yeah, I have some technical questions. Traditionally, the algorithms and the way that you approach and query geographical and spatial data are a little bit different. Usually database systems need, let’s say, some extensions to the SQL language itself, and also to how they represent and store the data, to do that. So I’d like to understand a little bit better how Carto complements the data warehouse solutions and their capabilities for querying this kind of data, and how they work together. That’s definitely something that I want to explore in this conversation.

Eric Dodds 1:42
All right, well, let’s dig in and talk with Gabriel.

Kostas Pardalis 1:44
Let’s do it.

Eric Dodds 1:48
Gabriel, welcome to The Data Stack Show. We are so excited to talk with you about Carto and learn more about you. So how did you get into data originally?

Gabriel Hidalgo 1:58
Originally, actually, I was abroad in the Peace Corps, and I was working on just collecting any data I could in the place I was stationed. That meant things like looking at cars on the side of the street and collecting it all in a spreadsheet. Then I got more into the open source community, because that’s what it’s for: it’s to help people who don’t have all these tools and all these web apps get the bare minimum they need to do their work. From there, I started to learn to code, and I really tried to get more into understanding how to get data, how to make it better, and what I can build with it to empower people or make things better.

Eric Dodds 2:35
Yeah, very cool. And just out of curiosity, what were you looking at cars for? Counting cars for? What was the use case in the Peace Corps?

Gabriel Hidalgo 2:43
Yeah, so before this, I was actually an urban planner, and I was stationed as an urban planner in Albania. There, we were trying to figure out which roads needed to be improved. To do that you need data, like how many cars are traveling on this road, to get more of an analysis on it. I think it was funny, because everyone was like, everyone knows which roads get traveled the most. It’s a pretty small town, everyone knows. But I was just like, let’s just check. Why don’t we just check to make sure? And it was actually really interesting. They were like, oh, wow, we didn’t know this road gets used this much. They started to get more of an understanding, like, oh, we should start keeping track of this. And yeah, I helped get them more into that kind of thinking.

Eric Dodds 3:24
Very cool. Okay, so you got more into data after that, and then did you start working in data? Did you go to work for a SaaS company?

Gabriel Hidalgo 3:35
It’s funny. I do have a very varied career, just because I think for me it’s more about what I want to learn and what I want to do. After I came back to New York, the Uber crisis was happening there, where a lot of taxi drivers were losing a lot of fares. I saw that problem and I was like, oh, that’s super interesting, what’s going on here? Then I applied data analytics tools to it, and I got hired at the Taxi and Limousine Commission as a data analyst. I helped them build up their data team and get more of an understanding of the problem. So I guess for me, it’s less about what I am and more about what’s happening, and then I really try to help and to solve that problem, or part of it, in some way.

Eric Dodds 4:18
Yeah, fascinating. I love that. It’s almost like this may be the wrong term, but almost like an activist approach to solving problems with data, which is pretty cool.

Gabriel Hidalgo 4:27
Yeah, but just to bring it back to how I originally got into Carto: I was working for a mobility app company called Citymapper. There I became more of a project manager for all these analysts, and I realized I was missing out on the technical aspect of things. I also realized that I wanted to do more city-oriented work instead of just mobility. I was becoming really specialized, so I wanted to expand that and get more technical at the same time. I think I always knew about Carto. In New York City, Carto is a fairly well-known product. I knew about it, and I’d been to their events, so I’ve always had an interest in Carto. And actually, the way I found out that they were hiring was that I was looking for a tool for my analysts. I was like, oh, why don’t we just try Carto? And then, oh, they’re hiring too? Why not just try it out? And here I am.

Eric Dodds 5:21
Very cool. Okay, so for our listeners who don’t know what Carto is, what is it?

Gabriel Hidalgo 5:27
So I would say at our core, we try to help you handle spatial data. The reason I want to keep it this broad is that we provide a UI where you can quickly create a map: with one click of a button, you upload data, create a map, and style it in real time. On the other side of things, we create our own spatial functions, so you’re able to do advanced analytics using spatial data as well. And on top of that, we even provide a back end that you can connect to your cloud warehouses, because we are cloud warehouse native, so you can connect it and then build your own apps on top of it, for example with Angular. So I think it’s less about these products and more that we deal with spatial data. We try to work out what you need from it and how to get the most out of it, and we try to build those tools around that goal.

Eric Dodds 6:11
Fascinating. Okay, so I’ll hand it to Kostas, but one more question. When we were just chatting before we hit record, I said “location data,” and I think about maps, right? You just kind of think about location data, and you corrected me, which I found very helpful. You said, well, it’s location data, sure, but it’s really spatial data. Help us understand what the difference between location data and spatial data is.

Gabriel Hidalgo 6:37
I think location data is closely linked with the data point itself, like going from A to B. That is a type of spatial data, like coordinates, exactly, and I think that’s what people link it with. But spatial data is more about contours as well. You have polygons, you have radiuses, and there is how these interact with each other, which is another field of data analytics as well. So for example, say you have a polygon of a country, there’s a volcano erupting, and you want to see who is affected by this and how many people are affected. You draw an influence area around where you see the lava going. It’s actually an example we’ve done before, the La Palma example. Then you’re able to see which buildings are the most affected or could possibly be affected. So when you look at that buffer, it’s not a location, it’s a space.
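
To make the idea of an influence area concrete, here is a minimal Python sketch of a circular buffer around a point. It is only an illustration of the concept Gabriel describes, not Carto's implementation: the coordinates, building names, and the helper names haversine_m and within_buffer are all made up for this example, and a real lava-flow analysis would use actual polygons rather than a simple circle.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_buffer(center, points, radius_m):
    """Return the names of the points that fall inside a circular buffer around center."""
    clat, clon = center
    return [
        name
        for name, (lat, lon) in points.items()
        if haversine_m(clat, clon, lat, lon) <= radius_m
    ]


# Hypothetical data: a lava front and three buildings (made-up coordinates).
lava_front = (28.60, -17.90)
buildings = {
    "school": (28.605, -17.90),  # roughly 0.5 km north of the lava front
    "farm": (28.60, -17.88),     # roughly 2 km east
    "harbor": (28.70, -17.90),   # roughly 11 km north
}

# Buildings inside a 2 km influence area around the lava front.
affected = within_buffer(lava_front, buildings, radius_m=2_000)
print(affected)
```

The single coordinate pair is the location; the 2 km area around it, and the question of what falls inside it, is the spatial part.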

Eric Dodds 7:33
Interesting, right. So it’s adding dimension, as opposed to just going lat-long to lat-long, right, which is basically a line. Fascinating. Okay, Kostas, I could keep going, but...

Kostas Pardalis 7:47
Yeah, my first question is this: for a couple of years now, we keep hearing a lot about time series, right? We have time-series data, we have time-series databases, specialized technology to actually work with the time dimension. Why don’t we also hear the same for spatial? Or are there technologies out there that we’d love to know about? Why do you think there’s so much more, let’s say, interest in the time dimension instead of the space dimension? Is it because it’s harder, or because the use cases are not there yet? What’s your feeling?

Gabriel Hidalgo 8:28
I think, for time, I would say it’s not really that it’s hard. It’s just that there’s more variance. I think there’s still a discussion about timestamp versus datetime and the conversions between them, where you lose a lot of data. How do I say it, things like daylight saving time: all these factors have to be taken into account, and all these data warehouses are trying to figure this out. On our side of things, we worry more about the spatial side. And I would disagree a bit that the two are not connected, because in mobility data that is essential. Time and space are the bloodline of those datasets. So I think it’s more about what most people are focusing on at the moment. And I think you’re right that time series is not really the focus here, because usually it deals with real-time data, right? You get a point and you want to know what is happening right now. It happens instantly. So with these types of things, it’s less data analytics, I want to say, and more about an app that you’re going to be using in real time. That’s why I don’t think it has gotten that focus: it’s more for mobility apps, more function-related, and at a company like that you don’t need that much of the time recording.

Kostas Pardalis 9:48
100%. How do you feel about the support that modern data management systems have for spatial data, like the data warehouses, for example? Do you see things that are missing there in terms of capabilities? Is there work to be done? How do you feel about that?

Gabriel Hidalgo 10:09
We have great partners there at each of these cloud data warehouses: Snowflake, Redshift, BigQuery, Databricks. They’re doing a great job with the scaling. This is a problem that everyone has, and they’re doing a great job with it. On that side of things, we don’t expect them to handle the geo part of this. That’s not their focus. And I think this is where Carto fits in: we’re able to bring those spatial functions to their data warehouses. We actually have what we call the Carto Analytics Toolbox, where you can upload all these spatial functions and all these models into the warehouse, so you can run these functions within it. So we’re the layer that they’re missing for that. Again, we want people to keep using their products, but we want to be, I want to say, the spatial go-to for them to rely on.

Kostas Pardalis 11:03
How does this work? I mean, at some point I want to add spatial capabilities to my insights. Do I have to export the data to your system, leverage it there, and then bring it back? How does it work? What’s the product workflow?

Gabriel Hidalgo 11:24
There are two ways to do it. I’ll start with the Carto Analytics Toolbox. For that one, once you have an account, you contact us, and we send you something to install within Snowflake, and then you have it installed there. Our documentation shows how to install it really quickly. All our customers have been really great about doing it, and they’ve been really vocal: they want to get those functions as soon as possible. Whenever we release them, they’re like, send it over to us as soon as possible, we really depend on this. It just makes their life so much easier. We’ve had people, and I think Snowflake is an example of this, doing what you said before: they store the data in Snowflake, bring it out, do spatial functions elsewhere, and then bring it back in. Or we had a case where a user just had a notebook of different SQL functions to do very simple spatial operations, and we cut that down to a few very simple functions. I think buffers are a good example of this. As anyone who has done a buffer knows, the distance is insane to work out for kilometers or a specific number of meters. So we just made it simple: you can use kilometers or meters and just put in the number. And that’s it, that’s how it should work, but SQL isn’t that straightforward. On the other side of things, connecting to Snowflake is as easy as making a connection. We’re actually interacting with these data warehouses, and whenever you upload data, we’re actually running a SQL query in there. So it’s like running SQL within your own data warehouse, but from within Carto. And so, again, we say there is no limitation, because it’s your data warehouse. We’re not dealing with it, but we’re empowering it.
And you can use it under our platform, which helps you do more with it. For example, you’re able to import data straight into your cloud data warehouse. That’s one of the ways we’re trying to improve the service as much as we can for our users.
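
As a rough illustration of why a "just give the number and the unit" buffer is convenient, here is a small Python sketch that builds a circular buffer as a ring of vertices around a point, with the distance given in meters or kilometers. This is not the Carto Analytics Toolbox API and not SQL: the function names destination and buffer_polygon, the unit labels, and the 32-vertex default are assumptions made for this example, and it treats the Earth as a sphere.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)
UNIT_TO_METERS = {"meters": 1.0, "kilometers": 1000.0}


def destination(lat, lon, bearing_deg, distance_m):
    """Point reached from (lat, lon) after travelling distance_m along bearing_deg."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(bearing_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance in radians
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta)
        + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lmb2)


def buffer_polygon(lat, lon, distance, unit="meters", segments=32):
    """Approximate a circular buffer as a list of (lat, lon) vertices."""
    radius_m = distance * UNIT_TO_METERS[unit]
    return [
        destination(lat, lon, 360.0 * i / segments, radius_m)
        for i in range(segments)
    ]


# A 1 km buffer around a point in New York City (coordinates chosen for illustration).
ring = buffer_polygon(40.7128, -74.0060, 1, unit="kilometers")
print(len(ring), ring[0])
```

The unit conversion and the spherical trigonometry are hidden behind one call, which is the kind of simplification Gabriel describes; a production version would also close the ring and handle the poles and the antimeridian.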

Kostas Pardalis 13:18
Are there differences between the different data warehouses from the user’s perspective? Are there things that you can do on Snowflake that you cannot do in BigQuery, for example? How do all the different data warehouses feel?

Gabriel Hidalgo 13:35
This is why we are partners with them: we’re trying to make sure that there is no difference between them, right? Whatever you can do in Snowflake, you can do in Databricks. That’s why we partner with them. Of course, they have different systems and ways of working, but it’s our job to figure that out and make it as seamless as possible. So when you go into Carto, you just see your connections, and you can use the same functions throughout.

Eric Dodds 14:01
Question for me, and I think some of our listeners: when you say spatial functions, could you give a specific example of that? I think I know what that means, but for those of us who aren’t as familiar with spatial data, give an example of a spatial function.

Gabriel Hidalgo 14:18
I think the buffer one is a good example of this, or an intersection. So if you have a bunch of points on a map, and then you have a polygon, you want to be able to get whatever points are within that polygon, or actually remove those points, right? You want to be able to quickly get the points that are in certain polygons or not, or that have certain values. Or you actually want to run a regression model where you’re figuring out, hey, there’s a point, and using math, which we can’t really get into right now because it’s really dense, you’re able to see, oh, it gets influenced by the polygons around it, and in what categories, and there’s a pattern to this. And then we create geometries based on the dataset you give it. So it can get really deep like that, or it can be as simple as getting a point outside or inside a polygon. Got it? These are the functions.
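
The point-in-polygon case Gabriel mentions can be sketched in a few lines of Python. This is a generic ray-casting test on flat (x, y) coordinates, shown only to illustrate what such a spatial function does; it is not how Carto or any particular data warehouse implements it, and the square and the sample points are made up.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside polygon, a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


# Hypothetical data: a square polygon and four points.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
points = {"a": (1, 1), "b": (5, 2), "c": (3, 3.5), "d": (-1, 0)}

inside = [name for name, p in points.items() if point_in_polygon(p, square)]
outside = [name for name in points if name not in inside]
print(inside, outside)
```

Keeping the points in the inside list is the "get whatever points are within that polygon" case; keeping the outside list is the "remove those points" case.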

Eric Dodds 15:13
Yeah, super interesting. Super interesting. How does Carto execute those functions? Is that SQL in the warehouse?

Gabriel Hidalgo 15:23
Yes. And they also call upon our own data warehouse, because, again, we’re trying to be as cloud-native as possible, so we use the same thing. You call upon our own cloud data warehouses, where these functions exist. So you’re calling to us, too.

Eric Dodds 15:37
Oh, sure. Yeah, makes total sense. Sorry, Kostas, I had to interrupt for some clarification.

Kostas Pardalis 15:43
Who is the user? Who are the people that are using Carto today?

Gabriel Hidalgo 15:52
They fall into two really extreme cases. On the one side, the user just wants to create a map. They have their dataset, and they just want to quickly create a map that lets them see their points and share it with their users. So we literally have a one-click button: create a map, style it, and then be able to do these advanced analytics, but just by clicking buttons in a UI. On the other side of things, we have very, very technical users that have servers and build their own front-end apps. In that case, we help them, or we empower them, to use Carto in the way that best fits them. And I think it’s similar to how Carto itself is shaped: we have a group that is very much about what the customer wants, and on the other side of things, we have highly technical people that actually help build that out. So I think our customer base is in a similar position: on one side highly technical, and on the other side we try to make it as easy as possible.

Kostas Pardalis 16:50
Yeah, makes sense. Can you give us a couple of typical use cases that you see out there? Let’s say someone needs to build a map: what usually drives that, and why? And even more, what are some examples of incorporating Carto insights into either an application or an analytics use case?

Gabriel Hidalgo 17:20
For example, for internal uses in a company, you want to see which stores are performing the best. So then you’re able to layer data on top of it. Maybe you find out the stores are too close together. You’re able to make a buffer and use some routing information to say, oh, people walking around this area can get from this store to here in a 10-minute walk. So then you get an idea, like, oh, okay, only the people around here care about the store.

Eric Dodds 17:54
Is that how you would define the buffer as a term? Is it like there’s a 10-minute walking buffer around the point?

Gabriel Hidalgo 18:00
Exactly. So in that sense, if these 10-minute walk areas overlap each other, you’re like, oh, maybe these stores are too close to each other. That’s an example of a use case that is more on the side of just creating a map for internal use. On the other side of things, and these are more technical people, they have their internal systems: they have these giant servers, they have all this data in cloud warehouses, like trillions of rows, and they just need someone to be able to create an app for them really easily. They also need security. So we can have a self-hosted setup, as we call it, where we have a tenant, and they can keep their information on their own servers if they want, with Carto as a kind of front end for it. In that way, it’s like another app in their system that they can use internally. And that’s another use case.
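
The overlap check between two 10-minute walking buffers can be sketched in Python as follows. It is a deliberate simplification: it assumes a walking speed of 80 meters per minute and uses straight-line distance instead of the routing information Gabriel mentions, and the store names, coordinates, and function names are invented for the example.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000
WALKING_SPEED_M_PER_MIN = 80  # assumption: roughly 4.8 km/h


def approx_distance_m(a, b):
    """Equirectangular approximation of distance, adequate for points a few km apart."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)


def overlapping_pairs(stores, minutes=10):
    """Pairs of stores whose circular walking buffers overlap."""
    radius_m = minutes * WALKING_SPEED_M_PER_MIN
    return [
        (n1, n2)
        for (n1, p1), (n2, p2) in combinations(stores.items(), 2)
        # Two circles of equal radius overlap when their centers are
        # closer together than twice that radius.
        if approx_distance_m(p1, p2) < 2 * radius_m
    ]


# Hypothetical stores (made-up coordinates).
stores = {
    "north": (40.7500, -73.9900),
    "south": (40.7400, -73.9900),  # about 1.1 km from "north"
    "far": (40.7000, -73.9900),    # about 5.6 km from "north"
}

pairs = overlapping_pairs(stores)
print(pairs)
```

With real routing data the buffers would be irregular walking-time polygons rather than circles, and the overlap test would become a polygon intersection.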

Kostas Pardalis 18:59
Cool. Okay, last question from me.

Eric Dodds 19:03
I don’t believe you.

Kostas Pardalis 19:07
What is something exciting that is coming?

Gabriel Hidalgo 19:13
Absolutely. Yeah, I think I came to Carto at a very, very lucky time, to be honest with you. This was unplanned. I like Carto overall, but it’s this in particular: we’re actually transferring over to a new platform. This whole cloud-native identity is actually very new. We’ve been trying to build this platform for a while, and it’s completely new: new accounts, new everything, and we’re transferring people over. We’re building out a new platform, and that’s super exciting. Our current platform is great, but we see the trends: people are going to data warehouses, and people want more control over their data. We want to help enable that. With the new platform, we’re creating features that we have always wanted. I think one of them is what we call a lasso tool. In the old Carto, you had to zoom in to be able to see a set of points. But people wanted to be able to select a certain area, only that area, without having to zoom in and figure it out from there. And now they can just do that. There are so many new features coming, and it’s not like we’re just waiting for the new platform to catch up. No, this platform is going to be a lot better than the other one. So it’s super exciting to see that change, what we prioritize, and all the new features coming out that haven’t existed before.

Kostas Pardalis 20:39
Oh, nice. Awesome.

Eric Dodds 20:41
So question, do you have customers who basically white label maps through Carto and serve those to their users?

Gabriel Hidalgo 20:54
Yes, we do. I guess white labeling would need to be explained a little more. It’s more that we have password protection for a certain map. So when you create a map, you can make it completely publicly available, or you can put it behind a password if you only want certain people to see it. Or you can actually take the map and make it into an iframe to put it on a website. So that’s one way. And again, I think in most cases people can have their own Carto, have it internally, and decide how to share the map.

Eric Dodds 21:31
Yeah, super. It sounds like it’s less of a, okay, I’m going to start a consumer mobile app. And like, there’s a mapping component to it. So I need like sort of this interactive map, right, it sounds much more geared towards using like specific data to create specific maps or like solve really specific questions related to spatial data, like inside of companies.

Gabriel Hidalgo 21:58
Yep, exactly. In some cases, yeah.

Eric Dodds 22:00
Yeah. Super interesting. And how long has the company been around?

Gabriel Hidalgo 22:05
I’m not really sure about that, to be honest with you. Like 10 years.

Eric Dodds 22:09
Oh, wow. Okay, cool. And we didn’t even... I dropped the ball. I didn’t ask you this at the beginning, but what do you do at Carto? I’m so sorry, Gabriel.

Gabriel Hidalgo 22:20
No problem, no problem. So I’m a support engineer at Carto. This is very unique, because I know when people hear Support Engineer, it’s a very broad term, like Solutions Engineer. It really depends on the company. I think the idea of support within Carto is actually support within the company as well. We are a full-stack team: we work on the front end, the back end, the servers, everything possible to make the app run and improve it. But we also support users, from very technical users to not-so-technical users, to help them get the most out of the platform. And we actually have the same system internally. Within our team, people come to us with questions through the same system we use for our customers, because we want to make sure that we’re providing the same quality internally that the customers are getting.

Eric Dodds 23:08
Oh, interesting model. Okay, so you treat sort of internal customers the same way that you treat external customers?

Gabriel Hidalgo 23:15
Exactly.

Eric Dodds 23:15
Fascinating. Yeah, that is super interesting.

Gabriel Hidalgo 23:18
And it’s a very highly technical team, which is also very interesting.

Eric Dodds 23:20
I’m sure that makes your job super interesting. Okay, I’d love to know, especially because you really have approached a lot of things, it sounds like, in your career from like, observing like an interesting, like problem or an interesting opportunity. What are maybe one or two of the coolest things you’ve seen your customers do with Carto? Or like, the coolest problems that they’ve solved with Carto?

Gabriel Hidalgo 23:52
One of the coolest things for me is more about utility. I’ve seen people make a huge impact on people’s lives. The simplest maps are actually the most interesting to me, because of the data coming in and the impact it has. I think of elections, like literally just viewing voting data. Those are really interesting because you see more of the impact of the dataset itself. On the technical side, funnily enough, it’s how people use the product. You see some people using the UI in such an interesting way. We had one where you zoom in and you’re able to see the congestion number for the road, and when you zoom out, there’s a different one. We were like, why would you want this? And they explained to us, hey, it’s because at this zoom level you see this part of the area, and at that zoom level you see that part, so we want to make that distinction. And we were like, oh, it wasn’t meant for that, but if you’re using it this way, it works. We had just never thought of using it that way before. To be honest with you, I think most of the really interesting features we think about actually come through customers, because they just want a solution. They explain it to us, and we’re like, oh, okay, this is why this makes sense. That makes total sense, and we should actually add that.

Eric Dodds 25:13
Yeah, totally. Super interesting. Well, Gabriel, this has been such an interesting show. I learned so much about spatial data, not location data, well, about location data too, that I didn’t know before. So thanks for taking some time out to chat with us. And best of luck on your journey as you help people solve spatial data problems.

Gabriel Hidalgo 25:33
Thank you so much for the conversation, it was great.

Eric Dodds 25:37
Well, obviously, I had a lot to learn about spatial data and the difference from location data, which was really educational for me. It makes so much sense now that we’ve talked about it. I did kind of make a generalization there, and for a product like Carto, it’s actually a really important distinction. So I love learning those things. I think this is my big takeaway: Gabriel really added a human element to thinking about this. His story about working in the Peace Corps, standing on a sidewalk with a clicker and collecting traffic data for, I think it was, a city in Albania, was in some ways really heartwarming, right? I mean, it’s so easy for us to get caught up in the technology. And I really loved hearing about him making a real impact in this city with really sort of analog data collection and data processing. It really made a big difference in the city. It was really fun. And I think you can tell that the way he approaches his work at Carto is influenced by that very human element of it, that sort of visceral experience of capturing and using spatial data.

Kostas Pardalis 26:45
Yeah, yeah, 100%. I think it’s also so refreshing to meet and connect with people who are so excited about the stuff that they are doing, and who have this very genuine way of connecting with their work and feeling that they are doing things that are important, that deliver value. Of course, to do that you have to add this human dimension to the technology itself. So, yeah, I really enjoyed the conversation that we had. I think it’s important to have more conversations with companies that are working with less traditional types of data, let’s say. As we said before, I think we tend to forget that data is not just the tabular data that we have, the very well-typed data that the database holds. And I think outside of the tabular data, we will see more and more of a need to work with these new, different types of data. There are companies out there doing that, and I think we should reach out to them and make sure we get them on the show to see what they’re doing and why they’re doing it.

Eric Dodds 27:59
I totally agree. And I think Brooks is already on it. All right. Well, thank you for joining us. This has been an awesome week. This is our last episode from the week on-site at Data Council. We’ll be back to our regularly scheduled programming on the show next week. Thanks for joining us, and we’ll catch you in the next one.

We hope you enjoyed this episode of The Data Stack Show. Be sure to subscribe on your favorite podcast app to get notified about new episodes every week. We’d also love your feedback. You can email me, Eric Dodds, at eric@datastackshow.com. That’s E-R-I-C at datastackshow.com. The show is brought to you by RudderStack, the CDP for developers. Learn how to build a CDP on your data warehouse at RudderStack.com.