On this week’s episode of The Data Stack Show, Eric and Kostas chat with Jenna Lemonias, director of data science at The Atlantic. The Atlantic, a publication that’s been around since 1857, is adapting with the times and is implementing and emulating some of the data science practices seen at big tech companies.
Highlights from this week’s episode include:
- Jenna’s background in astrophysics and how she pivoted to data science (2:14)
- Differences in dealing with data at a FinTech company and then at a publication (4:40)
- The relationship between analog and digital data at The Atlantic (9:22)
- How The Atlantic structures its data science team (11:44)
- The role data engineering plays (14:42)
- Using natural language processing and machine-generated metadata (17:37)
- The Atlantic’s data stack (28:22)
- The kind of data that’s important to The Atlantic (29:44)
- Big projects forthcoming for the data science team (37:13)
The Data Stack Show is a weekly podcast powered by RudderStack. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Eric Dodds 00:06
The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution. Thanks for joining the show today.
Eric Dodds 00:18
Welcome back to The Data Stack Show. Many of our listeners have almost surely read content from the publication, The Atlantic, a very important publication that’s been around for over 150 years. And today, we have the privilege of talking with Jenna, who runs the data science practice at The Atlantic. My burning question is really around, they have both digital data and analog data, and I don’t know if there’s anything there, but I think it’s just really interesting to think about the role of a data scientist who has probably an unbelievable wealth of digital data, but also has to consider that they have a lot of customers who read the analog version of their publication. Kostas, what do you want to find out?
Kostas Pardalis 01:03
For me, it’s going to be very interesting to see how exactly data science fits in the publishing organization. I think it’s the first case of a publisher that we have on this show. And there are many different aspects of data that they can really use there because they have the content itself and they also have like all the rest of the different data that we usually talk about, like customer-related data, and financial data and all these things. So I’m very curious to see what’s the difference between them and any other product out there that relies on data?
Eric Dodds 01:35
Great. Well, let’s talk with Jenna.
Eric Dodds 01:37
I am so excited to talk to our guests, Jenna from The Atlantic. Jenna, welcome to the show.
Jenna Lemonias 01:44
Thanks very much. I’m happy to be here.
Eric Dodds 01:46
Okay. I always say, I have so many questions for you, which is true of every guest. But the media space and data has been a fascination of mine for a while, just because it’s undergone so much change. But before we get into that, Jenna, we love to learn about the backgrounds of people who work in data. And so could you just give us a brief background of you know, where you came from, and how you ended up working in data at The Atlantic?
Jenna Lemonias 02:14
Sure. So I went to graduate school after college intending to do scientific research and pursue a career in academia. I ended up getting my PhD in astrophysics, but decided to bow out of academia for a number of reasons. I found myself in the Bay Area after graduate school and decided to pivot to data science; of course, data science and all things tech are ubiquitous in the Bay Area. I started at a FinTech startup, where I was the second member of the data science team and learned a ton there and made my way over to The Atlantic where I’m really happy to be now.
Eric Dodds 02:55
Very cool. We may need to do an episode on academic backgrounds of people who work sort of in a modern data context, because that’s just been a repeated theme throughout the show. And actually, Kostas, you came from an academic background as well.
Kostas Pardalis 03:09
Yeah, yeah, that’s true. I mean, it’s very common to see people who are working with data these days that are coming from very diverse scientific backgrounds, and especially like people from physics. We had another guest who, as part of his PhD, spent a big part of the program doing big data analytics with CERN, and then he went into the industry and started working with data there. We’ve had mathematicians, and we had another one who was doing neuroscience, I think. So we have seen quite a few PhDs from these, like more scientific, let’s say, disciplines going and working with data. So it’s very exciting to see that this pattern continues. I also have like an academic background, but I’ve never pursued like a PhD. I avoided it. But yeah, I was working in kind of an academic environment for quite a while, doing stuff around data. So yeah, it’s really very, very interesting. And I think we should have an episode just for that. I think there are many insights to draw from there. It’s really nice to see all these people, by the way, being part of the industry, because they have a lot of like, unique talents that can be extremely useful.
Jenna Lemonias 04:18
Kostas Pardalis 04:19
So I’ll start with a very quick question on what you mentioned. Jenna, you said that you started working in a FinTech company. And now you’re in The Atlantic, which is a publication, still working with data in both cases. How is the experience different from going from FinTech to a publisher? And what things are common as you work again with data?
Jenna Lemonias 04:39
Good question. So one thing that’s different is that when I was at the FinTech company in the Bay Area, I was the second member of the data science team. And so it was obviously a really small team of two. I learned a ton. I learned a lot about really what it meant to be a data science team within the context of a business. You mentioned how so many of us have come from the academic research side, where you can spend months or years on a research project just because it’s interesting. Whereas, of course, on the business side, you want to do projects that will have an impact on the business, or at least have a high likelihood of having an impact on the business. So that’s a big part of what is similar about the two. And, just in general, it’s been interesting to be at The Atlantic, which is a 160-year-old institution, right, that’s now trying to essentially emulate some of the data science practices of tech companies. So it’s been really exciting to be part of that transition.
Kostas Pardalis 05:56
Oh, that’s very interesting. And how’s your experience with that so far? How is it that such a well-established organization in a very established market also, right, is trying to adopt, like new methodologies, new techniques that are complex in the tech industry? How do you feel about this and how you have experienced this inside The Atlantic?
Jenna Lemonias 06:15
So much of what we’re trying to do is really similar to what a lot of other companies are trying to do, in that we’re trying to understand what makes someone want to pay for our product. And of course, our product, in this case, is journalism. But we have a lot of the same data that other companies do. We can see how people are interacting with our site, what they’re reading, what makes people decide to convert, and what articles they are reading prior to subscribing. And so a lot of it is so similar.
Eric Dodds 06:52
So quick question on that, Jenna, which is interesting, so, you know, Kostas, and I both work in product and have sort of been around product. And when you think about a typical SaaS product, let’s just use like a new feature rollout as an example here. So you roll out a new feature, it allows a user to do XYZ, right? They accomplish some sort of task, right? And so there’s certainly a similar paradigm when you think about content. But a lot of times the way you measure activation on a feature and product is I wouldn’t necessarily certainly say binary, but what you’re looking for is, are people using this or are they not? Is there a different approach with content where there’s a much more qualitative nature to it? I mean, the person is reading the article, they’re engaging with the content. But is it harder to sort of triangulate performance, if that makes sense relative to say, like, a new feature you roll out that allows a user to do X in a FinTech app?
Jenna Lemonias 07:57
Yeah, that’s an interesting question. I would challenge in some ways the idea that it’s that different; of course, when someone is interacting with the print version of the magazine, we do not have data on that at all. But when we’re thinking about how people are using the site, we can think about, you know, whether they’re saving articles, and whether they come back to them later. And so that’s an example of a product feature that we can actually very easily quantify. And when we’re thinking about depth of engagement, and overall engagement with our journalism, we can look at how often are people coming back to the site. And so I think it’s actually not that different in the end.
Eric Dodds 08:44
Yeah, that’s a really, I think, really valuable way of looking at it. And you mentioned print, one thing that is really interesting to me is, you have some access to data around print, sort of distribution and readership. How does that play into the work that you do in data science at The Atlantic? And especially, I guess, I’m interested in, you know, in combination with the digital data that you have, because that’s obviously more abundant and probably more accessible, but would just love to know about the relationship between those two types of data?
Jenna Lemonias 09:22
Sure, the easy answer is that we really don’t have that much analog data. We know who is receiving a print magazine, and who isn’t, of course, and so we can take that into account when we’re interpreting users’ on-site behavior. So if someone isn’t coming to the site as much or they’re not opening the daily newsletter, and they have a digital-only subscription, then we know that this is someone whose behavior is potentially concerning or who is at risk of churn, and we want to try to re-engage them. If, however, we know they receive the print magazine, we know there’s a chance that they are really valuing their Atlantic subscription, but just valuing it in a different way, and just reading our articles, you know, on the couch in the magazine over the weekend. And so that’s something that we just don’t have data on. There is a small audience research team that’s separate from data science that conducts surveys and interviews, and we can glean some insights from that. But it’s really not something we can quantify other than helping us to interpret on-site behavior.
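The print-versus-digital reasoning in this answer can be sketched as a simple rule in Python. This is only an illustration under stated assumptions, not The Atlantic’s actual churn model: the `Subscriber` fields, the 30-day window, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subscriber:
    subscriber_id: str
    receives_print: bool       # True if the plan includes the print magazine
    site_visits_30d: int       # hypothetical field: on-site sessions, last 30 days
    newsletter_opens_30d: int  # hypothetical field: daily-newsletter opens, last 30 days


def engagement_label(sub: Subscriber, min_visits: int = 2, min_opens: int = 2) -> str:
    """Return a coarse label; the thresholds are invented for illustration."""
    low_digital = (
        sub.site_visits_30d < min_visits or sub.newsletter_opens_30d < min_opens
    )
    if not low_digital:
        return "engaged"
    # Low digital activity is ambiguous for print recipients:
    # they may simply be reading the magazine offline.
    return "possibly_reading_print" if sub.receives_print else "at_risk"


subscribers = [
    Subscriber("a", receives_print=False, site_visits_30d=0, newsletter_opens_30d=1),
    Subscriber("b", receives_print=True, site_visits_30d=0, newsletter_opens_30d=0),
    Subscriber("c", receives_print=False, site_visits_30d=9, newsletter_opens_30d=12),
]
labels = {s.subscriber_id: engagement_label(s) for s in subscribers}
print(labels)
```

The point of the third label is that a quiet print subscriber is not treated as a churn risk by default, since there is no data on print reading.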
Eric Dodds 10:32
Very interesting. One quick follow up question on that. Have you seen any patterns and this is just more of sort of a just … I have an interest in human behavior … do you see any situations in which a print subscription actually increases online engagement for maybe types of content that you’re producing online that don’t make it into the print edition? Like the different formats may feed off of each other?
Jenna Lemonias 10:59
That’s interesting. I don’t think that’s something we’ve looked at in particular. Actually, the vast majority of our journalism isn’t in the magazine. And so in order to really benefit from The Atlantic as a whole, we would want those people to come on site. But again, there are people who just prefer that analog reading experience.
Kostas Pardalis 11:23
That’s pretty interesting, Jenna. My question is a little bit different. It has to do more with how people work and how the teams are organized inside The Atlantic, I mean, the data science department. So can you give us a little bit more information about your team, your role in the team, and how the overall data science organization inside The Atlantic is structured?
Jenna Lemonias 11:44
Sure. So data science is part of the growth team. There are five of us on the data science team. And we run the gamut in terms of experience. We have an analyst who’s done a lot of digital marketing, and we have people who are also focusing more on predictive modeling and natural language processing. And so data science as a part of The Atlantic started several years ago, knowing that we were going to launch a paywall at some point. And so as such, we are mostly thinking about subscriptions, and propensity to churn and propensity to subscribe. But we’re also, as part of that work, working closely with marketing and product and engineering. But we absolutely work across the business. We work with people in the newsroom and the advertising side as well.
Kostas Pardalis 12:45
All right, so would you say that like your primary engagement inside the organization is with marketing?
Jenna Lemonias 12:52
Probably not primary; I would say both marketing and product. Because there’s a lot of overlap between the two.
Kostas Pardalis 13:02
Okay. Yeah, that’s a very interesting distinction. I would like to learn a little bit more, to be honest, if you could share some information. So, inside The Atlantic, what’s the definition of product, and how does it differ from, for example, the people who write the content, which is like the traditional kind of approach in publishing? So can you talk a little bit more about that? Because especially for me, I’m very product oriented, I’m super, super interested to hear about that.
Jenna Lemonias 13:28
Sure. I think you might be asking more about the editorial side in the newsroom. So all of our writers are in editorial or the newsroom. And they’re completely separate from product, actually. So when we think about product, we think about the app, we think about our subscriptions product, right? So are we offering discounts? Or is there a way to upgrade or downgrade your subscription? And also just use of the website, if we want to add a feature where you can follow your favorite writer or something like that.
Kostas Pardalis 14:07
Oh, wow. So the Atlantic right now is not that different than a tech company at the filing, right?
Jenna Lemonias 14:14
Kostas Pardalis 14:15
Yeah, that’s super interesting. And it would be amazing to learn more about this transition of such a well-established and old institution turning from a publisher into more of, let’s say, a tech company where part of the product is the content, which I think is an amazing journey. You mentioned engineering, and that you work with engineering. What’s your relationship with them? Like, how is the data science team working together with them?
Jenna Lemonias 14:42
So we have a small data engineering team. And so we think of them as essentially providing raw data for the data science team, and so they’re building out ETL pipelines, they are managing our workflow management tools. So we’re moving on to Apache Airflow. And then of course, we’re working with engineering in terms of setting up web analytics, and just making sure that everything that we want to track is trackable so that we can ensure that we can, for example, quantify how well a new product feature increased conversion or decreased retention, and what the adoption rate was.
Kostas Pardalis 15:28
Oh, that’s very interesting. So like the data science team is actually, let’s say, the consumer of the output of the data engineering team, right? They are there to support you and make available the data that you need, so you can build your models or do your analysis and drive business decisions and the product. Did I get it correct?
Jenna Lemonias 15:49
Yeah, yeah, that’s mostly right.
Kostas Pardalis 15:51
That’s perfect. So I’ll get back. I’m sure that also Eric has questions to ask about how you work together and the technologies used there with data engineering. But do you think you can share with us what is a typical day working with data at The Atlantic? How does a project start in the data science team, how is the work organized, what are you doing with this data, what kind of iterations do you do with it, and in general, give us a little bit deeper insight into how data science works with the available data at The Atlantic?
Jenna Lemonias 16:23
We have a small team, but because of that, we’re all doing a little bit of everything. And so data science work runs all the way from business intelligence and dashboarding, to A/B testing, deep dive analyses, all the way over to predictive modeling and natural language processing. And so, obviously, each one of us specializes in one or two of those areas. But so our day-to-day work really depends on which one of those things we’re working on. I definitely try to emphasize that we are part of this institution that we’re trying to support. And so we want to make sure that any insights we have and any models we build are really actionable. And so I do place a lot of value on making sure that we’re communicating our work to other teams.
Eric Dodds 17:23
Jenna, Eric jumping in here, could you tell us more about the types of … I know, you may not be able to share everything … but what types of projects are you working on that involve natural language processing?
Jenna Lemonias 17:37
We’ve built a topic model, and we’re also trying to build out our metadata. So metadata is an interesting thing in the journalism world that I can talk a little bit about. So metadata is essentially, at a really basic level, labels for each article. And so we can think about that as the author of an article or the section it was published in, for example, science or politics. But we can use NLP to assign other metadata for each article. So we’ve assigned topics from a topic model to each article. We’ve also run some named entity recognition models as well. We can assign sentiment, of course, to each article. And the point of doing all of that is essentially to make our analyses more sophisticated, to make our understanding of readers’ engagement with journalism more sophisticated. We can use it to power personalization, research engines, etc, etc.
Kostas Pardalis 18:45
All this is great. Actually, metadata is a very interesting topic. The use cases that you mentioned there, from what I understand, most of them are internal. You’re trying to understand like the content and how the users interact with it. Do you use this metadata also as part of the product? Like the first thing that comes to my mind is how you can provide a better search functionality, right? Like the users browse the content that you have based on these topics, or you provide recommendation systems for that. So what is the value of like this metadata to the product itself?
Jenna Lemonias 19:18
So we’re absolutely thinking of this as something that can lead into re-circulation and recommendation engines. There’s also the idea that it can power essentially category pages so that you can click on “Coronavirus” and it’ll just automatically list everything that had the label “Coronavirus” with it. We’re not quite there yet. Of course, there’s an interesting tension between the data science and then the real world products, because of course, as data scientists we understand that not every topic assigned is going to be 100% accurate, like we understand that you know, some models are better than others. But when it comes to actually putting something on the page that a reader will see, we want to be a lot more careful and just have a higher bar for what gets on the page. And so we’re not quite there yet, where we’re okay surfacing this type of thing to the reader. We’d want to have some sort of essentially veto power.
Kostas Pardalis 20:27
Makes sense. Makes sense. So, traditionally in the publishing world, because okay, I guess that this kind of metadata, it’s not like something new. People used to catalogue content and try to come up with topics and all these things for like, many, many years before technology was there and data science. So how was this traditionally done, and how useful was it in the past?
Jenna Lemonias 20:48
Of course, before we had machine-generated metadata, you could have human-generated metadata. And I believe this is still a team at the New York Times. And so it absolutely can be really powerful. And we’ve actually also been thinking about what type of metadata we might want, that we wouldn’t need to be human generated, essentially. So if we want to label articles as a personal essay, or as something written by a presidential candidate, like we could probably develop some sort of sophisticated algorithm. But at a certain point, it’s possible that it’s just easier and faster to have it labeled by a human. And so that’s absolutely how it used to work.
Kostas Pardalis 21:40
That’s really fascinating. So is there today, any kind of cooperation between these two? Like you have the team who is responsible for coming up with this metadata and the topics and creating categories of the content that you have? And then you also have the algorithms.
Kostas Pardalis 21:53
I’ll give you an example. Just to make the question more clear. Do you use, or do you see using the data that is coming from these people to feed and train your models, for example, or vice versa use the outputs of the models to help them like much faster come up with these topics? Is this something that’s happening today or thinking of doing it in the future?
Jenna Lemonias 22:11
So the latter we would definitely think of doing. I think if we wanted to develop essentially category pages, we would want an editor to be able to go into our content management system, and essentially be able to say, like, no, this is wrong. We don’t want this to be in this category. But I will say we’re a small enough company that we don’t have a lot of that human-generated metadata right now. I think it’s something we might do more of, but of course, that is outside of data science world.
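The “higher bar” and editor veto described here could look roughly like the following Python sketch. It assumes a model that returns a confidence score per topic; the `ArticleLabels` class, the 0.9 threshold, and the example articles are hypothetical and are not a description of The Atlantic’s content management system.

```python
from dataclasses import dataclass, field


@dataclass
class ArticleLabels:
    article_id: str
    model_topics: dict[str, float]                 # topic -> model confidence (assumed 0..1)
    vetoed: set[str] = field(default_factory=set)  # topics an editor has rejected

    def veto(self, topic: str) -> None:
        """Record an editor's decision that this topic must not be shown."""
        self.vetoed.add(topic)

    def public_topics(self, min_confidence: float = 0.9) -> list[str]:
        """Topics confident enough to show readers, minus editor vetoes."""
        return sorted(
            topic
            for topic, confidence in self.model_topics.items()
            if confidence >= min_confidence and topic not in self.vetoed
        )


def category_page(articles: list[ArticleLabels], topic: str) -> list[str]:
    """List article ids that may appear on the reader-facing page for a topic."""
    return [a.article_id for a in articles if topic in a.public_topics()]


articles = [
    ArticleLabels("essay-1", {"Coronavirus": 0.97, "Politics": 0.55}),
    ArticleLabels("essay-2", {"Coronavirus": 0.93}),
    ArticleLabels("essay-3", {"Coronavirus": 0.60}),
]
articles[1].veto("Coronavirus")  # an editor says "no, this is wrong"
page = category_page(articles, "Coronavirus")
print(page)
```

Internal analyses could still use every model-assigned topic; only the reader-facing list applies the threshold and the veto.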
Kostas Pardalis 22:46
That’s great. Quick question around that. So you said that you don’t have right now the amount of data that you need to train these algorithms. Can you give us like an example of algorithms that you are using?
Jenna Lemonias 22:58
I don’t think it’s the quantity of data that we’re lacking, because we also have a large archive. So yeah, I think that isn’t the problem. We’ve been looking at spaCy in terms of named entity recognition and some other natural language processing capabilities.
Kostas Pardalis 23:18
Oh, that’s cool. And what’s your experience so far with it? And how do you feel about the technology? I mean, have you seen it progressing in this space? And what do you anticipate in the future in terms of like improvements in these algorithms.
Jenna Lemonias 23:32
We had originally built out one of these models a few years ago, and then recently updated it with a newer model. And it vastly outperformed what we had seen before. So we’re absolutely seeing a lot of progress. And so we’re excited to put it to work.
Eric Dodds 23:52
Very cool. Question on models. And we’ve talked with multiple data scientists on the show. And one interesting subject, especially, you know, depending on different industries, is just the question around, you know, bias in models or the way that you build models. And one thing that’s interesting, and this could be not even a consideration, but just thinking about the editorial aspect of journalism, right? So you have an editor of the magazine and there are decisions that are, say, maybe subjective isn’t exactly the right word, but I can’t think of a better one. So the editor is, in some sense, sort of saying, you know, these topics are important for this edition of the magazine. This is what we’re going to write about. Do those sorts of things influence the way you think about building models? Because it seems like there can be a significant human element in what actually gets delivered in the product as content that’s an editorial decision.
Jenna Lemonias 24:59
Yeah, that’s a good point. And I guess I’ll say that most of our modeling thus far has been more on the propensity modeling. So on the subscription side. So it’s pretty independent of the editorial side of The Atlantic. When we do get to the editorial side, I think we absolutely know that our editors are the experts here. And we don’t try to tell them what to write about. We essentially tell them what is performing well, what kind of content is really resonating with our audience, and then the final decision is with them. As for the models that do have to do with the articles, I’d say we’re still in the early early stages of trying to figure out what we’ll actually be doing there before we’re actually dictating what sort of content is surfaced to a specific person.
Eric Dodds 26:03
Sure. That’s really interesting. And I mean, it’s really neat to hear that there’s a relationship between the people making editorial decisions and data science. I just think that’s a really cool, well, I mean, cool is one word for it. But I actually think it’s a very modern way to approach operating inside of a publication.
Kostas Pardalis 26:24
Yeah, and I think also a very positive one. As all this time, that Jenna’s talking about how The Atlantic is using data science and data in general, I think it’s a very good counter example for this fear around that, like ML is going to destroy jobs that you know, like, we are going to have Terminator coming and other stuff. When you actually see inside an organization how technology can work very closely and help the professionals you have over there to focus more on the creative part of the work they’re doing and be much more efficient and creative at the end. Which, I think that’s also the vision with technology in general. And things are not binary, they’re not like black and white. Either the technology is going to do something or the humans are going to do at the end. I think the real value is when you combine these things together. So I think it’s a very encouraging and positive example that we have here. What do you think, Jenna? What’s your opinion on that?
Jenna Lemonias 27:16
It’s really exciting to be working at a company that’s been around for such a long time. And that can really drive the national conversation. And so I’m really just happy to be part of it.
Eric Dodds 27:29
Jenna, just out of curiosity, do you on the data science team interact with the journalists themselves at all?
Jenna Lemonias 27:38
Sometimes for sure. There are a few of us who interact with them more often, just because they’re, you know, accustomed to having those types of conversations. But yes, is the short answer.
Eric Dodds 27:52
Very cool. That’s just neat to hear. Okay. Well, we love to have philosophical conversations on the show. But I do need to ask, I think for our listeners, about your toolset in data science. And I know you talked a little bit about modeling with Kostas. But could you just tell us a little bit about the tools that your team uses, the stack, and then maybe even if you’re not using them, what other tools are sort of new and exciting to you in the data science space?
Jenna Lemonias 28:22
So all of our data is in BigQuery. And I mentioned that our data engineering team writes some of those ETL pipelines. And then we have data connectors that are bringing in our subscriptions data and a number of other pieces of data. We use Looker for our dashboarding tool. And many of us are using Python for analyses. A couple of people are using R as well. And then all of our models are running in Python on AWS.
Kostas Pardalis 28:53
Jenna, quick question. You mentioned both BigQuery and AWS. What’s the reason of using two different vendors there?
Jenna Lemonias 29:00
That’s a good question that I don’t have the answer to, and I would defer to the engineers who made that decision.
Kostas Pardalis 29:05
Okay, so it’s purely an engineering decision. Doesn’t have to do with, like, what the data scientist feels more comfortable to use, right?
Jenna Lemonias 29:14
I mean, honestly, those decisions were made before I got to The Atlantic, and we’ve been happy with how it’s been working out.
Kostas Pardalis 29:22
Okay, that sounds great. Okay, you mentioned like the tools that you’re using, and we have talked so far about doing a lot of work with the data that you have the actual text that you have, from all the things that get published. What other data are you using? You mentioned subscriptions, are there any other data sources that you’re using that are important to your job?
Jenna Lemonias 29:44
The big ones are definitely the web analytics and the subscriptions. We’re really digging in on subscriptions more. We’re doing a lot of attribution modeling recently. And so we’re thinking about that in two different ways. So we’re trying to attribute new subscription purchases to, of course, various traffic sources: is our paid marketing driving subscriptions? What about when articles are going viral on Facebook, or people are clicking on newsletters and then coming in and subscribing? And then we’re also thinking about attribution in terms of articles. So what types of articles are people reading and then immediately subscribing after?
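One simple way to picture the two kinds of attribution mentioned here is last-touch attribution: credit each new subscription to the traffic source and the article of the reader’s most recent pageview before subscribing. The Python sketch below is a minimal illustration with made-up events and a hypothetical event format; the interview does not say which attribution method The Atlantic actually uses.

```python
from collections import Counter


def last_touch_attribution(events):
    """Credit each subscription to the user's most recent prior pageview.

    `events` is a list of dicts with a hypothetical shape:
    {"user", "ts", "type"} plus {"source", "article"} for pageviews.
    """
    by_source, by_article = Counter(), Counter()
    last_pageview = {}  # user -> most recent pageview event seen so far
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["type"] == "pageview":
            last_pageview[event["user"]] = event
        elif event["type"] == "subscribe":
            pageview = last_pageview.get(event["user"])
            if pageview is not None:
                by_source[pageview["source"]] += 1
                by_article[pageview["article"]] += 1
    return by_source, by_article


events = [
    {"user": "u1", "ts": 1, "type": "pageview", "source": "facebook", "article": "essay-a"},
    {"user": "u2", "ts": 2, "type": "pageview", "source": "paid", "article": "essay-a"},
    {"user": "u1", "ts": 3, "type": "pageview", "source": "newsletter", "article": "essay-b"},
    {"user": "u3", "ts": 4, "type": "pageview", "source": "facebook", "article": "essay-a"},
    {"user": "u1", "ts": 5, "type": "subscribe"},
    {"user": "u2", "ts": 6, "type": "subscribe"},
]
by_source, by_article = last_touch_attribution(events)
print(dict(by_source), dict(by_article))
```

Last-touch is the crudest option; a real analysis might spread credit across several earlier visits, but the same event log would feed it.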
Kostas Pardalis 30:29
That’s super interesting. Are there any other behavioral data that you’re tracking? Are you interested in how your user interacts with both the application and your content? And how do you measure that?
Jenna Lemonias 30:41
We probably use the standard … we look at sessions, we look at how often people are looking at the homepage. We find that homepage traffic is often a pretty clear sign that someone is more likely to subscribe, because they’re not just reading an article, but they’re actually going to the homepage to see what else we have to offer. And then some of our articles are short, but we also are probably known for having a lot of long-form articles as well. And so we definitely pay attention to scroll depth to see if people are getting to the end of the articles.
Eric Dodds 31:19
Jenna, you mentioned, you know, if an article goes viral on Facebook, have you done any work around patterns of articles that tend to go viral?
Jenna Lemonias 31:32
We have. And of course, this happened a lot in the past year, because there were so many …
Eric Dodds 31:42
Yeah, many, many things to write about, that everyone wanted and needed to read about.
Jenna Lemonias 31:47
Exactly, exactly. And I mean, it’s not a case where we can learn about what happened and then replicate it, right? Like you can’t really replicate a story that goes viral, because you wouldn’t necessarily expect it. There are always way too many ingredients to really predict that. But what we have looked at is, articles could go viral on Facebook, or Google search, or Twitter, and then different … so the pageviews could be higher in one or the other. But then in a lot of cases, we’ll get a higher conversion rate from a source that might not get as many pageviews. And so we’ve absolutely seen some really interesting trends there.
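The pageviews-versus-conversion point is easy to see with a small calculation. The Python sketch below uses entirely invented numbers (they are not The Atlantic’s data) to show how the source with the most pageviews need not be the one with the highest conversion rate.

```python
def conversion_rates(stats):
    """Map each source to subscriptions per pageview.

    `stats` maps source -> (pageviews, subscriptions); sources with
    zero pageviews are skipped to avoid dividing by zero.
    """
    return {
        source: subscriptions / pageviews
        for source, (pageviews, subscriptions) in stats.items()
        if pageviews > 0
    }


# Invented numbers, purely for illustration.
stats = {
    "facebook": (200_000, 100),
    "google_search": (50_000, 75),
    "twitter": (20_000, 10),
}
rates = conversion_rates(stats)
top_traffic = max(stats, key=lambda source: stats[source][0])
top_conversion = max(rates, key=rates.get)
print(top_traffic, top_conversion)
```

In this made-up case Facebook drives the most pageviews, while Google search converts at three times the rate.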
Eric Dodds 32:38
That is fascinating. It makes sense that you’re not, you know, trying to produce, like listicle clickbait that goes viral, right, because that’s not the type of content that The Atlantic produces. And so it is interesting to hear you say, you know, you can’t necessarily, like influence, like, you write this and it will go viral. But it’s fascinating that the metrics around the different platforms, and then the various forms of “performance,” pageviews versus subscriptions, have a lot of variance. That is so interesting.
Kostas Pardalis 33:13
So Jenna, let’s move forward and talk a little bit more about you, as we are reaching the final part of our conversation today. Can you share with us a little bit more about your whole experience so far, going from the academic environment to a FinTech company in the Bay Area, and then to The Atlantic? I mean, what are the common things that you see? You mentioned something very interesting in the beginning about the difference in the focus of the work and how important the impact is in industry compared to the scientific world. But what are the other differences? And more importantly, what are the common things that you see there?
Jenna Lemonias 33:52
I’d say the work itself is probably surprisingly similar. You know, to a certain extent, data is data, or data are data. And so I think once you’re really familiar with data and you know how to ask questions of data, you know when something doesn’t look right, and you know how to essentially be creative with analyzing data, I think you can probably use any sort of data. So yeah, I’d say that that’s a really big similarity. I think one thing I’ve enjoyed outside of academia is being able to work with a really wide range of stakeholders. Not everyone has the same data literacy, or the same goals really in understanding the data. And so it’s been really great to flex that muscle.
Kostas Pardalis 34:54
Yeah, that’s very interesting. I remember a friend of mine who did his PhD in machine learning and computer vision and ended up working for Facebook. At some point, we met and he was trying to describe to me what he was doing there, and pretty much the work was not that different between the two environments. I mean, he was still writing papers and doing publications and creating models and stuff like that. One major difference that I saw there, and what he was trying to communicate to me, was, “I’m much more stressed now.” Because, okay, it’s one thing to write a paper and have a peer-reviewed publication. And it’s another thing to know at the same time that this model is going to be affecting the experience of billions of people, with Facebook for example. So I found this extremely, extremely interesting. That makes things, of course, more stressful, but also, I think, much, much more exciting, in my opinion.
Jenna Lemonias 35:59
Yeah, I 100% agree with that. I think in academia, people think about making an impact on a very, very long timescale. And so things definitely feel a lot more urgent, I would say on a day to day basis. But I like that.
Kostas Pardalis 36:19
Yeah. Yeah. Makes sense. So after being outside academia for, I guess, some time now, how do you feel about this decision? How happy are you, let’s say, and how do you reflect back on when you had to make the decision to go to industry?
Jenna Lemonias 36:35
I am really happy in data science. I’ve really enjoyed, as I said, being able to work with a wide range of people and flexing more muscles, and learning how to communicate what we’re doing to a wide range of people. But I am absolutely still a big proponent of basic scientific research.
Eric Dodds 36:56
All right, one last question for you, Jenna, before we wrap up, what are the big projects you have coming down the pipeline at The Atlantic from the data science perspective? You mentioned some, but what are some that haven’t started yet that you’re particularly excited about?
Jenna Lemonias 37:13
I mentioned, to a certain extent, a recommendation engine. But I can talk a little bit more about that, because it has a lot of different potential use cases, from sending out personalized emails with reading recommendations to driving recirculation modules on our site. And this has been … we’re really only at the beginning of this project. But it’s been fun because it’s required us to partner really closely with engineering, because of course, our team can build out this model or this algorithm, but there are so many more steps that have to happen in order to get the results of the model onto the site and make sure the right person is seeing the right recommendation. So that’s something we’re really excited about seeing to fruition.
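As a rough illustration of what the simplest version of such a recommendation model might look like, here is a minimal content-based sketch in Python. It is an assumption-laden toy: the episode does not say which approach The Atlantic is taking, and the article IDs, tags, and the `recommend` function are all hypothetical. It ranks unread articles by how much their topic tags overlap with articles a reader has already read.

```python
# Hypothetical article metadata: article id -> set of topic tags.
# IDs and tags are invented for illustration.
ARTICLE_TAGS = {
    "a1": {"politics", "elections", "history"},
    "a2": {"politics", "elections", "polling"},
    "a3": {"science", "space", "astronomy"},
    "a4": {"history", "culture"},
}


def jaccard(a, b):
    """Jaccard similarity: number of shared tags divided by total distinct tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(read_ids, article_tags, k=2):
    """Return up to k unread articles, ranked by their best tag similarity
    to any article the reader has already read."""
    scores = {}
    for candidate, tags in article_tags.items():
        if candidate in read_ids:
            continue  # never recommend something already read
        scores[candidate] = max(
            (jaccard(tags, article_tags[r]) for r in read_ids if r in article_tags),
            default=0.0,
        )
    # Highest score first; ties broken by article id for a stable order.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:k]


recommendations = recommend({"a1"}, ARTICLE_TAGS)
print(recommendations)
```

Even in a toy like this, the ranked list is only the model’s output. Getting it into an email or a recirculation module on the site is the separate engineering work described above.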
Eric Dodds 38:02
Yeah, it is. That’s something that we hear about a lot, and I think it’s a really encouraging trend. Sometimes you hear it referred to as the last mile. You know, if you think about personalization, I mean, it’s certainly a huge accomplishment to build a model that does that well. But then you have to get the results of that model into an actual user experience, and even a pre-existing user experience many times, right? So you’re sort of fitting the results of a model into a pre-existing user experience. And from a technological and design perspective, that is non-trivial.
Jenna Lemonias 38:38
Yeah, it’s a great example of how we cannot do our work in a silo. And we really need to collaborate with other teams pretty early on to understand exactly how we should complete this so that it can be implemented correctly.
Eric Dodds 38:54
Very cool. Well, one last follow-up question on that. I lied. I have one more question. Have you developed a rhythm … and I know it works differently in different companies, and especially based on the business model and the industry, different methodologies work better or worse depending on the situation … for collaborating early on when something like recommendations originates? Is the initial concept usually originating from data science? Or is it maybe originating from product, who says, you know, we have an idea for something that may improve the user experience that involves personalization? Or is it both?
Jenna Lemonias 39:34
That’s a good question. And I would say that we’ve had projects start in both places. So our attribution modeling work was definitely a request from more of the marketing side. For recommendations, this was something that data science was interested in. But then of course, we had to essentially find a product stakeholder who would evangelize it and see it through and make sure it gets onto the roadmap of the product and engineering team.
Eric Dodds 40:07
Very cool. I mean, there are way smarter people than me that sort of do, you know, org and operational design. But I think from all of the companies that we’ve talked to on the show, it seems like it’s very healthy to have a symbiotic relationship where data science is pushing value back into the organization in the form of ideas, but then people who are building the experience for the users are bringing needs to data science. And that seems to be sort of a relational dynamic among the teams that produces just really, really interesting and valuable experiences for users.
Jenna Lemonias 40:46
Yeah, I completely agree.
Eric Dodds 40:48
Well, Jenna, this has been absolutely wonderful. Thank you so much for joining us on the show. We’d love to have you back on in the future, especially when you’ve had a chance to work on some of the personalization stuff to hear more about that. And we may even prevail upon you to ask someone from data engineering to join the show, because I know Kostas has burning questions about using both Google Cloud and Amazon.
Jenna Lemonias 41:13
Sure. Thank you so much.
Eric Dodds 41:16
As always, we learned so much. It goes without saying, but the academic background is something we just need to do an episode on, because it’s been such a pervasive theme throughout the show. I think what was really interesting to me was to hear about the way that they seem to think about the relationship between sort of hardcore data-driven functions like data science, and the more subjective functions that are editorial, and that are very human-based. And I think, you know, thinking back on conversations with other data scientists we’ve talked to, the human element continues to seem to be one of the most interesting things that data scientists deal with. And it was really encouraging to me actually to hear about the relationship that they seem to have at The Atlantic.
Kostas Pardalis 42:06
Yeah, absolutely. I totally agree on that. And I think it’s a great example of how technology can work together with the human factor inside a company. Outside of this, I was extremely excited to hear that a company as old as The Atlantic, like 160 years old, is actually a technology company today. It’s crazy to think about what kind of transformation this organization had to go through in these 160 years, and how they are still able to adapt, which is amazing.
Eric Dodds 42:37
It is pretty wild to think about the state of basic technology around things like electricity and other things like that when The Atlantic started, and now they operate like a Silicon Valley product. That is fascinating. All right. Well, we could say so much more. But thanks again for joining us on The Data Stack Show. Be sure to subscribe on your favorite podcast network to get notified of new episodes every week, and we will catch you next time.
Eric Dodds 43:06
The Data Stack Show is brought to you by RudderStack, the complete customer data pipeline solution. Learn more at RudderStack.com.