To understand what makes BuzzFeed tick, you need to know how Dao Nguyen thinks about data.
Because Dao Nguyen is the publisher of BuzzFeed, which has annual revenue in the hundreds of millions of dollars, you might expect her to be getting the best tables at fancy restaurants in order to land advertising deals with chief marketing officers. Instead, Nguyen meets me at a Le Pain Quotidien cafe wearing a grey fleece with the Dow Jones logo on it. She’s every bit the down-to-earth geek you’d expect to be building BuzzFeed’s technology and data infrastructure.
It turns out that BuzzFeed founder and CEO Jonah Peretti bestowed the title on her based on a much older definition of a publisher’s role. “Traditionally that meant owning a printing press and dealing with delivery trucks and newsstands,” Nguyen told me. “Whereas with digital media, getting your content to the public is all about your technical platform, your distribution plans, on social networks or other technical platforms.”
For BuzzFeed, the newsstand (and sometimes even the printing press) is your social feed, and its delivery trucks are you sharing a story. The digital circulation for a piece of content is constantly being monitored and communicated back to the organization through dashboards, emails, and Slack.
“What is the competitive advantage that you can gain as a publisher today?” says Peretti. As the value of content approaches zero, “Having technology, data science, and being able to know how to manage, optimize and coordinate your publishing is the thing that gives you a competitive advantage.”
Here are some of the highlights from the interviews with Nguyen I collected as I reported this month’s cover story on BuzzFeed’s growing media empire.
Dao Nguyen: If you look at the word “publishing,” actually meaning making content available to the public, it used to be you had to have all these things in place, including advertising. You no longer need those things. Making content available to the public is entirely a technical talent.
Noah Robischon: Have you ever thought you would want to be a publisher? What did you think you would be?
I always wanted to work in computers. I’m going to be 42 next week. I’ve been coding since I was seven. I’ve always loved programming and working on computers.
What was the first thing you programmed?
When I was 7? The first program I was really excited about was a low-res stick figure doing jumping jacks. Animated, but totally low-res, not high-res graphics. Basically, he was doing jumping jacks. I remember in school I had those printouts with those weird white strips with the little holes on the sides. I had this printout of this program I was working on and I was trying to debug it. I looked at it in class under my desk and the teacher says, “What are you doing?” She comes over and sees me reading this computer printout. It was my program. She said, “Did you write that?” And I said, “Yes.” So she then asked me to program a thing that quizzed students on state capitals.
Jumping jacks. Fascinating.
But I’ve never been super ambitious, actually.
In what sense?
When you say ambition, what do you consider ambition?
I’ve never wanted to start a company. I always knew it would be incredibly difficult. If you start a company, a lot of what you’re doing is non-technical. It’s advertising. I’ve had this amazing career, and it’s difficult to explain why. [Laughs]
When I worked in France for a long time, I eventually became the CEO of the Internet subsidiary of the newspaper [Le Monde]. I refer to that whole period as, I was the accidental CEO. Coming around and suddenly I was there and there it was.
How did you end up in France?
That’s a fun story. It was 2000 and the Internet bubble was bursting in New York. I was working for an Internet start-up and I was having to fire all of my friends. I said, “I don’t need this, why am I doing this?” I decided I wanted to learn French, so I said to my then boyfriend, now husband, “I’m going to quit, move to France, eat cheese, drink wine, and sit out the Internet recession for a year.” He’s like, “Great! Let’s go.”
I ended up getting this job at Le Monde IT as a technical project manager. I was like, “I can do this job in my sleep, I just don’t speak any French, that’s why it will be a big challenge for all of us.”
I signed a one-year contract. In France, it’s very hard to get hired because most people want a permanent, lifetime, un-fireable contract. I don’t want that contract, I want a one-year contract. They were happy to have me and after the year they switched me over to another contract.
Was that the time you started to understand how media worked?
Was there a particular moment where things clicked into place? Where you understood both what was wrong and how to fix it? Was it a more slow, testing experimentation that brought on each insight along the path?
I had a lot of great mentors when I was there who had thought a lot about news and news consumption. One of the things I learned at Le Monde that I think is still true today is that news consumption is different from consuming other products. It’s not like one day someone will just wake up and go, “Today, I want to be informed,” the way they might wake up and go, “I want to have chocolate chip ice cream today,” so they go out and buy ice cream. News isn’t like that. Nobody just wakes up and goes: “I want to be informed. I’ve never been informed before, I was informed in the past, I was informed a bunch a couple of years ago, it was pretty cool, maybe I’ll be informed again. I’m going to go out and purchase something or do something to inform myself.”
No. It’s a habit. It’s a person’s identity. The thing that will shape the most about how you are informed at all today is how you were informed yesterday. It’s a habit. If you read The New York Times front page every day to get informed, you would probably read it tomorrow to become informed.
Thinking about actual people and how they think about news is something I learned when I got started in this industry in France. It’s important, especially when people who work in data say, “Oh, you’re a good unique visitor.” People forget that what you should be thinking about is the person that number represents, and what they are doing. That is an example to me.
Is data the hub for these spokes of the company or do you look at it a different way? Describe how you see the way data interacts with the different pieces.
I think that’s a good question because I think it’s a strange thing. Depending on who I talk to, sometimes I say to people, “[BuzzFeed] uses data much more than you think.” And then depending on the person, sometimes I’ll say, “No, no, no, it uses the data much less than you think.” I think there are some myths. One myth is data scientists are telling reporters what to write and what to cover. That’s totally a myth. I’d like to dispel it at every moment I can. That’s totally untrue. I take no responsibility for what these insane reporters cover. They just come up with all that themselves.
I assume that people look at a BuzzFeed story that did well about “These Are 27 Sandwiches That Are Better Than a Boyfriend,” and think there must be some deep data science behind sandwiches, and sandwiches and boyfriends, right? Actually that requires a creative mind more than anything, you know?
That myth stems from people’s desire to have a black and white explanation, a simple explanation. The reality is that things are more nuanced than you would like them to be, and more complicated than you would like them to be. And so it’s the easy way out to have a very sort of simplistic view. The key is, when I speak to editors and people in general, they have a very healthy view of data. They understand there are many things data can tell them. But they also understand there are many things data can’t tell them.
You have to use a lot of intuition and a lot of creativity, and the data is one part of the input you take in to think about why this could do well, why do people share it. The data never tells you why anything happens. Data will tell you, if you’re very lucky, what happened. It won’t ever tell you why. If you want to understand why, that requires a different set of skills, largely in your brain and in your heart. Why did this story resonate with people?
Reading comments is often a very good barometer—you can’t only use comments, you can’t only use data, you can’t only use anything. You can’t only use your own intuition, either. It has to be all of those things you use. When talking about things, “Oh, maybe it’s this. Maybe it’s that.” Then we can test it. “Let’s test whether or not this hunch I have is right based on something I’ve seen out there.”
Which is why for us, publishing volume is actually really important. It’s not that we want to crank stuff out there for no reason at all. The more you publish, the more opportunities you have to look at things that are happening, read comments, have a new hypothesis, test a hypothesis. And if you can do that relatively quickly, then you remember what you were testing. If two weeks go by and I haven’t touched a thing, it’s, “What was that thing I was trying to test?” But if you’re publishing every day and get a lot of signals that are both quantitative and qualitative, and anecdotal even, you can begin to form ideas about content: how it should be made, how it should be presented, where it should be distributed, and whether or not that has an effect.
There’s data, which is quantitative. Then there’s qualitative information you can gather.
Reading comments, reading tweets, reading articles about your article—all of that is qualitative. I feel like the third part that is necessary, critical, is the culture encouraging all that. That, in many ways, is one of our biggest competitive advantages. Our staff and our culture is one that encourages this, and praises it, and has a pretty healthy appreciation of data as well as a healthy appreciation of other things, like intuition.
The stereotype of a traditional reporter is, “Only what I think matters and what I think is important matters and I’m not going to look at any other signals.” And that’s, I guess, one kind of intuition. But the humility that comes with, “Oh, I’m just learning about my audience, learning about what is interesting.” That is something we actively seek out in people.
I don’t think that BuzzFeed has the monopoly on data. I just think we use it well.
Given your curiosity about the human condition, how does data help you understand the human condition?
I think data helps people affirm, deny or continue to explore hypotheses about the human condition.
You said confirm, deny or continue to explore. Interesting. You could also develop social science. There are many methods of doing that.
Yes, but at scale.
Most people don’t think of data being able to do that. Let me put it this way: who else out there is using data the way you think about it?
Probably Netflix. Like I said, I think we’re still at the beginning. We still have a pretty rudimentary apparatus in place, and it’s okay because you also want the creative people on the other end to realize it’s just one input. No one’s a slave to it. A lot of it grows out of the fact we have people who have grown out of the video side, people who used to make YouTube videos. If you make a YouTube video, you immediately get feedback: how many comments, what they said, how they liked it, whether it was shared. Talent that is emerging now is already very familiar and comfortable with the idea you receive these signals back and it tells you something or suggests something to you.
I think our competitive advantage is having a pretty rounded view of that, and not making it out to be some sort of magic solution and getting all wrapped up in it.
There’s always a curve. There are always very few posts that get a lot of traffic. That’s totally great, actually, because if we didn’t have enough posts that failed, it would mean we’re not trying enough things. One of the first all-company presentations I did when I came to BuzzFeed, a long time ago now, I guess three years ago, is called the Dot Presentation. People still ask me about it. I just took all of the posts and I bucketed them into traffic buckets, and the size of the dot was the number of posts that were in each bucket. What I showed was that over time, the size of the dots started to increase and got higher. There were more posts getting more traffic. But the reality is the super viral ones, like the million-plus ones, are always going to be very small.
I said, “That’s okay, because our sweet spot is actually in this other bucket, 100-250k. That’s our sweet spot. That’s going to allow us to make posts for the next bucket, and that allows us to make posts for the next bucket.”
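The bucketing behind the Dot Presentation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code BuzzFeed actually runs: apart from the 100k-250k and million-plus ranges Nguyen mentions, the bucket edges are hypothetical, the sample posts are invented, and the names `bucket_for` and `dot_sizes` are made up here.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bucket edges, in views. Only the 100k-250k and 1M+ ranges
# come from the interview; the other edges are assumptions for illustration.
BUCKET_EDGES = [1_000, 10_000, 100_000, 250_000, 1_000_000]
BUCKET_LABELS = ["<1k", "1k-10k", "10k-100k", "100k-250k", "250k-1M", "1M+"]


def bucket_for(views):
    """Return the label of the traffic bucket a view count falls into."""
    return BUCKET_LABELS[bisect_right(BUCKET_EDGES, views)]


def dot_sizes(posts):
    """Count posts per (month, bucket); each count is the size of one dot."""
    return Counter((month, bucket_for(views)) for month, views in posts)


# Invented sample data: one (month, views) pair per post.
posts = [
    ("month-1", 400), ("month-1", 800), ("month-1", 120_000),
    ("month-2", 600), ("month-2", 150_000), ("month-2", 240_000),
    ("month-2", 2_000_000),
]

sizes = dot_sizes(posts)
for (month, bucket), count in sorted(sizes.items()):
    print(month, bucket, count)
```

Drawing each count as a dot, with one column per month, would give a chart of the kind she describes, where the dots in the higher buckets grow over time while the million-plus dot stays small.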
How did you figure out that was the right bucket versus the bucket two rungs up?
You look at it over time. “Oh here it is,” for each month you can see how it changes. It wasn’t the biggest bucket, the biggest bucket was the failure bucket, the bucket with no traffic. But that’s fine too. I’m not embarrassed to say it. The bucket that’s shared pretty well, did pretty well, wasn’t meant to be viral, but still performed pretty well, it was a solid performer. Don’t let all the attention get given to these viral ones. The attention should be on the bread and butter.
Why should it be?
That’s where you’re learning.
Why do you learn more from those than the mega-viral hits?
There’s more of them, the sample size is higher.
I have a good example about that, because it’s something that I was involved in personally. The first post I wrote on BuzzFeed was called “27 signs you were raised by immigrant parents.” It was published two years ago now, so I feel kind of terrible still talking about it. The point of it was it was incredibly viral, got like 2 million views; it got like 1 million views in the first 12 hours. Two and a half years ago we were a very small site, so it was a big deal. It wasn’t the first post that we ever wrote about having immigrant parents, there were previous posts called “Signs you were raised by immigrant parents.” There was one that was “Signs you were raised by Pakistani immigrant parents.” There were many versions that all did pretty well, but this one blew them out of the water. That’s because the concept was piggybacking off other people’s work. The whole post was gently mocking your parents. Like “Your dishwasher is only used to dry dishes, not wash them.” Or “Your mother is always telling you you need to wear a sweater.” And then the very last one, number 27, it was the opposite. It was much more like, “You realize your parents sacrificed so much to bring you to this country and you wouldn’t change it for the world.” You love them. Sort of the opposite of everything. Then you read the comments. The comments were like, “I was laughing so hard until I got to number 27 and now I’m crying.” Or “Number 27 made me send this to my parents.” Many of these comments were basically saying, “Ha ha ha, BRB crying.”
Without any official communications, editorial style, people immediately started employing this technique. No one said anything. Everyone read the post, they read the comments. It had the kicker at the end that made you want to share it with someone. When you share it then it makes you look good because you’re making fun of your parents and laughing with your siblings, or sharing with your boyfriend or girlfriend, “This is my life,” but also, “I love my parents.” That’s something you learned from the comments. If you didn’t read the comments, it was like, “Asians share a lot. More than the Pakistanis!”
We track all of our Facebook activity, obviously, and all of our page activity—we have 90 Facebook pages, that’s insane. We track all of the posts, the stats they generate. We can look at traditional things, like when is the best time to post? And how does using video for certain Facebook pages affect fan growth? And how the rates are different between pages. We can use that to optimize what pages we post videos on. And then how it gets re-shared by bigger pages. Like, how do you use a big page to grow a small page? What media do you want to use? Why do some fans on some pages seem to respond better to videos versus other pages? What is the breakdown? Is it a demo breakdown?
We work really closely with the social team, which is in edit, to talk about what we think is happening… It helps because it’s a really direct feedback loop that is not currently supported by Facebook’s tools.
Are you using Facebook insights to get the raw data?
Yeah, we call the APIs. We don’t normally compare the pages to each other but more like this happened on this page—sometimes we’ll compare that but if it’s really the same content, like this same thing was posted to multiple pages. We’ve grown a lot of small pages into bigger pages. How can we do that? Can we replicate that all the time? That requires a lot of understanding that there are many questions that data can kind of give pointers to, but all you can do is try things out. It’s nuanced, there’s no magic formula, a lot of it is based on good content.
Dan Oshinsky, our newsletter editor, reports up to edit. But when he started he reported to me. That was always the deal. You come in, you report to me, we make a product together, talk to you about data. You’re still obviously working with edit, make sure it’s the right voice. But you work for me and when we feel that you’re ready, then you’ll move over to edit. For newsletter data, we use Campaign Monitor, and there are some tie-ins to the analytics product but not that many. From a data perspective, he is charged with and is free to interpret the data in the way he feels will most improve his product and his readership.
Is he looking at click-through rate?
For a long time, it was: you want to get subscribers up, you want to get clicks up, you want to get unsubscribes down. But one of the things we talk about all the time is there is no one metric you are optimizing for. Anyone who just optimizes to one metric is going to eventually have a problem. This obsession over time spent. In some way I feel that sort of rhetoric has died down. There really is no one metric.
On the one hand, you want subscribers to go up. You don’t want that number to go up and have all the other numbers decline. We will go and routinely purge our list. If you don’t open the newsletter for X number of months, then you get an email saying, “You’re going to get removed from this list unless you opt in in the next 24 hours.” When that happens, newsletter numbers go down. I don’t need newsletter subscriber numbers to be up for the sake of being up. I want to actually get people to look at the stories, to read them and share them.
So his subscriber rate would go down, but his clickthrough numbers would go up, and then his clicks back to BuzzFeed would remain flat because you haven’t changed anything, right? [Laughter]
But your data is better.
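The arithmetic of that purge can be shown with a small Python sketch. It is a simplification built on assumptions: the six-month threshold stands in for the unspecified “X number of months,” the opt-in warning email is left out, the subscriber records are invented, and `purge_inactive` and `click_through_rate` are hypothetical names rather than anything from BuzzFeed or Campaign Monitor.

```python
def purge_inactive(subscribers, max_inactive_months):
    """Keep only subscribers who opened a newsletter recently enough."""
    return [s for s in subscribers
            if s["months_since_open"] <= max_inactive_months]


def total_clicks(subscribers):
    """Total clicks back to the site from one send."""
    return sum(s["clicks"] for s in subscribers)


def click_through_rate(subscribers):
    """Clicks per subscriber for one send."""
    return total_clicks(subscribers) / len(subscribers)


# Invented list: two active readers and two who stopped opening long ago.
subscribers = [
    {"months_since_open": 0, "clicks": 2},
    {"months_since_open": 1, "clicks": 1},
    {"months_since_open": 9, "clicks": 0},
    {"months_since_open": 14, "clicks": 0},
]

# The threshold of 6 months is an assumption; the interview only says "X".
kept = purge_inactive(subscribers, max_inactive_months=6)

print(len(subscribers), len(kept))
print(total_clicks(subscribers), total_clicks(kept))
print(click_through_rate(subscribers), click_through_rate(kept))
```

In this toy list the subscriber count halves and the click-through rate doubles, while the total clicks back to the site do not move, which is the pattern described in the exchange above.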
One of the things BuzzFeed has done really well, and I’m sure this was all Jonah’s plan from the beginning, is that the business side and editorial side are really aligned. They’re really aligned because we don’t sell banner ads in newsletters. We do monetize the newsletter, we just don’t sell banner ads. The idea that you’d like the subscriber numbers to go up, so you can sell it at a higher rate, is not applicable because neither side cares about that, right? Because native advertising is about something else, it’s about getting people to look at the actual sponsored posts or branded video that we have made. And also on the editorial side, the goal isn’t to just be in a bunch of people’s inboxes, it’s to go look at the content and learn something about it or cook something or whatever. They’re actually aligned and I feel like we’re so fortunate in that. Because in so many media industries the two are not aligned.
That’s what makes metrics and data more complicated at those organizations. One set of people feel that one metric is important and the other feels it’s not important, or is much less important.
They’re almost adversarial.
Sometimes. I feel like the role of data we have is a luxury. A luxury out of the fortune or genius, this business genius, this vision where you don’t have that tension. Where both sides are trying to achieve the same thing. Sometimes I feel badly for organizations where the data team is caught in the middle and they don’t know how to talk about it to their constituencies because the constituencies have such diverging values. I feel like mostly my job is easy, but it’s easier because of the way that’s set up.
I feel that is underreported. The fact that native advertising is a better user experience, that’s reported on. But the thing that is not reported on is that native advertising aligns groups within the organization in a way that makes everyone more effective, and advertisers happy.