Truth in the Age of Twitter

Wellesley computer scientists are studying how we can be less susceptible to misinformation online and how to keep falsehoods from spreading

By Catherine Caruso ’10
Illustration by Rune Fisker

Winter 2020

Illustration of a woman walking through a maze of images related to news (flooding, pollution, vaccines)

Social media platforms like Twitter and Facebook have quickly become part of daily life for many of us, offering an endless conveyor belt of entertainment while we’re riding the subway, relaxing on the couch, or sitting in the waiting room at the doctor’s office. Yet this stream of social media content is a double-edged sword: Not only does it have the potential to entertain us and expand our minds, but it also reveals the limitations of how we process the information we encounter.

Social media content can trick us into believing something that isn’t true, or confirm the biases we already have, further polarizing us in our beliefs. There is perhaps no place where this problem is more concerning than in politics, where misinformation has major implications for our democratic process. But why are we, as a society, so susceptible to misinformation? How does misinformation spread? What can we do to combat it? These are the very questions that Wellesley computer science professors Takis Metaxas and Eni Mustafaraj have been studying for more than a decade, since a discovery on Twitter in January 2010 drew them into a social media mystery.

It all started with a special election in Massachusetts: Sen. Ted Kennedy had died, and Democrat Martha Coakley and Republican Scott Brown were in a battle for his vacant seat. The election, while important for the state, wasn’t expected to have much reach beyond it. And yet, Metaxas and Mustafaraj noticed something strange: The candidates were trending on worldwide Twitter, suggesting that they were being talked about by large numbers of people everywhere.

So why was this local election suddenly global news? To find out, the pair began collecting tweets about the candidates, which led to what Metaxas calls “a very shocking discovery”: Many had been produced by Twitter accounts created just days before the election. And these accounts “were tweeting the same link over and over again to increase the volume of tweets and make this topic trend,” Mustafaraj says, with accounts such as @CoakleyAgainstU and @CoakleySaidThat targeting Coakley.

After analyzing over 185,000 tweets, Metaxas and Mustafaraj realized that the suspicious tweets weren’t from new users who were particularly invested in the Massachusetts election. Rather, many of the accounts had been created by anti-Coakley groups, such as the conservative, Iowa-based American Future Fund, with the goal of quickly spreading negative information about the candidate. In fact, nine accounts created by the American Future Fund sent 929 tweets in two hours, reaching about 60,000 people before Twitter shut them down. Google’s real-time search function amplified the effect: At the time, when people Googled the candidates, tweets about the candidates (instead of tweets by them, as is now the case) came up at the top of the results.
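The pattern described here (accounts only days old, each repeating one link at high volume) can be illustrated with a short Python sketch. It is not the researchers' actual method: the `Tweet` record, the `flag_suspicious_accounts` function, and both thresholds are hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Tweet:
    account: str
    account_created: datetime
    posted: datetime
    link: str


def flag_suspicious_accounts(tweets, election_day, max_age_days=7, min_repeats=5):
    """Return accounts that are new and repeat a single link many times.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    by_account = defaultdict(list)
    for tweet in tweets:
        by_account[tweet.account].append(tweet)

    flagged = set()
    for account, items in by_account.items():
        age = election_day - items[0].account_created
        is_new = age <= timedelta(days=max_age_days)
        # How many times this account posted its most-repeated link.
        top_repeats = Counter(t.link for t in items).most_common(1)[0][1]
        if is_new and top_repeats >= min_repeats:
            flagged.add(account)
    return flagged


# Rate implied by the article's figures: 929 tweets, 9 accounts, 2 hours.
tweets_per_account_per_hour = 929 / 9 / 2
```

Using the figures above, each of the nine accounts averaged roughly 52 tweets an hour, a pace that is hard to reconcile with an ordinary new user.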

Metaxas and Mustafaraj dubbed the attack a “Twitter bomb” in a paper on their findings. (The paper, “From Obscurity to Prominence in Minutes: Political Speech and Real-Time Search,” received the WebScience 2010 Best Paper Prize and is available online.) “We were not expecting to find anything weird with the data we collected, but suddenly we found the first set of bots that were created to influence the voters in Massachusetts,” Metaxas says.

“No one had used Twitter at that point to target an election,” Mustafaraj adds. Online political manipulation itself was not new, however: As early as 2006, both parties were harnessing Google search to promote negative information about their opponents, and Metaxas says that given the high stakes, it would have been surprising if researchers had not found such attempts to influence elections.

Their discovery left them wondering how else Twitter was being leveraged to spread misinformation to the masses. To find out, they built TwitterTrails, an artificial intelligence system based on machine learning that finds and analyzes all tweets related to a particular story going back a week in time.

This analysis is designed to reveal whether a story is true or false by measuring key metrics related to spread and skepticism: who first tweeted about it, when it began gaining momentum, who is spreading it, whether different groups tweeting about it are talking to each other, and how skeptical people are about its accuracy. The idea, Metaxas says, is that if users believe a story is true (and that often maps to a story actually being fact), they will share it widely and won’t express much skepticism about its veracity. However, if readers aren’t sure whether a story is true, they either won’t share it widely, so it won’t spread quickly on Twitter, or they will share it while saying they are skeptical about it. Developed primarily for journalists, TwitterTrails has been used to analyze around 1,800 stories on topics ranging from whether Putin’s motorcade was shaped like a phallus (it wasn’t) to whether NASA published a study that debunks climate change (it didn’t), and it is over 90 percent accurate at discerning whether stories are true or false. (For more about TwitterTrails, visit twittertrails.com.)
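The reasoning in this paragraph can be written out as a toy decision rule. The sketch below is only a simplified reading of the spread-and-skepticism idea, not the TwitterTrails implementation: the `StorySignals` record, the `assess_story` function, the 0-to-1 scales, and both cutoffs are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StorySignals:
    spread: float      # 0.0 (barely shared) to 1.0 (shared very widely)
    skepticism: float  # share of tweets expressing doubt, 0.0 to 1.0


def assess_story(signals, spread_cutoff=0.5, skepticism_cutoff=0.2):
    """Toy rule: wide sharing plus little doubt suggests a true story.

    The cutoffs are made-up values, chosen only to make the rule concrete.
    """
    widely_shared = signals.spread >= spread_cutoff
    little_doubt = signals.skepticism < skepticism_cutoff
    if widely_shared and little_doubt:
        return "likely true"
    # Either the story did not travel far, or it traveled with doubts attached.
    return "doubtful"
```

A rule like this reads only how an audience behaves; it never inspects the claim itself, so its verdict is only as reliable as the audience's judgment.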

Despite the system’s success, Metaxas became interested in when and why TwitterTrails fails. “The success rate is quite remarkable, but it is not enough; that was my big surprise,” Metaxas says. Perhaps the most glaring example was Pizzagate, a conspiracy theory from the 2016 presidential election alleging that Hillary Rodham Clinton ’69 and other Democrats were running a human trafficking ring in the basement of a pizza restaurant. The story was false, yet TwitterTrails thought it might be true. The system was misled by two signals: the story’s rapid spread after it was mentioned by a pro-government Turkish newspaper eager to highlight political problems in other countries, and a lack of skepticism, since those talking about it already believed it.

‘If you create an environment [on social media platforms] … where everything looks the same and sounds the same, you create this confusion of what is the signal and what is the noise.’

—Eni Mustafaraj, assistant professor of computer science

This example highlights a key condition that affects how information spreads on social media: polarization. “In 2012, one of the shocking things that happened to me was being exposed to some research that said, if you give polarized people the same set of data, they become more polarized,” Metaxas recalls. The reason? People look selectively at the data, accepting the parts that confirm their prior beliefs, while rejecting the parts that don’t. The challenge of polarization, he adds, is that it not only relates to what information we believe, but it also has a strong emotional component, which prevents us from logically and impartially assessing new information. “If you have a polarized audience, it is much easier to spread fake news and rumors—people essentially would not listen with an open mind to what the other side is doing,” Metaxas says.

Now, in collaboration with the University of Oxford Centre for Technology and Global Affairs, Metaxas is using TwitterTrails to measure polarization within political contexts such as Brexit and the upcoming 2020 election in the United States. “We’re trying to study how polarization grows, and what affects polarization: Is it affected by only real-life news, or is it also influenced by false rumors?” Metaxas says. “I think the contribution we can make is to detect highly polarized situations in politics, and maybe figure out interventions to help people have a more reasonable dialogue.”

On an individual level, he says, one way to combat polarization is with critical thinking. However, he recognizes that this can be challenging. He suggests trying to approach claims or questions scientifically—creating a hypothesis, and finding and analyzing evidence that both supports and refutes that hypothesis before drawing conclusions. Often, he adds, people stop after finding supporting evidence, thus fueling polarization. Metaxas also cautions against being too trusting of any particular source, and of information that we take in through our senses—pointing out that memory, in particular, can be faulty. “If your data that you’re using your critical thinking to evaluate are already compromised because either your trusted sources have fooled you or your senses have fooled you, then it doesn’t matter how much logic you put in, essentially you will get out garbage,” Metaxas says. “The more people independently think about the question, the more likely it is we will arrive at the correct answer.”

Are News Sources Trustworthy?

Mustafaraj took a break from politics after TwitterTrails, but she felt compelled to dive back in when she saw social media being used once again to spread misinformation during the 2016 presidential election—but with more reach and higher stakes. “I decided that I have to go back to this because this problem of what is truth and what is not has now become even more dangerous,” she says. Her alarm stemmed not only from the current situation, but also from her experience living in Albania under a totalitarian regime. “I grew up in a society in which we, in a certain way, abolished the real truth, and we were asked to believe in a reality that was fake,” she explains, with the government controlling media, television, and just about everything else. “That marks you as a person, how easy it is to create an alternate version of reality.”

She began by interviewing people about how they decide if news sources are trustworthy. She discovered that many base their decision on whether Googling a news outlet brings up a “knowledge panel,” the rectangular box on the right side of the page that provides basic information about the outlet. The knowledge panel for the New York Times, for example, includes when it was founded, where it is headquartered, and who owns it. However, these knowledge panels can be problematic in a couple of ways. First, few local news sources have knowledge panels: Only a third of the 8,000 Mustafaraj tested did, generally because they lack Wikipedia pages, which provide the information for these panels. Additionally, untrustworthy sources may mimic these panels to trick readers. “What we wanted to do is look at how many local newspapers in the United States had good web presences, or when you search Google, did you understand that they were a legitimate local news source as opposed to a misinformation site,” explains Emma Lurie ’19, who worked with Mustafaraj on the project, and is now a Ph.D. candidate at the School of Information at the University of California, Berkeley.

Mustafaraj then shifted her focus to the root of the problem: Wikipedia. “Wikipedia is really important because in the whole web ecosystem, it’s now one of the most trusted sources of information,” Mustafaraj says. Not only is a Wikipedia page often a top Google result, and the basis for knowledge panels, but it is also used by virtual assistants such as Siri and Alexa. “There is nothing else for free out there that has the scale that Wikipedia has in terms of information, and this unfortunately has made Wikipedia itself a target of misinformation, because if you manipulate Wikipedia, you manipulate Google and all these other things,” Mustafaraj says.

Wikipedia also has a more fundamental issue: It is edited by regular people, not professionals, and those people tend to write about their interests—so there are detailed pages for every Star Trek episode, yet few pages for local news sources. Moreover, Wikipedia is plagued by gender bias—only 10 to 15 percent of editors are women—which translates into a gender imbalance in the entries themselves.

To begin tackling these issues, Mustafaraj tapped the Wellesley community for a Wiki-edit-a-thon, part of a larger effort at multiple colleges and universities called Newspapers on Wikipedia. After being trained to edit Wikipedia, students began creating knowledge panels and full Wikipedia pages for local news sources. Mustafaraj has continued teaching her students how to edit Wikipedia, and she edits the resource herself whenever she has time—she encourages anyone who might be interested to become an editor, and start contributing. “I feel like we need to be informed citizens who participate in shaping our information ecosystem—we cannot just leave it to the algorithms and to the companies,” Mustafaraj says. “We have to fight misinformation with the facts and our engagement with the truth.”

The Complications of Fact-Checking

When Emma Lurie wrote her honors thesis last year with Mustafaraj, she explored another possible course of action against misinformation: fact-checking. She focused on Google’s short-lived feature that used machine learning techniques to check the accuracy of a claim by matching it with other information on the subject. The idea behind the feature was simple—it would fact-check a claim from a news story by cross-referencing it with other articles on the topic, linking to sources that confirmed or refuted it. However, the execution proved complicated, as the feature struggled with how to match a claim to a relevant source in certain situations, such as when a claim was made with conditional language (“vaccines may cause autism”) or when a claim reported someone else’s incorrect statement (“Jenny McCarthy said, ‘Vaccines cause autism.’”). The feature was discontinued in 2018, but despite its shortcomings, Lurie remains optimistic. “I think it’s clear to me that there’s no one solution, there’s no one thing you can do to fight misinformation as an individual consumer or as a tech platform, but that doesn’t mean it can’t get better.”
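The two trouble cases named in this section, conditional language and reported speech, can be made concrete with a small Python sketch. It does not reflect how Google's feature worked: the `claim_flags` function and its word lists are hypothetical, and they only show why such sentences differ from a plain assertion that could be matched directly against a source.

```python
import re

# Illustrative lists only; real language is far more varied than this.
HEDGE_WORDS = {"may", "might", "could", "possibly"}
REPORTING_PATTERN = re.compile(r"\b(said|says|claimed|claims|stated)\b", re.IGNORECASE)


def claim_flags(sentence):
    """Mark whether a sentence hedges its claim or attributes it to someone."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return {
        "conditional": bool(words & HEDGE_WORDS),
        "reported_speech": bool(REPORTING_PATTERN.search(sentence)),
    }
```

Run on the two example sentences quoted in this section, the first is flagged as conditional and the second as reported speech. In either case the sentence does not assert the underlying claim outright, so a checker has to decide whether it is verifying the claim itself or the fact that someone made it.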

Social media platforms themselves also hinder the fight against misinformation. These platforms have enormous reach, and can be used for microtargeting, where specific groups are targeted with specific claims. Moreover, they are designed to funnel information into a slick, uniform format that can make it almost impossible to discern accurate sources from untrustworthy ones. “If you create an environment like this where everything looks the same and sounds the same, you create this confusion of what is the signal and what is the noise,” Mustafaraj explains. These challenges are only compounded by the fact that so much information on the internet is not entirely true or false, making it even harder to know what to believe. “The slew of borderline content that isn’t fully untrue and isn’t actually factual is much more concerning, much more pervasive, in our current media ecosystems,” Lurie says, and it is her biggest concern heading into the 2020 election.

For Mustafaraj and Metaxas, the current informational landscape also reveals a deeper issue. “It’s not just a matter of facts. It’s a matter of … your system of values, and how you use this system to interpret the facts in the world,” Mustafaraj says. “Looking for a technical solution to what is a cultural and societal division is going to be hard.”

So what can we as individuals do to counter biases stemming from our values when we encounter new information, or information that doesn’t mesh with our existing worldview? Metaxas thinks becoming more self-aware is a good place to start. “We need to understand ourselves, and not to think that the world is just either facts or lies. It is really the way you look at things that matters,” he says. “My ultimate goal is to try to figure out how I can help [myself], my students, the general public, understand why we believe what we believe. If we can achieve that, then it’s likely that we will not be fooled so easily.”

Catherine Caruso ’10 is a Boston-based science writer whose work has appeared in various publications, including Scientific American and MIT Technology Review.