Digital democracy will put these campaign professionals out of work. New research in computer science, sociology and political science shows that data extracted from social media platforms yield accurate measurements of public opinion. It turns out that what people say on Twitter or Facebook is a very good indicator of how they will vote.
How good? In a paper to be presented Monday, co-authors Joseph DiGrazia, Karissa McKelvey, Johan Bollen and I show that Twitter discussions are an unusually good predictor of U.S. House elections. Using a massive archive of billions of randomly sampled tweets stored at Indiana University, we extracted 542,969 tweets that mention a Democratic or Republican candidate for Congress in 2010. For each congressional district, we computed the percentage of those tweets that mentioned each candidate. We found a strong correlation between a candidate’s “tweet share” and the final two-party vote share, especially when we account for a district’s economic, racial and gender profile. For 2010, our Twitter data predicted the winner in 404 out of 435 competitive races.
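For readers who want to see the arithmetic, the tweet-share calculation can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the code used in the paper: the function name `tweet_shares`, the input layout (one district-and-candidate pair per mention) and the toy data are all hypothetical.

```python
from collections import Counter

def tweet_shares(mentions):
    """Compute each candidate's share of mentions within a district.

    `mentions` is an iterable of (district, candidate) pairs, one pair per
    tweet that mentions a candidate (an assumed input layout).
    Returns {district: {candidate: percent_of_district_mentions}}.
    """
    counts = {}
    for district, candidate in mentions:
        # Tally mentions per candidate, grouped by district.
        counts.setdefault(district, Counter())[candidate] += 1

    shares = {}
    for district, by_candidate in counts.items():
        total = sum(by_candidate.values())
        # A candidate's share is their mentions divided by all mentions in the district.
        shares[district] = {
            name: 100.0 * n / total for name, n in by_candidate.items()
        }
    return shares

# Hypothetical toy data: three mentions of one candidate, one of the other.
sample = [("District 1", "Candidate A")] * 3 + [("District 1", "Candidate B")]
shares = tweet_shares(sample)
print(shares["District 1"])
```

In this toy district, Candidate A is mentioned in three of four tweets, so the sketch assigns that candidate a 75 percent tweet share. That is the figure one would then compare with the candidate's share of the two-party vote.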
Why does this happen? We believe that Twitter and other social media reflect the underlying trend in a political race that goes beyond a district’s fundamental geographic and demographic composition. If people must talk about a candidate, even in negative ways, it is a signal that the candidate is on the verge of victory. The attention given to winners creates a situation in which all publicity is good publicity.
This finding is remarkable because it doesn’t depend on exactly what people say or who says it. We measured only the total discussion and estimated each candidate’s share. It is this relative level of discussion that matters for tracking public opinion in electoral contests. Furthermore, social media data mimic what polls measure. For example, in Ohio’s 3rd Congressional District, we found that Republican Mike Turner got 65.4 percent of the tweets in his district. In the final election, he got 68.1 percent of the two-party vote. The tweet prediction was off by 2.7 percentage points, a figure that is within the margin of error of a typical poll.
This finding has profound implications for the democratic process. Many nations remain mired in poverty and do not have the infrastructure required for extensive polling. Furthermore, these nations often have governments that are suspicious of polling and try to suppress it. For these reasons, it is very hard to monitor their elections. In contrast, as long as citizens have access to the Internet, they can talk about their views in a less-restricted manner. The “grassroots” buzz found in social media can be studied, and it will reveal how elections are conducted and whether the state is respecting human rights. And as with U.S. elections, even if the people who use social media are not completely representative of the public, the amount of attention paid to an issue is an indicator of what is happening in society. Important events generate scrutiny that can be measured and studied.