Distorted Reality: Developing Artificial Intelligence Approaches to Expose Biased News Coverage

News should provide an accurate and impartial overview of current events so that news consumers can form well-informed views that shape their engagement with societally relevant topics. There is abundant evidence, though, that this goal is virtually always missed, at least in part. One of the latest events to put a spotlight on the negative influence that news, the Web, and social media can exert on public opinion is the societal polarization in the US surrounding the 2020 presidential election. It peaked in a mob storming the US Capitol building, demonstrating how drastically slanted media coverage can impact political and societal processes. Topics like “fake news”, “echo chambers”, and “disinformation” increasingly dominate the debate about events like the 2016 and 2020 US presidential elections, the Brexit referendum, and the COVID-19 pandemic. This post provides insights into the role that biased news plays in this context and into our ongoing work in our interdisciplinary WIN Project on “Fake News and Collective Decision-Making.”

Since the 1950s, researchers in the social and behavioral sciences have investigated intentional and sustained media bias, i.e., systematic tendencies toward deliberately slanted or opinionated news coverage. Like most researchers, we usually abstain from referring to the phenomenon as “fake news”, as this term inadequately describes the larger problem of biased media coverage and increasingly carries a derogatory connotation. Donald Trump and his supporters, in particular, have frequently used it to delegitimize accurate news coverage with which they disagreed.

Instances of fake news coverage, i.e., deliberately false reporting, are quite rare. However, news is virtually never unbiased, with tendentious coverage arising from a host of factors, including news producers’ political, ideological, and economic interests. For example, investigations of the Brexit referendum show that British newspapers belonging to the media conglomerate owned by Rupert Murdoch expressed explicit and sustained support for the campaign to leave the EU. In the US, six corporations control 90% of the media. Rupert Murdoch’s media group is among them and, besides many other outlets, owns Fox News, which emphatically supported the presidential election campaigns of Donald Trump in 2016 and 2020. Recent, large-scale investigations of mass media reporting during these two election campaigns find conclusive evidence that the symbiotic relationship between Donald Trump’s campaign and the Fox News corporation was far more influential than the disinformation campaigns of Russian Internet trolls and other social media actors. It is not the first time that Fox News has been shown to exert a sizable influence on public opinion. A 2003 survey showed that Fox News viewers were the most misinformed about the Iraq war. Over 40% of viewers believed that weapons of mass destruction had been found in Iraq, i.e., believed the US government’s justification for the war, even after US troops had failed to find any such weapons.

News outlets are predominantly commercial enterprises and hence not immune to economic incentives. For decades, media outlets have been found to align news stories with the interests of their business partners, such as advertising clients, e.g., by not reporting on events and information that reflect negatively on these partners. Favorable reporting on tobacco consumption is a well-documented historical example of such bias. Another economically driven source of potential bias is the increasing pressure to reduce reporting costs due to the steady decline of revenues from paid journalism. Investigative journalism is more expensive than copyediting prepared press releases—a practice that is now very common among news outlets.


Fig. 1: Motives for and forms of media bias introduced in the news production process.

Tailoring stories to target audiences is another source of bias, and research has uncovered its influence on reporting across a wide variety of news outlets, including some of the most prominent and reputable organizations. Readers may, of course, switch to other news outlets if the current outlet’s reporting contradicts their beliefs and views. The tendency to search for, interpret, recall, and favor information that supports prior beliefs is a well-known psychological bias that tends to further solidify existing polarization in news outlets and their readerships. At the same time, individual journalists can also introduce bias into a story, e.g., to advance their careers. In Germany, the case of Claas Relotius, who exaggerated or invented information to make his stories more sensational, is a recent example of such bias.

Bias arises throughout the stages of the news production process shown in Figure 1. In the first stage, gathering, journalists need to select the events, and the facts about those events, that they report. Naturally, not all events can be covered, and some are more relevant to the target audience than others. Often, sensational stories yield more sales. Next, journalists need to select sources, e.g., press releases, other news articles, or studies, to be used when writing an article. Ultimately, journalists must decide which information from a source to include in or exclude from the article. This selection, called commission or omission, likewise affects which perspective is taken on the event. In the next phase, writing, journalists may use word choice and labeling to bias news. An example of different word choices is whether the article refers to “coalition forces” or “invading forces”. Similarly, an event, action, or attribute can be labeled positively, e.g., as a “simple but genius method”, or negatively, e.g., as an “unsophisticated approach”. In the last stage, editing, presentation choices, such as the placement of the item, the size allocated to it, and picture selection, influence an event’s perception. A large cover story receives more attention than a brief comment on page three. An emotional article picture and the chosen caption can significantly alter the perception of an event.

Scientific evidence suggests that modern information and communication technology, and particularly social media, have amplified the effects of biased news coverage but have not substantially changed its root causes and principal workings. Despite the rise of social media, news articles published by established media outlets remain the primary source of information on current events. However, today these traditional news items tend to be re-shared much more frequently within news consumers’ social networks. Despite the immense availability of news sources on the Web, most news consumers only consult a small subset of available news outlets. The reasons include information overload, language barriers, the consumers’ specific interests or habits, or simply the fact that others share articles from certain outlets within the consumers’ social network. The notion that novel actors, such as agents paid to lobby for a specific view or pop-up online media outlets independent of established mass media, drive the noticeable surge in disinformation is not supported by contemporary research. While these novel actors exist, the influence of established media outlets still outweighs theirs by far. For political topics, for example, researchers have observed a stronger tendency of social media users to reinforce their existing biases by surrounding themselves only with news and opinions close to their own, thereby isolating themselves in “echo chambers”. The automated filtering methods of social media platforms, which aim to tailor the content shown to a user according to the user’s assumed interests, tend to further strengthen these effects. The algorithms analyze users’ engagement with content and seek to expose them to similar content and topics, thereby creating “filter bubbles”.

These phenomena are well-researched; more recently, therefore, attention has increasingly turned to how biased, one-sided news coverage and its negative effects can be countered. Researchers from the social and behavioral sciences have studied media bias extensively for decades. Their work has produced suitable models to describe media bias and comprehensive methodologies, such as content analysis and frame analysis, to identify it. However, this research has traditionally relied on in-depth qualitative work or quantitative analyses featuring manual annotations. These approaches can be applied to analyze and document media bias in hindsight but offer little help in identifying it in real-time for current news, which would require automating the analysis. The website AllSides pursues an alternative strategy by collecting readers’ ratings of the political leaning of news outlets, which it aggregates into a media bias rating ranging from left-wing through center to right-wing publishers. To counter the effects of one-sided reporting, the website presents articles on the same topic published by outlets from the left, center, and right categories of its rating. Researchers have proposed a few approaches and systems to partially automate media bias identification, but none remain available today. In many cases, the underlying models of media bias were too simplistic, causing their results to provide no additional insights over the models and results of research in the social sciences.


Fig. 2: Experimental visualization of media bias. Note the different focus and tone of each article.

With our interdisciplinary team of computer science and social science researchers from the Universities of Wuppertal, Zurich, and Konstanz, we seek to improve this situation. We tackle the automated identification of media bias by incorporating the extensive expertise established in the social sciences into state-of-the-art artificial intelligence (AI) approaches. The goal is to create a website that collects and aggregates news articles from the Web, employs AI methods to identify media bias, and visualizes the bias-induced differences between news articles for readers. Figure 2 shows one of several possible bias visualizations for the envisioned system, which the team tested in a pre-study.

Realizing a system that automates media bias identification is a tough problem, as the AI methods need to learn two major capabilities that are challenging even for humans. First, they need to recognize commonalities and differences in content to identify potential bias by omission or commission of information or by the selective use of sources. Second, and even more complicated, they need to recognize subtle differences in language that can point to bias by word choice and labeling and, more generally, framing, i.e., which aspects an article highlights when reporting on a topic. Along the way, the AI methods need to master several sub-tasks, such as reliably recognizing coreferences, i.e., different mentions of the same entities within and across articles. An example is recognizing that “caravan of illegal immigrants”—“a marching group of immigrants”—“people seeking opportunities in the US” refer to the same group of people in different articles. The team developed an approach that analyzes a multitude of linguistic properties, syntactic relations, and contextual clues in a multi-step process to enable AI models to recognize such complex references.
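To illustrate the intuition behind linking such mentions across articles, the following sketch clusters mentions by the similarity of their surrounding context. It is a deliberately simplified, hypothetical illustration of our own making: the function names, the bag-of-words cosine similarity, and the threshold value are stand-ins, not the project’s actual multi-step approach, which relies on far richer linguistic and syntactic features.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    """Bag-of-words vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def link_mentions(mentions, contexts, threshold=0.3):
    """Greedily cluster mentions whose surrounding contexts are similar enough."""
    clusters = []
    for mention in mentions:
        vec = vectorize(contexts[mention])
        for cluster in clusters:
            # Compare against the joined contexts of the cluster so far.
            rep = vectorize(" ".join(contexts[m] for m in cluster))
            if cosine(vec, rep) >= threshold:
                cluster.append(mention)
                break
        else:
            clusters.append([mention])
    return clusters

# Toy example: three phrasings of the same group from different articles,
# plus an unrelated mention that should stay in its own cluster.
mentions = [
    "caravan of illegal immigrants",
    "a marching group of immigrants",
    "people seeking opportunities in the US",
    "the new tax bill",
]
contexts = {
    mentions[0]: "the caravan of illegal immigrants approached the southern border on foot",
    mentions[1]: "a marching group of immigrants approached the southern border",
    mentions[2]: "people seeking opportunities in the US walked toward the southern border",
    mentions[3]: "congress debated a new tax bill yesterday",
}
clusters = link_mentions(mentions, contexts)
```

With these toy contexts, the three migrant phrasings end up in one cluster while the tax-bill mention remains separate; real coreference resolution must of course handle far subtler cases than shared context words.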

One of the first steps towards enabling AI methods to recognize bias by word choice and labeling is creating the ground-truth datasets required for the two tasks mentioned previously. While current AI models achieve reliable natural language comprehension, training them requires large amounts of high-quality data. For this purpose, the team created a dataset that will be used to train an AI-based method for coreference resolution and is currently compiling a second dataset to train a method for identifying topic-independent frame types. The team will annotate frames such as economy, morality, and fairness at both the article and the sentence level. Using these datasets, the AI models can then learn to recognize the characteristics of difficult-to-detect coreferences (see the previous examples) and subtly biased word choice, as well as the resulting framing. Once this step is completed successfully, the team will investigate applying the approach to languages other than English.
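As a toy illustration of what sentence- and article-level frame labels look like, the sketch below scores sentences against a tiny hand-made cue lexicon and aggregates the sentence scores into an article-level label. This is purely illustrative and entirely our own invention: the cue words, the `FRAME_CUES` lexicon, and the simple aggregation are stand-ins for the transformer-based classifier the annotated dataset is actually meant to train.

```python
# Hypothetical cue lexicon for three of the frame types mentioned above.
FRAME_CUES = {
    "economy": {"cost", "jobs", "market", "taxes", "growth"},
    "morality": {"moral", "values", "duty", "faith", "wrong"},
    "fairness": {"fair", "equal", "rights", "discrimination", "justice"},
}

def frame_scores(sentence):
    """Count how many cue words of each frame appear in the sentence."""
    tokens = set(sentence.lower().replace(",", "").replace(".", "").split())
    return {frame: len(tokens & cues) for frame, cues in FRAME_CUES.items()}

def label_article(sentences):
    """Assign a frame label per sentence and aggregate an article-level label."""
    sentence_labels = []
    totals = {frame: 0 for frame in FRAME_CUES}
    for sentence in sentences:
        scores = frame_scores(sentence)
        best = max(scores, key=scores.get)
        sentence_labels.append(best if scores[best] > 0 else None)
        for frame, score in scores.items():
            totals[frame] += score
    article_label = max(totals, key=totals.get)
    return sentence_labels, article_label

labels, article = label_article([
    "The new policy will create jobs and boost growth.",
    "Critics say the measure denies equal rights.",
    "Market analysts expect lower taxes.",
])
```

Here the second sentence receives a fairness label while the article as a whole leans toward the economy frame; a trained model would, of course, learn such distinctions from the annotated examples rather than from a fixed word list.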

With a functional prototype in place, the project is now shifting focus to how the biases we recognize can best be communicated to news consumers. This is ultimately as important as detecting them reliably, and we are currently running a broad, systematic evaluation of how readers perceive different concrete visualization approaches.

This blog post has been adapted from an article originally published in BUWOutput Issue 02/2021, the Research Bulletin of Bergische Universität Wuppertal. It expresses the authors’ views only and not the institutions they are affiliated with.