A Comparative Algorithm Audit of Conspiracies on the Net: Results

26 Apr 2024

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) Aleksandra Urman (corresponding author), Department of Informatics, University of Zurich, Switzerland;

(2) Mykola Makhortykh, Institute of Communication and Media Studies, University of Bern, Switzerland;

(3) Roberto Ulloa, GESIS - Leibniz-Institut für Sozialwissenschaften, Germany;

(4) Juhi Kulshrestha, Department of Politics and Public Administration, University of Konstanz, Germany.

Results

Prevalence of conspiratorial information in web search results

We present the results of our analysis on the prevalence of information with different stances towards conspiracy theories (RQ1) per engine across all locations, collection rounds and queries in Figure 1, and disaggregated per query-engine-location in Figures 2-7 (Figures 2-4 correspond to the observations from March 2021; Figures 5-7 - from May 2021).
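The aggregation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the records, the stance labels and the `stance_shares` helper are all hypothetical, and the sketch only assumes that each search result has been coded with an engine, query, location, wave and stance.

```python
from collections import Counter, defaultdict

# Hypothetical coded search results: (engine, query, location, wave, stance).
# These rows are invented for illustration and are not the study's data.
records = [
    ("Google", "flat earth", "Ohio", "March 2021", "debunks"),
    ("Google", "flat earth", "Ohio", "March 2021", "no mention"),
    ("Yandex", "flat earth", "Ohio", "March 2021", "promotes"),
    ("Yandex", "flat earth", "Ohio", "March 2021", "promotes"),
    ("Yandex", "qanon", "UK", "May 2021", "debunks"),
    ("Yandex", "qanon", "UK", "May 2021", "promotes"),
]

FIELDS = ("engine", "query", "location", "wave", "stance")


def stance_shares(rows, group_by):
    """Return {group key: {stance: share}}; shares within a group sum to 1."""
    idx = [FIELDS.index(name) for name in group_by]
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in idx)
        counts[key][row[-1]] += 1  # the last field is the stance label
    return {
        key: {stance: n / sum(c.values()) for stance, n in c.items()}
        for key, c in counts.items()
    }


# Aggregated per engine (the level of detail shown in Figure 1).
per_engine = stance_shares(records, ["engine"])

# Disaggregated per engine-query-location-wave (the level shown in Figures 2-7).
per_cell = stance_shares(records, ["engine", "query", "location", "wave"])
```

Using the same function for both levels keeps the aggregated and disaggregated shares consistent; only the grouping key changes.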

Location-based and temporal differences

The shares of content with different stances towards conspiracy theories vary widely across SEs and queries, but less so across different locations and waves. These observations indicate that our findings are not highly dependent on a specific location or time frame - at least when it comes to English-speaking locations and periods of observation that are relatively close to each other.

Engine-based differences

The only search engine that consistently did not return links promoting conspiracy theories is Google (see Figure 1), except for small shares of content promoting conspiracies in relation to the “new world order” query in May for the two US locations (Figures 5 and 6). The share of conspiracy-debunking content on Google was similar to that of other SEs for most queries. At the same time, Google had the highest share of content that did not mention conspiracy theories at all; thus, its users were less likely to be exposed to any conspiracy-related information.

The search engine with the highest proportion of conspiracy-promoting content was Yandex, while the remaining three engines - Bing, Yahoo and DuckDuckGo - fell somewhere between Google and Yandex. Among them, Yahoo had lower shares of content without mentions of conspiracy theories than the others but the highest proportion of conspiracy-debunking content. There were minor variations between the three, with Yahoo having slightly lower shares of conspiratorial content. The similarities between Bing, Yahoo and DuckDuckGo are, perhaps, to be expected since Yahoo and DuckDuckGo are partially powered by Bing’s search algorithms and results.

Figure 1. Prevalence of results with different stances towards conspiracy theories per engine across all queries, locations and periods.

Query-based differences

We observed evident discrepancies in the shares of content with different stances towards conspiracy theories across queries. Results for “9/11” contained the lowest shares of conspiracy-promoting content across all engines except Yahoo where, depending on the location and collection round, the share of conspiratorial content - e.g., that 9/11 was an “inside job” - reached close to 50%. No search engine returned results debunking conspiracy theories related to 9/11, which can arguably be attributed to a data void on this topic - i.e., the available content promotes either the official version or the conspiratorial one, with the former stating facts but not debunking the latter.

The queries “george soros” and “new world order” are the other two queries for which relatively high shares of content that does not mention conspiracy theories were returned. For both queries, we observed comparatively low (<20% for “new world order” and <30% for “george soros”) shares of content promoting conspiracy theories on all engines except Yandex. There is also a difference between the results returned for these two queries. For “george soros”, more results debunking conspiracy theories were returned, with no results that simply mentioned the theories. For “new world order”, we observed few debunking results but relatively high proportions of results that simply mentioned the theory - i.e., that there is an emerging secret totalitarian world government - without a clear stance towards it.

Finally, the queries that attracted the highest shares of conspiracy-related content were “illuminati”, “flat earth” and “qanon”. For “qanon” we retrieved mostly content debunking the theory - with the exception of Yandex. Bing, DuckDuckGo and Yahoo contained comparatively low (up to 25%) proportions of conspiracy-promoting content. There was no content without any mentions of the conspiracy theory on any of the engines, which is perhaps to be expected given that “qanon” is a rather unambiguous term. With “illuminati” and “flat earth” the shares of conspiracy-promoting content were high on all SEs (except Google). Yandex, as with other queries, contained the highest shares of conspiracy-promoting links, whereas Bing, DuckDuckGo and Yahoo, depending on the location and the wave, displayed between 25% and 50% of conspiracy-promoting results. While with “flat earth” information debunking the conspiracy - i.e., listing arguments why the Earth is not flat - was commonly displayed across all SEs, for “illuminati” conspiracy-debunking results were much less prevalent (except for Google).

Some of our expectations outlined in the “Methodology” section were not supported by the analysis. Notably, the search results regarding “illuminati” (a subject surrounded by many conspiracy theories) contained more conspiracy-promoting information than the results for “new world order” (query naming a particular conspiracy theory). Qualitatively, we have established that the non-conspiracy-related results displayed for “new world order” were referring to this concept in the context of international relations (e.g., Slaughter’s (2012) article in Foreign Affairs that was featured in Google’s search results). In the case of “illuminati”, we attribute the high shares of conspiracy-promoting content to the wide spread of the related conspiracy theory and frequent references to it in popular culture (e.g., Dan Brown’s books).

Figure 2. Prevalence of content with different stances towards conspiracy theories per engine and query, California server, March 2021.

Figure 3. Prevalence of content with different stances towards conspiracy theories per engine and query, Ohio server, March 2021.

Figure 4. Prevalence of content with different stances towards conspiracy theories per engine and query, UK server, March 2021.

Figure 5. Prevalence of content with different stances towards conspiracy theories per engine and query, California server, May 2021.

Figure 6. Prevalence of content with different stances towards conspiracy theories per engine and query, Ohio server, May 2021.

Figure 7. Prevalence of content with different stances towards conspiracy theories per engine and query, UK server, May 2021.

Prioritization of different types of sources in search rankings

In relation to RQ2, we first present the results of the analysis of the shares of sources of different types across all queries-locations-waves (Figure 8), and then disaggregated per engine-query-location-wave combination (Figures 9-11 correspond to March 2021; Figures 12-14 correspond to May 2021). Finally, in Figure 15 we present the share of content with different stances per source type, aggregated across all SEs and queries for each of the two data collection rounds.

Location-based and temporal differences

Similar to the observations on the search results’ stances towards conspiracy theories, we did not find major differences across locations and collection rounds. The only obvious difference relates to the prevalence of scientific sources: their share was higher in the UK than in the two other locations, and it was higher in March 2021 than in May 2021.

Engine-based differences

Our observations regarding the source types partially echo the findings on the content with different stances towards conspiracy theories reported above. Specifically, Google is the only search engine that did not return links to conspiracy-dedicated websites, while Yandex had the highest share of such links (Figure 8). Additionally, Google’s results contained the biggest proportion of links to scientific sources, while those were absent on Yandex. In turn, Yandex was the engine with the highest share of links to social media in the results. Bing, Yahoo and DuckDuckGo contained similar shares of links to sources of different types, with some differences for specific queries (e.g., Yahoo had fewer results from the media in response to the “9/11” query than the other engines).

Figure 8. Prevalence of different source types per engine across all queries, locations and periods.

Query-based differences

For all queries except “flat earth” the outputs of SEs contained rather high shares of links to media and reference websites (e.g., Wikipedia pages), which is in line with what previous research found for queries related to COVID-19 (Makhortykh et al., 2020) or elections in the US (Kulshrestha et al., 2019; Urman et al., 2021). There were, however, some query-based discrepancies. For instance, for “qanon” the results had the highest shares of links to media, perhaps due to the intense media coverage of QAnon at the time of the data collection. The variations in the distribution of conspiracy websites across queries were similar to what we observed with regard to conspiracy stances: “flat earth”-related results had the highest shares of conspiracy websites, followed by “illuminati” and “qanon”, while for the other three queries conspiracy websites were less prevalent.

Figure 9. Prevalence of different source types per engine and query, California server, March 2021.

Figure 10. Prevalence of different source types per engine and query, Ohio server, March 2021.

Figure 11. Prevalence of different source types per engine and query, UK server, March 2021.

Figure 12. Prevalence of different source types per engine and query, California server, May 2021.

Figure 13. Prevalence of different source types per engine and query, Ohio server, May 2021.

Figure 14. Prevalence of different source types per engine and query, UK server, May 2021.

Source types vs stances towards conspiracy theories

With regard to the shares of content with different stances towards conspiracy theories across source types (Figure 15), the observations within the two periods are rather similar, once again indicating the robustness of the findings against (short-term) temporal changes. As expected, all conspiracy-dedicated websites contained only information promoting conspiracy theories. Additionally, conspiratorial information was often present on social media, which aligns with existing research on the spread of conspiratorial content online (e.g., Bessi et al., 2015; Stano, 2020). Among all other source types, the share of conspiracy-promoting content was the highest among websites in the “other” category. This included, for example, links to the webpages of individual books dedicated to specific conspiracy theories on the Amazon webstore. There were very few media and reference sources that promoted conspiracy theories; instead, both source types predominantly either mentioned conspiracy theories or provided information unrelated to conspiracies. Media websites also contained high proportions of conspiracy-debunking content, but the highest share of such information came from scientific websites.
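The cross-tabulation of source types against stances can be sketched as below. The rows, the category labels and the `crosstab_shares` helper are hypothetical and chosen only for illustration; the sketch assumes each result carries a wave, a source type and a stance code, and it normalises within each source type so that the shares for one source type sum to 1.

```python
from collections import Counter

# Hypothetical coded results: (wave, source_type, stance).
# Invented for illustration; these are not the study's data.
rows = [
    ("March 2021", "conspiracy website", "promotes"),
    ("March 2021", "conspiracy website", "promotes"),
    ("March 2021", "scientific", "debunks"),
    ("March 2021", "scientific", "debunks"),
    ("March 2021", "scientific", "debunks"),
    ("March 2021", "scientific", "promotes"),
    ("May 2021", "media", "mentions"),
    ("May 2021", "media", "debunks"),
]


def crosstab_shares(rows, wave):
    """For one wave, return {source_type: {stance: share}}, normalised per source type."""
    counts = {}
    for row_wave, source, stance in rows:
        if row_wave == wave:
            counts.setdefault(source, Counter())[stance] += 1
    return {
        source: {stance: n / sum(c.values()) for stance, n in c.items()}
        for source, c in counts.items()
    }


march = crosstab_shares(rows, "March 2021")

# Print one line per source type with its stance shares as percentages.
for source, shares in sorted(march.items()):
    cells = ", ".join(f"{s}: {v:.0%}" for s, v in sorted(shares.items()))
    print(f"{source:20s} {cells}")
```

Computing the table separately for each wave mirrors the per-wave aggregation described for Figure 15 and makes the two periods directly comparable.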

The minor share of conspiracy theory-promoting information coming from scientific sources (see March 2021 observations in Figure 15) corresponds to one link to a peer-reviewed article from an academic journal in which the author postulated that International Relations scholars should question the official version regarding the September 11, 2001 attacks and backed this statement with the arguments coming from the “9/11 truth movement”[3]. For ethical reasons, we refrain from citing that particular paper here so as not to promote the conspiracy further.

Figure 15. Prevalence of content with different stances towards conspiracy theories per source type (aggregated across all search queries and engines for each wave).


[3] “9/11 truth movement” refers to loosely connected individuals and groups that support conspiracy theories questioning the official version of the September 11, 2001 attacks (see more on the movement and related theories in Wood and Douglas, 2013).