SYSTEMATIC DISINFORMATION: THE SPREAD OF MISLEADING INFORMATION AS A COLLECTIVE DYNAMIC ON TWITTER
ISSN: 2763-8677

Disinformation is a worldwide problem and has been a key topic of scholarship in recent years. This paper contributes to the ongoing discussion on how disinformation spreads on social media. The study uses a mixed-methods approach (Social Network Analysis, Connected Concept Analysis, and Content Analysis) to analyze four political discussions on Twitter. The results show a structure of asymmetric polarization, in which one group (the one that supported Bolsonaro in the 2018 Brazilian elections) is strongly associated with disinformation spread. In addition, the study identifies a collective dynamic in disinformation spread, as the volume of disinformation fluctuates similarly across different types of users depending on the context of the discussion analyzed. Based on these results, the idea of "systematic disinformation" is discussed.


INTRODUCTION
The spread of disinformation is a worldwide problem. There is evidence of the influence of disinformation in political events such as the Brexit referendum (BASTOS; MERCEA, 2019) and elections in several countries, such as Brazil (RECUERO et al., 2020), the US (BENKLER et al., 2018), India (DAS; SCHROEDER, 2020), and European countries (LARSSON, 2019a; GIGLIETTO et al., 2020). Furthermore, in the context of the Covid-19 pandemic, disinformation played a key role in the emergence of the so-called infodemic (TANGCHAROENSATHIEN et al., 2020), especially in countries with a politically polarized context, such as Brazil (ROSSINI; KALOGEROPOULOS, 2021) and the US (CALVILLO et al., 2020).
In the context of elections, disinformation might influence voters' decisions (ALLCOTT; GENTZKOW, 2017); in the context of the pandemic, disinformation might lead to worse responses to contain the virus (ALLCOTT et al., 2020). Researchers have been tackling this issue by investigating why social media users share disinformation (HAN et al., 2020; USCINSKI et al., 2020) and how particular events might influence the volume of misleading content online (RECUERO et al., 2020; GREEN et al., 2021; HAUPT et al., 2021).
This study aims to contribute to this ongoing discussion by analyzing the prevalence of disinformation in four events and evaluating the role of different types of Twitter users in disinformation spread.

DISINFORMATION ON SOCIAL MEDIA
For the purpose of this study, disinformation is defined as false or distorted content that has the function of misleading others and is created and propagated to obtain economic or political benefits (FALLIS, 2015; BENKLER et al., 2018; IOSIFIDIS; NICOLI, 2020). Politically polarized contexts often contribute to disinformation spread. In particular, disinformation tends to thrive in networks characterized by asymmetric polarization (BENKLER et al., 2018). This concept refers to the existence of polarized groups in which one of the two groups is strongly associated with the spread of disinformation and hyperpartisanship. In general, researchers have found an association between disinformation and right-wing or far-right ideology (BENKLER et al., 2018; CALVILLO et al., 2020; RECUERO et al., 2020; KALOGEROPOULOS, 2021).
Another important element for the spread of disinformation online is the role of influencers (SOARES et al., 2018). Politicians and other influential figures often fuel the spread of disinformation by creating cascades of information (RECUERO; GRUZD, 2019; GRUZD; MAI, 2020). That is, in many cases the disinformation content only becomes popular after some influential figure shares it online. Very active users and automated accounts (bots) also play an important role in this process, as their actions increase the popularity and visibility of the content they share (BASTOS et al., 2013; FREELON; LOKOT, 2020; PAPAKYRIAKOPOULOS et al., 2020). These users are often involved in coordinated actions (GIGLIETTO et al., 2020).
The elements mentioned above are particularly important on Twitter, as political discussions on the platform tend to assume polarized structures with two dense groups weakly connected to each other (HIMELBOIM et al., 2017; RECUERO et al., 2017). Discussions on Twitter frequently emerge from the debate of one particular topic or important event, connecting a public of initially disconnected users (BRUNS; BURGESS, 2011). Therefore, the dynamics of information flow and disinformation spread tend to change depending on the discussion.

THE PRESENT STUDY
This study aims to contribute to the scholarship on disinformation spread on Twitter. In particular, this research seeks to understand how context and group dynamics might influence the spread of political disinformation on Twitter. To explore this issue, four political discussion networks are analyzed.
These discussions happened in the context of the 2018 Brazilian elections. This context is relevant for this study due to the highly polarized political environment of the elections and the influence of disinformation during the campaign (RECUERO et al., 2020). Three research questions guide the analysis:

RQ1: Are there differences in the volume of disinformation shared by opposite political groups?
This research question is related to the idea of asymmetric polarization (BENKLER et al., 2018). By exploring this question, this study aims to understand the influence of political ideology on the spread of disinformation, particularly in the context of elections.

RQ2: What is the role of different types of users in the spread of disinformation on Twitter?
This research question is associated with the discussion about the role of influencers on Twitter (SOARES et al., 2018) and in the spread of disinformation (PAPAKYRIAKOPOULOS et al., 2020; GRUZD; MAI, 2020). By looking at the role of different users on Twitter, this study contributes insights into how disinformation spreads on Twitter and which strategies might be useful to reduce its circulation.
RQ3: How do these elements (RQ1 and RQ2) change in different events?

Finally, this research question aims to explore how context influences the dynamics of political discussions on Twitter. In particular, it will explore the influence of context on the emergence of publics connected on Twitter by a particular discussion (BRUNS; BURGESS, 2011).

DATA COLLECTION
Data was collected using Social Feed Manager (PROM, 2016) to access the Twitter API. The query used to collect data for this study was the keyword "Bolsonaro", to retrieve tweets about Jair Bolsonaro, elected president in 2018. To explore different contexts and dynamics, four events were selected. The dataset for each event is composed of tweets containing "Bolsonaro" posted on one particular day (24 hours).
The events selected are as follows: Sep. 28, 2018, when Veja Magazine accused Bolsonaro of committing several crimes in the past; Oct. 18, 2018, when the mainstream media outlet Folha de S. Paulo accused Bolsonaro of benefiting from illegal help to spread false information about his adversary Fernando Haddad on social media; Oct. 27, 2018, the day before Election Day; and Oct. 28, 2018, Election Day. The decision to analyze these four events is due to the different contexts associated with each discussion.
The first two events (Sep. 28 and Oct. 18) are related to accusations against Bolsonaro. The other two are related to Election Day, as the day before the election is key to mobilizing voters, and the Election Day dataset also contains discussions about the results.
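The per-event selection described above (keyword match plus a 24-hour window) can be sketched in Python. This is a minimal illustration, not Social Feed Manager's actual pipeline; the `text` and `created_at` field names and the sample tweets are invented for the example:

```python
def collect_event(tweets, keyword, day):
    """Select tweets that mention a keyword and were posted on a given day.

    tweets: iterable of dicts with hypothetical "text" and "created_at"
            (ISO-8601 string, e.g. "2018-09-28T10:00:00Z") fields.
    day:    date string such as "2018-09-28" (a 24-hour window).
    """
    kw = keyword.lower()
    return [t for t in tweets
            if kw in t["text"].lower() and t["created_at"].startswith(day)]

# Toy example: build the dataset for the Veja story (Sep. 28, 2018).
sample = [
    {"text": "Veja acusa Bolsonaro", "created_at": "2018-09-28T10:00:00Z"},
    {"text": "Unrelated tweet",      "created_at": "2018-09-28T11:00:00Z"},
    {"text": "Bolsonaro responde",   "created_at": "2018-10-18T09:00:00Z"},
]
event = collect_event(sample, "Bolsonaro", "2018-09-28")
```

Only the first sample tweet matches both the keyword and the date, so `event` contains a single record.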

IDENTIFYING POLITICAL GROUPS: SOCIAL NETWORK ANALYSIS
For the analysis of this study, three datasets were created for each event: (1) a dataset of retweets; (2) a dataset of tweets containing links; (3) a dataset of replies. Initially, Social Network Analysis (WASSERMAN; FAUST, 1994) was used to explore these datasets (12 in total). Table 1 provides a breakdown of these datasets. To identify political groups, a modularity algorithm (BLONDEL et al., 2008) was applied to the networks of retweets. In these networks, nodes are Twitter users and edges represent retweets. For the visualization of the networks, Force Atlas 2, a force-directed layout algorithm, was used, and different colors were used to identify the clusters (Figures 1-4). As the objective was to focus on general aspects of the discussions, clusters containing fewer than 5% of the nodes were excluded.
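The quantity that the Louvain algorithm (BLONDEL et al., 2008) optimizes is Newman modularity. A small pure-Python sketch can illustrate it on a toy retweet network; this is an invented example (node names and edges are hypothetical), not the paper's actual data or tooling:

```python
def modularity(edges, partition):
    """Newman modularity Q for an undirected graph.

    edges: list of (u, v) pairs; partition: dict node -> community id.
    Q = (fraction of edges inside communities) - (expected fraction
    under a random model that preserves node degrees).
    """
    m = len(edges)
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    inside = sum(1 for u, v in edges if partition[u] == partition[v]) / m
    expected = sum(
        degree[u] * degree[v]
        for u in degree for v in degree
        if partition[u] == partition[v]
    ) / (2 * m) ** 2
    return inside - expected

# Toy retweet network: two tightly knit triangles joined by one bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"),
         ("c", "x")]
two_groups = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1, "z": 1}
one_group = {n: 0 for n in "abcxyz"}
# The two-group partition scores higher, which is why a modularity
# optimizer like Louvain recovers the two clusters.
```

Here `modularity(edges, two_groups)` is about 0.357 while the single-community partition scores 0, mirroring how the algorithm separates the polarized groups in the retweet networks.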

Figures 1-4: Retweet networks for the four events. Note: original images (author).
Using Larsson's (2019b) typology, three networks (Sep. 28, Oct. 18, and Oct. 28) are composed of a blue zone containing two clusters (blue and light blue) and one green "coherent" cluster. The other network is composed of a blue zone and a green zone (green and dark green clusters). In Larsson's typology, "fuzzy" zones are (parts of) networks composed of interconnected clusters, in which it is hard to identify the borders between them. A "coherent" cluster, on the other hand, is separated from the rest of the network, with clearly defined borders. Based on this discussion, the analysis in this paper considers the existence of two groups in the networks: one comprising the blue zones and another comprising the green cluster/zone.
The networks based on tweets containing URLs are bipartite networks. In these cases, nodes are Twitter users and URLs, and edges connect a user to the URL they tweeted (Figures 5-8). In reply networks, nodes are users and edges represent a reply to another user (Figures 9-12). In both URL and reply networks, colors are based on the modularity of the RT network. That is, the groups previously identified are carried over to the URL and reply networks. This decision was made to explore how these networks mirror the polarized groups of the RT network. Additionally, the RT networks were used in a second step to identify the groups' ideologies (see next section). Therefore, using the groups initially identified in the RT networks provides better accuracy in identifying political groups in the URL and reply networks than running new modularity algorithms would. The networks of tweets containing links created structures similar to the RT networks, with two polarized groups. This indicates that most of the links shared in each group are different. Reply networks, on the other hand, have a more distributed structure, indicating that users reply to users from the same group but also cross group barriers to reply to tweets from users of the other group.

IDENTIFYING GROUPS' IDEOLOGIES: CONNECTED CONCEPT ANALYSIS
For this step, only tweets containing keywords related to the campaign and election day were used: "vot*" (vote, voting, etc.) and "elei*" (election, electoral, etc.).
Connected Concept Analysis is useful to identify the most frequent concepts in corpora by counting the frequency of words, aggregating concepts, and creating visualizations of the connections between the concepts. For this analysis, two corpora were created for each event, one containing tweets from the blue zone, and another containing tweets from the green cluster/zone. This method was used to identify the main ideology of each group.

IDENTIFYING DISINFORMATION: CONTENT ANALYSIS
The final step of the analysis was to identify disinformation in the tweets posted by each group. A sample of tweets and links was created for this analysis (n=3,051). The first step in creating this sample was to select the 50 most retweeted tweets from each group in each RT network (n=400). The second step was to select ten replies to each of the 20 users with the highest indegree (those who received the most replies) in each reply network (n=800). The third step was to select a sample of up to 20 replies from each of the 20 users with the highest outdegree (the most active in replying to other users) in each reply network (n=1,451). The fourth and final step was to select the 50 most shared URLs by each group in each URL network (n=400). The aim of this selection was to analyze different contexts of messages in political discussions.
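Each sampling step above is a top-k selection over a different metric (retweet count, indegree, outdegree, share count). A minimal sketch of the first step, assuming tweets are stored as dictionaries with invented `id`, `group`, and `retweets` fields (the real data and field names may differ):

```python
def top_k(items, key, k):
    """Return the k items with the highest value of key(item)."""
    return sorted(items, key=key, reverse=True)[:k]

# Hypothetical tweet records; field names are illustrative only.
tweets = [
    {"id": 1, "group": "pro",  "retweets": 120},
    {"id": 2, "group": "pro",  "retweets": 45},
    {"id": 3, "group": "anti", "retweets": 300},
]

# Step 1 of the sampling: the most retweeted tweets per group
# (the paper uses k=50 per group per network).
pro_sample = top_k([t for t in tweets if t["group"] == "pro"],
                   key=lambda t: t["retweets"], k=50)
```

The same `top_k` helper would serve the indegree, outdegree, and URL steps, changing only the key function.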
Content Analysis (KRIPPENDORFF, 2013) was used to analyze this sample of tweets and URLs.
One single category was annotated in the dataset: the presence of disinformation. A tweet/URL was considered disinformation when it contained distorted content, manipulated media (such as photos and videos), fabricated information (something completely made up), reproduced conspiracy theories, and/or framed some information to lead to inaccurate conclusions. Content from the mainstream media and fact-checking outlets was used to help the classification. The coding was performed by a single analyst (the author of this paper). Although this is a clear limitation, the analyst has experience in classifying datasets of disinformation and achieved substantial reliability when coding disinformation in tweets with other independent coders (for example, Kappa = .887; RECUERO; SOARES, 2021).
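For readers unfamiliar with the reliability statistic cited above, Cohen's Kappa corrects raw agreement for agreement expected by chance. A small pure-Python sketch, using invented binary disinformation codes (1 = disinformation, 0 = not) rather than the paper's actual annotations:

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels on the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is chance agreement from each coder's label frequencies.
    (Undefined when p_e == 1, i.e. both coders always use one label.)
    """
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    labels = set(coder_a) | set(coder_b)
    expected = sum(
        (coder_a.count(lab) / n) * (coder_b.count(lab) / n) for lab in labels
    )
    return (observed - expected) / (1 - expected)

# Invented example codes: perfect agreement yields kappa = 1.0;
# agreement at chance level yields kappa = 0.0.
perfect = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 0])
chance = cohens_kappa([1, 0, 1, 0], [1, 0, 0, 1])
```

A Kappa of .887, as reported from the author's earlier work, sits well above the common thresholds for substantial agreement.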

RESULTS
The description of the results is divided into two comparisons. The first focuses on the general prevalence of disinformation in tweets/URLs of the anti-Bolsonaro and pro-Bolsonaro groups. The second focuses on the role of different types of users and contexts in the spread of disinformation and on how collective actions might be involved in disinformation sharing.
The first analysis shows a context of asymmetric polarization (BENKLER et al., 2018), as disinformation is much more prevalent within the pro-Bolsonaro group (Figure 24). The same is true for every single type of message (most retweeted messages, replies to, replies from, and most shared URLs; Figure 25). This means that the media diet is generally different within each group. Moreover, this result indicates that the pro-Bolsonaro group is likely to be more radicalized, as it tends to engage with hyperpartisan and disinformation content. This result is in line with other studies that also identified a higher prevalence of disinformation among right-wing and far-right groups (BENKLER et al., 2018; LARSSON, 2019; CALVILLO et al., 2020; RECUERO et al., 2020; KALOGEROPOULOS, 2021). This result is related to the first RQ of this study (the role of political alignment in disinformation sharing).

In general, the overall prevalence of disinformation is similar across the different contexts of pro-Bolsonaro messages, ranging from 40% for the most shared URLs to 56% for the most retweeted messages. In the anti-Bolsonaro group, replies to the users with the highest indegree in the reply networks registered a higher prevalence of disinformation (12%) compared to the other contexts (between 5% and 6%). The results related to the pro-Bolsonaro group are particularly relevant because they indicate that the spread of disinformation is not associated with one particular type of social media user. Rather, a higher prevalence of disinformation was identified in the pro-Bolsonaro group in all discussions (Figure 24) and in posts by different types of users (Figure 25), indicating a collective dynamic. This result is related to the second RQ of this study (the role of different users in disinformation spread).

This collective action is further explored by looking at the prevalence of disinformation by type of users in each discussion. This is important to explore how context might also affect the volume of disinformation in political discussions on Twitter. While there is no clear pattern for the anti-Bolsonaro group (Figure 26), the volume of disinformation is much higher for every context in the pro-Bolsonaro group on September 28 and October 18, compared to the other two networks (Figure 27). The two contexts in which disinformation peaked among pro-Bolsonaro users (Sep. 28 and Oct. 18) are also discussions in which Bolsonaro had to defend himself from accusations in breaking news stories.
Therefore, these two discussions share a context of discursive struggle (SOARES; ): that is, a discussion in which two opposing discourses fight for hegemony over public opinion. In particular, a common element in both discussions was that pro-Bolsonaro users used disinformation to support their discourse. This result indicates that particular events might influence the volume of disinformation on Twitter, in line with the findings of other studies (RECUERO et al., 2020; HAUPT et al., 2021).
What is most important in this analysis is that the volume of disinformation increases in all contexts (for all the different types of messages) on September 28 and October 18, compared to October 27 and October 28. Therefore, disinformation spread is not only influenced by context; it is also a systematic dynamic. That is, disinformation spread is not the result of the actions of a few users, but rather a dynamic influenced by collective action. This finding is associated with the third RQ of this study (how the volume of disinformation changes depending on the context).

In addition to this analysis, it is also necessary to explore whether the users in the four discussions are the same. This would contribute to the notion of a system of disinformation, in which this type of content is mobilized by similar users in different contexts. To explore this issue, a ratio of unique to total users was created to identify the prevalence of users participating in only one discussion. The score for this ratio is obtained by dividing the number of unique users that participated in at least one discussion by the total number of users across all four discussions (those that participated in more than one discussion appear more than once in the total). The score ranges from 0.25 (as there were four networks) to 1, where 1 means that all users were unique (that is, participated only once).
The ratio was 0.71 for the anti-Bolsonaro group and 0.63 for the pro-Bolsonaro group. This indicates that users within the pro-Bolsonaro group participated more often in two or more discussions.
That is, there was a core of users within the pro-Bolsonaro group, possibly involved in the increase or decrease of disinformation depending on the context. Therefore, this result also contributes to the argument of a systematic dynamic of disinformation.
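The unique-to-total ratio described above can be sketched as follows (a minimal sketch; the user ids are invented and each discussion is represented as a set of participants):

```python
def participation_ratio(discussions):
    """Ratio of unique users to total (with-repetition) participations.

    discussions: list of sets of user ids, one set per discussion.
    Ranges from 1/len(discussions) (everyone participates in every
    discussion) to 1.0 (every user appears in exactly one discussion).
    """
    unique = len(set().union(*discussions))
    total = sum(len(d) for d in discussions)
    return unique / total

# Toy examples with four discussions (user ids are hypothetical):
same_core = [{"u1", "u2"}] * 4            # the same two users everywhere
disjoint = [{"a"}, {"b"}, {"c"}, {"d"}]   # no repeat participants
```

With four discussions, `same_core` yields the floor of 0.25 and `disjoint` yields 1.0; the observed scores of 0.71 (anti-Bolsonaro) and 0.63 (pro-Bolsonaro) fall between these extremes, with the lower pro-Bolsonaro score reflecting more repeat participation.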

DISCUSSION AND IMPLICATIONS
The main contribution of this study was the identification of collective action in the spread of disinformation, defined here as a systematic dynamic of disinformation spread on Twitter. Many studies focus on particular types of users or content on social media to analyze disinformation spread. The spread of systematic disinformation also has implications for strategies to reduce the spread of disinformation online. Many social media platforms focus on removing content from prominent users to reduce disinformation spread. However, Sanderson et al. (2021) identified that some blocked or flagged messages end up circulating more than others. For example, blocked or removed messages were often shared as screenshots on other social media platforms. This is evidence supporting the hypothesis of systematic disinformation: even when platforms remove content from influencers, other users find ways to share their message and continue to spread disinformation.
This is not to say that it is not important to adopt strategies focused on influencers to reduce disinformation spread, as studies have identified that cascades of disinformation often depend on high-profile users sharing the content (RECUERO; GRUZD, 2019; GRUZD; MAI, 2020). Nevertheless, after a high-profile user shares disinformation on social media, the organic spread of this content depends on the collective action of several other users, whether by retweeting the disinformation message or by promoting a particular hashtag. Therefore, social media platforms ought to focus on the systematic spread of disinformation, either "inauthentic" (GIGLIETTO et al., 2020) or "organic", when looking for ways to reduce problematic content online.
Although the objective of this study was to contribute to the ongoing discussion about disinformation spread online by exploring the idea of systematic disinformation as a collective dynamic on Twitter, it also has several limitations. First, it focuses on a very particular moment, as the networks analyzed in this paper are related to the 2018 Brazilian elections. Future studies might explore different contexts to further understand the role of collective action in disinformation spread. Second, only four discussions were explored in this paper. Future studies might find ways to increase the number of discussions analyzed to be able to measure how different contexts might increase or decrease the spread of disinformation on Twitter. Third, the sample used in the content analysis was created based on convenience, that is, it was not a random sample. Future studies might include random samples in their analysis to compare the volume of disinformation in different discussions/groups. Fourth, the content analysis was performed by only one coder. Although the coder has experience in coding for disinformation, this might lead to biased results and errors. A more robust analysis might be performed by future studies to reduce this type of bias.

FINAL REMARKS
This study focused on four political discussions on Twitter during the 2018 Brazilian elections to further understand the dynamics of disinformation spread. Key findings include: (1) the identification of asymmetric polarization, as pro-Bolsonaro users were more heavily involved in disinformation spread; (2) the identification of how different contexts influence the volume of disinformation, as discussions based on discursive struggles increased the prevalence of disinformation for all types of messages analyzed; and (3) the identification of collective action to spread disinformation, as the prevalence of disinformation in messages from different types of users in the pro-Bolsonaro group followed a similar pattern depending on the discussion. Based on these findings, the main contribution of this study is the idea of "systematic disinformation", used to describe the collective action in disinformation spread on social media.