Gab has always been very clear that we are not an anarchist website or an “unregulated” zone. Our rules are grounded in American speech and privacy laws. They are no different than the rules Twitter, Facebook, and others used for a decade to get as big as they are today before starting to censor based on political viewpoint.
A new academic paper is out today that starts with the false premise that Gab is “completely unregulated” and claims that we market ourselves this way. Given that the methodology used is also highly flawed and the initial premise is categorically and provably false, why should anyone in their right mind take this paper seriously? All these “scholars” needed to do was ask us or simply read our publicly available terms of service, which spell out our rules and regulations explicitly.
The data pulled for the study has not been verified by us, and it comes from the earliest days of Gab, when we were still in invite-only beta testing. Our community has since grown to over one million users from around the world and from all walks of life.
Flawed Data Collection
“Many crawled ids did not have a message, perhaps because the messages were deleted. We found a large number of missing messages, with peaks in the number of missing messages around December 2016, August 2017, and September 2017. This may suggest that Gab removed certain accounts or specific messages.”
First, the academics admit that they were scraping our site. We actively stop scrapers on a daily basis and have measures in place to detect and remove them. They suggest that much of their data did not have any messages and then speculate that Gab removed the content.
They fail to realize that users frequently delete their own content and were able to remove it in bulk themselves during the timeframe of the data they analyzed. People can also delete their accounts entirely on their own. Many have done so and have started fresh accounts, or started clean on their existing account by deleting all of their old posts. This alone distorted their dataset immensely from the start, not to mention other flaws that may have occurred during their scraping.
Little to No Information About Twitter Data
“We compared properties of the Gab dataset to Twitter, perhaps the social media platform closest to Gab in design and function. We sampled Twitter data from the same time span as the Gab data. For efficiency, we sampled tweets from the first day of each month within this time period from the one percent Twitter API, giving us 86,568,694 tweets from 25,837,947 accounts.
“To obtain up to date estimates of retweets for these messages we redownloaded 6,322,088 tweets from this sample from the Twitter API. We measured retweets using the retweet_status.retweet metadata field. To compare Gab’s user network with Twitter, we obtained follower counts of each unique Twitter user in our dataset, using the user.followers metadata field. Since our sample includes only users who tweeted at least once, our Twitter data collection shares this bias with Gab.”
We are told nothing about the Twitter data. Where were the users in this data set located? How old were the accounts? Twitter has users all over the world, with 80% of them outside of the US, while Gab’s users tend to be concentrated in the US, UK, and Canada. Earlier in the paper they claim that they gathered 15 million posts from Gab and an “equivalent sample from Twitter.” 15 million posts from 146,000 Gab users and 86 million tweets from nearly 26 million Twitter users are nowhere near “equivalent.” Right from the start, this data is flawed and the data pulled from Twitter is purposely ambiguous.
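To put numbers on how far apart the two samples are, here is a quick back-of-the-envelope check using only the figures quoted above. The Gab numbers are the rounded ones from this post (15 million posts, 146,000 users), so treat the result as a rough estimate rather than an exact statistic; the function name is ours, for illustration only.

```python
# Figures as quoted in this post. The Gab values are rounded approximations.
gab_posts = 15_000_000
gab_users = 146_000
twitter_posts = 86_568_694
twitter_users = 25_837_947


def posts_per_account(posts: int, accounts: int) -> float:
    """Average number of sampled posts per account in a dataset."""
    return posts / accounts


gab_rate = posts_per_account(gab_posts, gab_users)
twitter_rate = posts_per_account(twitter_posts, twitter_users)

print(f"Gab sample:     {gab_rate:.1f} posts per account")
print(f"Twitter sample: {twitter_rate:.1f} posts per account")
print(f"Gab accounts are sampled about {gab_rate / twitter_rate:.0f}x more deeply")
```

On these figures the Gab sample holds roughly a hundred posts per account, while the Twitter sample holds only about three. One dataset follows a small group of accounts in depth and the other skims a huge one, which is why we do not consider them “equivalent.”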
Their first claim is that Gab users have less conversation than Twitter users. They draw this conclusion based on the total number of @mentions across both data sets. What they fail to tell the reader is that on Gab’s legacy system, which is where this data comes from, users were not @mentioned when they were being replied to.
Instead, a reply worked much like a reply on Facebook. You would simply reply to the original poster and they would receive the reply in their notifications without their username being mentioned in the post. Comparing the @mentions on Gab and Twitter using this data set is like comparing apples and oranges.
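A toy example shows why this matters. The posts and the `reply_to` field below are entirely hypothetical (this is not Gab’s actual schema or real data); the sketch only illustrates how an @mention count can miss replies that are tracked outside the post body.

```python
import re

# Hypothetical posts. "reply_to" is an illustrative field name, not Gab's schema.
posts = [
    {"author": "alice", "body": "Good morning, Gab!", "reply_to": None},
    {"author": "bob", "body": "Morning! How is the weather?", "reply_to": "alice"},
    {"author": "carol", "body": "@alice did you see this?", "reply_to": None},
    {"author": "alice", "body": "Sunny here.", "reply_to": "bob"},
]

MENTION = re.compile(r"@\w+")


def count_by_mentions(items):
    """Count posts whose body contains at least one @mention."""
    return sum(1 for p in items if MENTION.search(p["body"]))


def count_by_mentions_or_replies(items):
    """Count posts that either @mention someone or are marked as a reply."""
    return sum(
        1 for p in items
        if p["reply_to"] is not None or MENTION.search(p["body"])
    )


print("conversational posts by @mentions only:", count_by_mentions(posts))
print("conversational posts including replies:", count_by_mentions_or_replies(posts))
```

In this made-up thread, three of the four posts are part of a conversation, but counting @mentions alone finds just one of them.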
Flawed data, flawed methodology, flawed conclusion.
Their next brilliant conclusion is that Gab users talk about politics. What a shocking revelation. Millions of people are being banned, shadowbanned, demonetized, and thrown in timeout for talking about politics on other platforms. It’s only natural that they come to Gab, the home of free speech, in order to do so. No one is being banned on Twitter for talking about the Kardashians or sportsball.
A Smear of Alt Media
“Gab users’ shares are more politically homogeneous & contain more state-sponsored content
“We extracted the link appearing in each Gab and Twitter message in our collection, and extracted the domain name, e.g., from www.yahoo.com/news/MAGA-news-story we extracted yahoo.com. We then characterized each of these domains as to whether or not they were a news site, and the nature of the site based on information about the site contained in Wikipedia.
“Of the 200 most commonly shared domains on Gab, 74 percent are news Web sites. Most of these were far-right news websites according to Wikipedia. Web sites associated with conspiracy theories and the state-sponsored messaging (e.g., Russian government sites) were also observed. In comparison, the types of links shared on Twitter are more diverse. Table 3 shows part of the top 50 domains shared on Gab and their percentages of all URLs. We highlight those domains that are common to Gab but relatively uncommon on Twitter. These include links to other Gab posts (gab.ai) and many right-wing news Web sites.”
Academic “scholars” citing Wikipedia to determine whether a website is a news source, what could possibly go wrong? Breaking news: Gab users share YouTube links and alternative media links. Somehow this means that “state-sponsored” content is being shared. This is a clear attempt to smear alternative media websites and push the Russian hysteria hoax that has been circling in mainstream media circles for years now. Unless YouTube is now “state-sponsored media,” this conclusion is a joke.
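For readers wondering what the domain extraction step in the quoted passage involves, here is a minimal sketch. The paper does not say how the authors implemented it, so this is our own assumption of a naive approach (keep the last two labels of the hostname), and `naive_domain` is a name we made up for illustration.

```python
from urllib.parse import urlparse


def naive_domain(link: str) -> str:
    """Return the last two labels of a link's hostname, e.g. 'yahoo.com'.

    Assumption: a deliberately naive rule, not the paper's documented method.
    It mishandles multi-part suffixes such as 'co.uk'.
    """
    # urlparse only fills in the host when a scheme or leading '//' is present.
    if "://" not in link and not link.startswith("//"):
        link = "//" + link
    host = urlparse(link).hostname or ""
    return ".".join(host.split(".")[-2:])


for link in (
    "www.yahoo.com/news/MAGA-news-story",
    "https://gab.ai/some/post",
    "https://www.bbc.co.uk/news",
):
    print(link, "->", naive_domain(link))
```

The first link reduces to yahoo.com, matching the example in the paper, while the last one collapses to co.uk, which shows how fragile a simple rule can be. Either way, this step is purely mechanical. Deciding what kind of site a domain is remains a judgment call, and that is the part the authors handed to Wikipedia.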
Falsely Claiming That German Gab Users Are Bots
These “scholars” go on to claim that the Germans on Gab must be bots because they all speak the same language and follow one another. No, seriously, that’s their conclusion.
“This suggests that German-speaking accounts are structured as a cohesive unit, largely following one another. This may be driven by a small sub-group seeking to create tighter in-group connections or may suggest the presence of a coordinated network of (possibly automated) accounts.”
These ivory tower clowns made up their own conclusions and invented “facts” to fit those conclusions based on flawed data and specious methodology. This study was about smearing Gab, smearing alt media, and tying both Gab and alt media to “muh Russians” based on false premises, flawed data sets, and hasty conclusions. The paper should be retracted and First Monday should be ashamed for publishing it.
That’s not scholarship. It’s a smear.