The encyclopedic point of view

Wikipedia network: anonymity and its neighbourhood


The first exploration of the online anonymity theme looks at how it is presented from an encyclopedic point of view, through Wikipedia. This first part of the research helped to answer some general questions about the theme.

How is online anonymity presented as a controversy? How is it linked to other controversial web topics? Is it a clear and well-defined topic, or is it closely connected to and mixed with other themes? How is anonymity linked to freedom of expression? And is freedom of expression itself well positioned within the world of internet controversies?

How to read the visualization

The visualization is a large network of bubbles and connections. Each bubble represents an article of the corpus. The size of each bubble is determined by the number of connections it has with other bubbles, so the biggest bubbles are the most important ones in the network. Clusters are distinguished by color. A cluster is a group of similar articles; each cluster has its own color and its own position in the network. The more isolated a cluster is and the farther it sits from the center, the more clearly it is defined and separated from the rest of the topics. Conversely, clusters that are very close to each other are difficult to separate because of the interconnections among their articles.
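The sizing rule described above can be sketched in a few lines. This is only an illustration, assuming that "connections" means the degree of a node (incoming and outgoing links counted together) and that sizes are scaled linearly; the edge list, the function names and the size range are hypothetical and are not taken from the project data.

```python
from collections import Counter

# Hypothetical links between articles; not the project's actual corpus.
edges = [
    ("Anonymity", "Pseudonymity"),
    ("Anonymity", "Anonymous P2P"),
    ("Anonymity", "Internet censorship"),
    ("Internet censorship", "Mass surveillance"),
]

def degrees(edge_list):
    """Count how many connections each article has."""
    counts = Counter()
    for source, target in edge_list:
        counts[source] += 1
        counts[target] += 1
    return counts

def bubble_sizes(edge_list, min_size=5.0, max_size=40.0):
    """Map each article's degree linearly onto [min_size, max_size]."""
    deg = degrees(edge_list)
    lo, hi = min(deg.values()), max(deg.values())
    span = (hi - lo) or 1  # avoid division by zero when all degrees are equal
    return {
        node: min_size + (d - lo) / span * (max_size - min_size)
        for node, d in deg.items()
    }

sizes = bubble_sizes(edges)
```

In this toy example "Anonymity" has the most connections and therefore gets the largest bubble, while articles with a single link get the smallest one.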

The network is interactive, so the user can easily navigate it. Four concentric levels can be distinguished. The first level is the central one, a macro-cluster containing different topics. The second sits immediately around the center and presents specific articles (such as the Anonymous group and cyber-crime) that are not strictly related to the anonymity theme. The third and fourth consist of many smaller articles far from the center, forming very specific and well-delimited clusters that are not directly linked to the main theme.

The second visualization is a close-up of the same network, shown to make immediately visible which clusters are linked to the general theme of online anonymity. From this graph it is possible to see which articles, and therefore which clusters, are analyzed in this research: anonymity and freedom of expression.

How it was done

The network graph was built starting from five seeds (Wikipedia pages) selected as the most relevant for studying the anonymity theme. The links in the “See also” section of each Wikipedia seed were then collected, producing a list of URLs. This step was handled with an open source tool named “seealsology”. The list of links was then extended by following the “See also” sections up to the third level.
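The expansion from seeds to the third level can be pictured as a breadth-first crawl. The sketch below is not how seealsology is implemented; it only illustrates the idea. The `SEE_ALSO` table and the `get_see_also` function are hypothetical stand-ins for fetching the “See also” section of a real page, and it assumes that the seeds count as level 0, so pages up to three hops away are included.

```python
from collections import deque

# Hypothetical stand-in for the "See also" section of each page.
SEE_ALSO = {
    "Anonymity": ["Pseudonymity", "Anonymous P2P"],
    "Pseudonymity": ["Anonymity", "Pseudonymization"],
    "Anonymous P2P": ["Internet censorship"],
    "Internet censorship": ["Mass surveillance"],
    "Mass surveillance": ["Privacy"],
}

def get_see_also(title):
    """Hypothetical lookup; a real crawl would fetch and parse the page."""
    return SEE_ALSO.get(title, [])

def crawl(seeds, max_depth=3):
    """Breadth-first expansion of "See also" links up to max_depth levels."""
    depth = {seed: 0 for seed in seeds}
    edges = []
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        if depth[page] >= max_depth:
            continue  # pages on the last level are kept but not expanded
        for linked in get_see_also(page):
            edges.append((page, linked))
            if linked not in depth:
                depth[linked] = depth[page] + 1
                queue.append(linked)
    return depth, edges

depth, edges = crawl(["Anonymity"])
```

With a single seed, "Mass surveillance" is reached at the third level and is kept, but its own “See also” links are not followed, so "Privacy" never enters the graph. The resulting `edges` list is the kind of link list the text describes.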

The final graph was created with Gephi (open source software for processing complex networks). The network was then edited graphically, coloring the clusters and highlighting the relevant ones. The final interactive visualization was created with a Gephi library that lets the user navigate the network.
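One simple way to hand a link list to Gephi is an edge table in CSV form, since Gephi's spreadsheet import reads edges from `Source` and `Target` columns. The sketch below assumes the links are available as source/target pairs; the sample edges and the `edges_to_csv` helper are hypothetical, and the original workflow may well have used a different file format.

```python
import csv
import io

# Hypothetical edges; in practice these would come from the crawl.
edges = [
    ("Anonymity", "Pseudonymity"),
    ("Anonymity", "Internet censorship"),
]

def edges_to_csv(edge_list):
    """Serialize edges as CSV text with Source and Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edge_list)
    return buffer.getvalue()

csv_text = edges_to_csv(edges)
```

Writing `csv_text` to a file gives an edge table that can be loaded into Gephi, where layout, clustering and coloring are then applied.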


The cluster regarding anonymity is the yellow one. It sits at the center of the first level and contains important articles (some general, such as “anonymity” and “pseudonymization”, and some technical, such as “anonymous p2p”). Very close to that cluster is “internet censorship”, an important theme related to the anonymity debate. “Mass surveillance” is another cluster closely linked to the general theme, and it is the second possible topic through which to study the controversy. This visualization was helpful for understanding the general situation around the anonymity theme and for choosing a particular subject on which to focus the research.


Timestamp: 20/11/2014

Data source: Wikipedia, Gephi, Sigma.js
