See Also Network

Protocol

Introduction

News about the Mass Surveillance disclosures keeps the debate growing and developing, which was precisely the aim of the journalists and of Edward Snowden. The aim of this chapter is to identify the research issue and to delineate the survey area, starting from Wikipedia, which contains the points of view of both common people and experts and whose information is often revised and updated. This first step introduces the topic and allows an understanding of the boundary lines of the Mass Surveillance controversy. Who are the main characters of the debate? Which topics are involved? Which keywords characterize the issue? These are the questions this chapter wants to highlight.

Steps of the protocol

The first step of the dataset creation was the selection of five Wikipedia pages about the Mass Surveillance controversy, the issue of the whole analysis. After general research on Wikipedia, necessary to understand who and what is involved and to gain a basic knowledge of the issue, the pages were selected to cover the most different aspects of the argument. First of all there is the existence of the controversy itself, defined in the NSA warrantless surveillance (2001-07) page with this sentence: «The NSA warrantless surveillance controversy ("warrantless wiretapping") concerns surveillance of persons within the United States during the collection of allegedly foreign intelligence by the U.S. National Security Agency (NSA) as part of the touted war on terror». Then it was necessary to include a general page about the topic, such as Mass Surveillance, which examines it from the point of view and situation of different countries. Another important side of this theme is the world of whistleblower disclosures, of which Edward Snowden is the most famous and prominent figure; indeed, today as in the past, whistleblowers have revealed secret government and intelligence spying programs to the general public. On the other hand, the NSA and governments classify their actions as a “war against terrorism”, meant to prevent attacks and increase citizens' security; the Terrorist surveillance program page summarizes the beginning of modern mass surveillance, which started after the 9/11 attacks, a fundamental date for the birth of this debate. After identifying the five pages, it was necessary to extract the “See Also” links from each page, to start building the network of relationships. To extract the links from the HTML pages, an HTML link extractor (Extract Link Url) was used; but to show the context of the argument and return a significant overview, the network has to be as large as possible: the software that extracts further levels of “See Also” (that is, the “See Also” of each Wikipedia page listed under “See Also”, and so on) is Seealsology.
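The idea of following “See Also” links level by level can be illustrated with a minimal Python sketch. It is not the way Seealsology is implemented: the SEE_ALSO dictionary is a hypothetical, hand-made stand-in for the links that would really be read from Wikipedia, and the function name expand_see_also and its max_distance parameter are assumptions made only for this example.

```python
from collections import deque

# Hypothetical, hand-made "See Also" lists; real lists would come from Wikipedia.
SEE_ALSO = {
    "Mass surveillance": ["Surveillance state", "Edward Snowden"],
    "Edward Snowden": ["Whistleblower"],
    "Whistleblower": ["Mass surveillance"],
    "Surveillance state": [],
}


def expand_see_also(seeds, see_also, max_distance):
    """Breadth-first expansion: return the (source, target) edges found by
    following "See Also" links up to max_distance levels from the seeds."""
    edges = []
    distance = {page: 0 for page in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        if distance[page] >= max_distance:
            continue  # do not follow links beyond the chosen depth
        for target in see_also.get(page, []):
            edges.append((page, target))
            if target not in distance:
                # first time this page is seen: record its level and visit it later
                distance[target] = distance[page] + 1
                queue.append(target)
    return edges


edges = expand_see_also(["Mass surveillance"], SEE_ALSO, max_distance=2)
for source, target in edges:
    print(source, "->", target)
```

Each page is visited only once, so loops between pages (here Whistleblower pointing back to Mass surveillance) do not make the expansion run forever.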
The “Distance” parameter was set to three, which means extracting three levels of “See Also”, starting from a source that includes the five selected pages and their first level of “See Also”. Once obtained, the csv or gexf file is ready to be imported into Gephi for the network visualization. The network spatialization was done with the Force Atlas 2 algorithm; the InDegree value was used to rank the node sizes, while the nodes were clustered by Modularity class, a parameter that «measures how well a network decomposes into modular communities» (wiki.gephi.org). The network visualization makes it possible to discern different sub-networks (clusters) and to define the operating area in this wide and heterogeneous context.
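In this protocol the InDegree value is computed by Gephi itself; the short Python sketch below only illustrates what the measure counts (the number of “See Also” links pointing to a page) and what a simple edge list with Source and Target columns looks like. The EDGES list is hypothetical sample data, not the project's dataset, and the helper names in_degree and edges_to_csv are assumptions made for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical edge list (source page, target page); not the project's data.
EDGES = [
    ("Mass surveillance", "Edward Snowden"),
    ("NSA warrantless surveillance (2001-07)", "Edward Snowden"),
    ("Edward Snowden", "Whistleblower"),
]


def in_degree(edges):
    """Count, for every node, how many edges point to it."""
    counts = Counter(target for _, target in edges)
    for source, _ in edges:
        counts.setdefault(source, 0)  # pages with no incoming link get 0
    return dict(counts)


def edges_to_csv(edges):
    """Serialize the edges as CSV text with Source and Target header columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


degrees = in_degree(EDGES)
csv_text = edges_to_csv(EDGES)
print(degrees)
print(csv_text)
```

A page that many other pages list under “See Also” gets a high in-degree and is therefore drawn as a larger node, which is why this measure helps to spot the central topics of the controversy.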

Metadata

Timestamp:
20/11/2014 - 04/12/2014

Data source:
Wikipedia, Google

Tools:
Seealsology, Extract Link Url, Gephi