Newspapers' API Analysis

Spotting the Pirates

Introduction

The graph lists the 10 most discussed trends per year, from 2002 to 2014. These trends refer to articles from The New York Times and The Guardian, and they are calculated from the number of articles in which the query "file sharing" occurs.

How to read the visualization

The dataset is indexed by year and shows the 10 most quoted trends, displayed in ranked order. A trend's position and year can be read by checking the corresponding row and column. Moreover, hovering over a single value highlights the matching trends, showing how a word recurs over the years.
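
The row-and-column reading can be sketched in a few lines of Python. This is only an illustration: the `ranking` table, its placeholder entries and the helper `positions_by_year` are hypothetical and are not taken from the project's data or code.

```python
# Hypothetical ranking table: year -> trends ordered from first to last position.
# The entries are placeholders, not values from the actual dataset.
ranking = {
    2002: ["trend A", "trend B", "trend C"],
    2003: ["trend B", "trend A", "trend D"],
    2004: ["trend D", "trend C", "trend B"],
}


def positions_by_year(ranking, trend):
    """Return {year: 1-based position} for every year in which the trend appears."""
    found = {}
    for year, trends in sorted(ranking.items()):
        if trend in trends:
            found[year] = trends.index(trend) + 1
    return found


print(positions_by_year(ranking, "trend B"))  # {2002: 2, 2003: 1, 2004: 3}
```

Looking up one trend across all years is roughly what the hover highlight shows visually: where the same word sits in each yearly column.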

How it was done

First, the research query was defined. Including "internet piracy" proved misleading because of its overlap with maritime piracy, so it was excluded and only the query "file sharing" was used. For each newspaper, 500 articles were collected, and the pertinence of each article was checked manually.
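
The text does not say how the articles were retrieved. As a minimal sketch, assuming they came from the newspapers' public search APIs, the paged requests for 500 articles per newspaper could be built as follows. The endpoints, parameter names and page sizes are assumptions recalled from the public documentation of the NYT Article Search API and the Guardian Content API, and should be checked against the current documentation. The code only builds URLs and sends no request.

```python
from urllib.parse import urlencode

QUERY = '"file sharing"'  # quoted so that the two words are searched as a phrase
ARTICLES_PER_NEWSPAPER = 500

# Assumed public endpoints; verify them against the current API documentation.
NYT_ENDPOINT = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
GUARDIAN_ENDPOINT = "https://content.guardianapis.com/search"


def nyt_urls(api_key, per_page=10):
    """Build paged NYT URLs (assumes 10 results per page, pages counted from 0)."""
    pages = ARTICLES_PER_NEWSPAPER // per_page
    return [
        NYT_ENDPOINT + "?" + urlencode({"q": QUERY, "page": page, "api-key": api_key})
        for page in range(pages)
    ]


def guardian_urls(api_key, page_size=50):
    """Build paged Guardian URLs (assumes pages counted from 1)."""
    pages = ARTICLES_PER_NEWSPAPER // page_size
    return [
        GUARDIAN_ENDPOINT + "?" + urlencode(
            {"q": QUERY, "page": page, "page-size": page_size, "api-key": api_key}
        )
        for page in range(1, pages + 1)
    ]


print(len(nyt_urls("YOUR_KEY")), len(guardian_urls("YOUR_KEY")))  # 50 10
```

The manual pertinence check described above would still be needed on the downloaded articles, since a phrase match alone does not guarantee that an article is about file sharing.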

The article archives of The New York Times and The Guardian were chosen in order to understand the differences between two newspapers with different cultures and political positions. However, the research showed that those differences are very small, so the two corpora were merged in order to display a graph that represents a sort of global view of the trends per year.

Example of the differences between The Guardian and The New York Times for the trend "Napster"
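
The merge of the two corpora and the yearly ranking can be sketched as follows. This is a minimal illustration under stated assumptions: the per-newspaper counts are placeholders, and the helpers `merge_counts` and `top_trends` are hypothetical names, not the project's actual code.

```python
from collections import Counter


def merge_counts(*corpora):
    """Sum per-year trend counts from several newspapers into one corpus."""
    merged = {}
    for corpus in corpora:
        for year, counts in corpus.items():
            merged.setdefault(year, Counter()).update(counts)
    return merged


def top_trends(merged, n=10):
    """Return {year: [trend, ...]} with the n most frequent trends per year."""
    return {
        year: [trend for trend, _ in counts.most_common(n)]
        for year, counts in sorted(merged.items())
    }


# Placeholder article counts per trend and year (not the real data).
nyt = {2002: {"trend A": 5, "trend B": 2}, 2003: {"trend B": 4}}
guardian = {2002: {"trend A": 1, "trend B": 6}, 2003: {"trend C": 3}}

merged = merge_counts(nyt, guardian)
print(top_trends(merged, n=2))
# {2002: ['trend B', 'trend A'], 2003: ['trend B', 'trend C']}
```

Keeping the per-newspaper counts separate until the last step makes it possible to compare the two sources, as in the "Napster" example, before merging them.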

Findings

The visualization shows a result that is interesting to analyse. First of all, the first position is always occupied by an illegal platform related to copyright infringement. This happens because opinions are led by events, which periodically define, open and close the discussion. It is therefore easy to understand why, for instance, "The Pirate Bay" appears in first position only in 2009 (the year of the first trial), even though the platform has been active since 2003.

Furthermore, it is possible to identify the majors and the artists right below the illegal file-sharing platforms, which shows the tight relation between the defenders and the violators of copyright (e.g. RIAA-Kazaa).



Metadata

Timestamp:
16/12/2014

Data source:
New York Times, The Guardian

Related Protocol

Download data (4MB)