GEONAMES FREQUENCY MAP

Hot spots

Introduction

The map shows the results of a geoname analysis based on text extracted from the first 200 Google results for the queries “file sharing” + effects, “file sharing” + consequences, piracy effects and piracy consequences. Specifically, it shows the frequency of each geoname, measured as the number of web pages that mention it. The geonames are also divided into three categories: universities, cities and countries.

How to read the visualization

As mentioned, the markers on the map are divided into three categories (countries, cities, universities), distinguished by a color code. The size of each marker reflects the number of web pages that mention that geoname.

The buttons filter the visualization by geoname category. Clicking on a marker shows the name of the corresponding geo entity.

How it has been done

All entities were extracted from the first 200 Google results; a second cleaning phase then reduced them to a list containing only geonames. This list was further refined: if a geoname appeared repeatedly in a single web page, the duplicates were eliminated, leaving at most one mention per page for each geoname. This prevents very long articles that repeat the same geoname many times from skewing the final result. The CSV file was imported into CartoDB; in map view, the visualization was configured with two different settings: marker size and color code. The first is tied to the “times” column of the dataset (frequency of the geoname), the second to the “category” column (type of geoname). An infowindow was added to show the geoname when a marker is clicked. The buttons also make it possible to filter the visualization by category.
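The per-page deduplication and counting step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tooling actually used for the project: the function names, the “geoname” column name and the sample pages are invented here, while the “times” and “category” columns follow the dataset description above.

```python
import csv
import io
from collections import Counter


def count_page_mentions(pages, categories):
    """Count, for each geoname, how many pages mention it at least once."""
    counts = Counter()
    for mentions in pages:
        # A set collapses repeated mentions within one page into a single one.
        counts.update(set(mentions))
    # Sort by descending frequency, then by name, for a stable row order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"geoname": name, "category": categories[name], "times": times}
        for name, times in ordered
    ]


def to_csv(rows):
    """Serialize the rows as CSV text ready to be uploaded to a mapping tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["geoname", "category", "times"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample input: one list of extracted geonames per web page.
pages = [
    ["Sweden", "Sweden", "Stockholm"],
    ["Sweden", "Harvard University"],
    ["Stockholm", "Stockholm", "Stockholm"],
]
# Hypothetical category lookup produced by the cleaning phase.
categories = {
    "Sweden": "country",
    "Stockholm": "city",
    "Harvard University": "university",
}

rows = count_page_mentions(pages, categories)
print(to_csv(rows))
```

In this toy input the third page repeats “Stockholm” three times, yet it adds only one to that geoname's “times” value, which is the effect the deduplication is meant to achieve.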

Findings

The geo entities mentioned in the web pages are spread all over the world, not just in Europe and the USA as one might expect. In particular, many universities are present: they are the institutions most concerned with the subject for security reasons, and they are also the ones studying it. Indeed, most of the articles on this topic are written by academics at American and English universities.

Also notable is the presence of countries like Sweden, which is highly relevant because of The Pirate Bay and the controversy surrounding it, and of the California area, where the major entertainment companies (the “majors”) are based.

Looking at the number of geonames in each category, university entities clearly outnumber the geonames of the other two categories. However, in terms of the number of mentions, the gap between university geonames and country geonames is greatly reduced.
This reflects the fact that the universities mentioned are numerous but each is named only a few times, whereas the countries mentioned are fewer but are named many times. This happens because universities are small entities scattered around the world whose voices are hardly heard in the debate; the authors of the articles therefore prefer to refer to large entities such as countries, which are more prominent.
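The two measures compared above (number of distinct geonames versus total mentions per category) can be computed from the dataset with a short aggregation. The sketch below is illustrative only: the rows are invented placeholders, not the project's data, and the function name is hypothetical.

```python
from collections import defaultdict

# Invented placeholder rows in the shape (geoname, category, times).
rows = [
    ("University A", "university", 1),
    ("University B", "university", 2),
    ("University C", "university", 1),
    ("University D", "university", 2),
    ("Country A", "country", 3),
    ("Country B", "country", 2),
]


def summarize(rows):
    """Return, per category, the number of distinct geonames and total mentions."""
    summary = defaultdict(lambda: {"geonames": 0, "mentions": 0})
    for _name, category, times in rows:
        summary[category]["geonames"] += 1      # one more distinct geoname
        summary[category]["mentions"] += times  # add its page-level frequency
    return dict(summary)


summary = summarize(rows)
for category, totals in sorted(summary.items()):
    print(category, totals["geonames"], totals["mentions"])
```

With these placeholder rows, universities have twice as many geonames as countries (4 against 2) but only slightly more mentions (6 against 5), which is the kind of narrowing gap described above.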

Metadata

Timestamp: 24/11/14 - 12/12/14

Data source: Google

Related Protocol

Download data (4MB)