World wide wiki



The second analysis explores another aspect of our Wikipedia environment: it locates our topic, the right to be forgotten, by mapping the Wikipedia pages that contain a link to the Right to be forgotten page.

We carried out a combined quantitative and qualitative investigation of these pages, relating the size of each page in bytes to the content of its text. Our aim was to verify whether the pages shared common topics and which page was the most up to date.

This relation becomes particularly interesting once we add information about the date and authorship of each link: it gives us an overview of the closest topics and of how the discussion was carried on before the topic had its own page.

Step of the protocol - create a dataset

This research is divided into three parts. Each part relies on a main dataset, built by extracting data from Wikipedia.

We opened a browser and searched for the Wikipedia page "Right to be forgotten". From that page we followed the links to the corresponding pages in Italian, French, Spanish and German (these languages were chosen because they are the most relevant, given the events that occurred in the respective countries).
Then we clicked "View history" on each page and, using the software Kimono, extracted these elements for every revision: date, user, and size of the page in bytes at that moment.

After that, we imported the dataset into an Excel sheet and, using a formula, calculated the difference in bytes for each edit.
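The byte-difference step can also be reproduced outside Excel. The following is only a minimal sketch: the sample rows, the user names and the function name byte_differences are hypothetical, and it assumes that the page size before the first recorded revision is zero.

```python
from datetime import date

# Hypothetical sample of scraped rows:
# (date of the revision, user, page size in bytes after the edit).
revisions = [
    (date(2014, 5, 13), "UserA", 1200),
    (date(2014, 5, 14), "UserB", 1850),
    (date(2014, 5, 20), "UserA", 1790),
]


def byte_differences(rows):
    """Return each row extended with the change in page size caused by the edit."""
    ordered = sorted(rows, key=lambda r: r[0])  # oldest revision first
    result = []
    previous_size = 0  # assumption: the page did not exist before the first revision
    for day, user, size in ordered:
        result.append((day, user, size, size - previous_size))
        previous_size = size
    return result


edits = byte_differences(revisions)
for day, user, size, diff in edits:
    print(day.isoformat(), user, size, diff)
```

A negative difference means that the edit removed text from the page.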

As a result, we had a dataset from which the next three graphs can be generated.

Step of the protocol - Wiki history events

The first step was to generate the graph that we called "Wiki history events", posted on the page Wiki evolution.

We went to the Wikipedia pages about the "Right to be forgotten" in our five languages.
Then we used the "View History" page to follow the evolution of the page in each language. During the analysis we compared the different versions and took note of the most important events.
Finally, we looked for connections and similarities between the most relevant cases.
In the end, we assembled the result in Illustrator (a single point for national facts and a double point for common international events).

Step of the protocol - Wiki history flow

Using the same dataset, we created the second graph, called "Wiki history flow", posted on the same page, Wiki evolution.

We removed the rows in which the edit was smaller than 50 bytes.
After that, we generated a pivot table in Excel, with the date on the x axis and the users on the y axis.
Then we created a stacked area chart.
In the end, we integrated the result in Illustrator, using a different color for each language.
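The filtering and pivot steps can be sketched in a few lines of code. This is an illustration, not the Excel procedure itself: the sample rows and the function name pivot_by_date_and_user are hypothetical, and two points that the text leaves open are treated as assumptions, namely that the 50-byte threshold applies to the absolute size of an edit and that the pivot sums the absolute bytes changed per date and user.

```python
from collections import defaultdict

# Hypothetical rows: (date as ISO string, user, byte difference of the edit).
edits = [
    ("2014-05-13", "UserA", 1200),
    ("2014-05-13", "UserB", 30),
    ("2014-05-14", "UserB", 650),
    ("2014-05-14", "UserA", -60),
    ("2014-05-14", "UserB", 100),
]

MIN_BYTES = 50  # threshold taken from the protocol


def pivot_by_date_and_user(rows, min_bytes=MIN_BYTES):
    """Drop small edits, then sum the absolute bytes changed per date and user."""
    # Assumption: the threshold is applied to the absolute size of the edit.
    kept = [r for r in rows if abs(r[2]) >= min_bytes]
    table = defaultdict(lambda: defaultdict(int))
    for day, user, diff in kept:
        # Assumption: the pivot aggregates the absolute byte change.
        table[day][user] += abs(diff)
    return {day: dict(users) for day, users in table.items()}


pivot = pivot_by_date_and_user(edits)
for day in sorted(pivot):
    print(day, pivot[day])
```

Each date then provides one column of the stacked area chart, with one layer per user.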

Step of the protocol - Wiki editors

For each page we opened the History View and started the tool “Revision history statistics”, from which we extracted a list of the main authors (only those who contributed more than 50 bytes of text) and recorded them in a datasheet.

Back on the five pages in the different languages, for each of them we opened “What links here” from the left menu, obtaining a list of the Wikipedia pages that contain a link to the main topic page. From this list we discarded the pages that are not real articles (user pages, discussions, etc.). Then, for each remaining page, we opened the History View and from there started the tool “Wikiblame”, searching for the article revision in which the term “right to be forgotten” first appeared. In a separate dataset we wrote down the date, the author of the edit and whether a link was present, since the revision might contain only a discussion of the topic, without a link.
If so, we ran the Wikiblame tool again, this time giving as input: “[[right to be forgotten]]”.
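The two searches described above can be illustrated on a small example. This is a sketch under stated assumptions: the revision texts, the users and the helper first_revision_matching are hypothetical, and the linear scan only stands in for the Wikiblame search without reproducing how the tool works internally.

```python
import re

# Hypothetical revision history of one linking page, oldest revision first.
revisions = [
    {"date": "2012-01-10", "user": "UserA",
     "text": "Article about data protection."},
    {"date": "2013-03-02", "user": "UserB",
     "text": "Courts discussed the right to be forgotten."},
    {"date": "2014-05-15", "user": "UserC",
     "text": "See the [[right to be forgotten]] ruling."},
]

TERM = "right to be forgotten"
# Wiki link to the main page, optionally with a display label after "|".
LINK = re.compile(r"\[\[\s*right to be forgotten\s*(\|[^\]]*)?\]\]", re.IGNORECASE)


def first_revision_matching(revs, predicate):
    """Return the oldest revision whose text satisfies the predicate, or None."""
    for rev in revs:
        if predicate(rev["text"]):
            return rev
    return None


# First search: the plain term. Second search: the link syntax.
first_mention = first_revision_matching(revisions, lambda t: TERM in t.lower())
first_link = first_revision_matching(revisions, lambda t: LINK.search(t) is not None)

record = {
    "mention_date": first_mention["date"],
    "mention_user": first_mention["user"],
    "mention_is_link": LINK.search(first_mention["text"]) is not None,
    "link_date": first_link["date"] if first_link else None,
}
print(record)
```

In this sample the term first appears as plain text, so the second search is needed to date the link.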

By matching the two datasets we created a visualization showing on which date the term “right to be forgotten” first appeared in each page linking to the Right to be forgotten page, and whether it was a link to the main page or part of a paragraph.


26/11/2014 - 3/12/2014

Data source:

Excel, Kimono, Revision history statistics, Wikiblame