research question

2.1—How much did the definition of Hate Speech on Wikipedia change over time?

[Figure: “Hate Speech” page creation across all languages and relative number of edits per month. Timeline from January 2001 to November 2018. Second panel: page length in characters, on a scale from 0 to 60,000.]

Legend: no info available; no edits; peak of edits; page length; Wikipedia creation; law enacted.

Laws against hate speech: May 25, 2016, Japan, Hate Speech Act; May 30, 2016, Council of Europe, Code of Conduct on Countering Illegal Hate Speech Online; June 30, 2017, Germany, NetzDG law.

Total edits per page: Basque 1; Burmese 7; Serbo-Croatian 7; Slovene 8; Icelandic 9; Swedish 10; Latin 11; Greek 15; Slovak 15; Turkish 15; Croatian 16; Simple English 17; Thai 17; Persian 18; Urdu 20; Afrikaans 22; Macedonian 24; Estonian 32; Norwegian 36; Ukrainian 37; Chinese 38; Hungarian 40; Indonesian 45; Korean 48; Portuguese 58; Italian 59; Arabic 61; Serbian 75; Russian 75; Finnish 90; Spanish 98; Dutch 118; Hebrew 132; German 169; French 201; Polish 267; Japanese 1,176; English 2,275.

Page length in characters, as listed in the figure: 31,842; 38,128; 5,259; 19,786; 2,231; 12,503; 4,655; 8,596; 3,615; 3,373; 12,196; 14,848; 5,071; 16,829; 1,403; 8,039; 6,794; 1,820; 3,087; 923; 3,691; 52,260; 384; 716; 2,380; 5,482; 686; 3,734; 396; 2,886; 4,423; 583; 1,516; 1,470; 3,456; 726; 1,826; 331.
Description Protocol Data


As the visualization shows, the most edited pages are the English and the Japanese ones. These two pages also stand out in length compared to the others but, surprisingly, they are not the longest: the Macedonian “Hate Speech” page counts more than 50,000 characters even though it shows no notable peaks of edits over time. The English page had almost constant activity, with an overall increase in edits in 2018, which suggests that the controversy is still alive. The Japanese page instead saw high activity from 2013 to 2015, which suggests a considerable public debate on the topic. Moreover, we noticed that after the “Hate Speech Act” passed in Japan, the number of edits to the page remained constant and unremarkable.

The first part of the visualization shows the number of edits per month for all the “Hate Speech” pages across all languages on Wikipedia, from the moment each page was created on the platform. The 38 existing pages are arranged in descending order, with the most edited page at the top and the least edited at the bottom. Languages that do not appear in the visualization have no “Hate Speech” page. The blue bars represent the edits to each page, starting from the date it was created. The height of a bar corresponds to the number of edits during a specific month, and the minimum height indicates that the page existed but received no edits. The total number of edits per page as of November 15, 2018 is shown at the end. The mouseover makes the visualization easier to read by highlighting the edits of a page and the corresponding name; clicking on it opens that specific Wikipedia page.
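As an illustration only, the ordering and bar-height encoding described above could be sketched as follows. The constants `MIN_HEIGHT` and `SCALE`, the function names, and the choice to add the baseline to the scaled count are assumptions made for this sketch; they are not values taken from the visualization.

```python
from typing import Optional

# Hypothetical drawing constants (assumptions, not from the visualization).
MIN_HEIGHT = 1.0  # baseline height: the page exists but had no edits
SCALE = 2.0       # extra height units per edit


def bar_height(edits: Optional[int]) -> float:
    """Return the bar height for one month.

    None means the page had not been created yet (nothing is drawn);
    0 means the page existed but received no edits (baseline only).
    """
    if edits is None:
        return 0.0
    return MIN_HEIGHT + SCALE * edits


def order_pages(totals: dict[str, int]) -> list[str]:
    """Order languages from the most edited page to the least edited one."""
    return sorted(totals, key=lambda lang: totals[lang], reverse=True)
```

With the totals reported in the figure, `order_pages({"Basque": 1, "English": 2275, "Japanese": 1176})` places English first and Basque last.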

In the timeline we also marked when the Code of Conduct was signed (this date was collected during the First Phase) and when the laws against hate speech passed, in order to see whether they matched the peaks of edits (these dates were collected by reading the Wikipedia pages about Hate Speech). More details on these laws can be found on the right side of the visualization, in three orange rectangles. The second part of the visualization specifies the page length, with orange squares indicating the precise number of characters.
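The comparison between law dates and peaks of edits was done by eye on the timeline. A minimal sketch of the same check in code is given below, assuming monthly counts keyed by the first day of each month; the helper names, the `top_n` and `window` parameters, and the definition of a peak as one of the busiest months are all assumptions of this sketch.

```python
from datetime import date


def month_index(d: date) -> int:
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + (d.month - 1)


def peak_months(monthly: dict[date, int], top_n: int = 3) -> list[date]:
    """Return the top_n months with the most edits, in chronological order."""
    ranked = sorted(monthly, key=lambda m: monthly[m], reverse=True)
    return sorted(ranked[:top_n])


def peaks_near(event: date, peaks: list[date], window: int = 1) -> list[date]:
    """Return the peaks that fall within `window` months of the event date."""
    return [
        p for p in peaks
        if abs(month_index(p) - month_index(event)) <= window
    ]
```

For example, `peaks_near(date(2016, 5, 25), peak_months(counts))` would list the peak months of a page that lie within one month of the Japanese Hate Speech Act, where `counts` stands for the monthly edit counts of that page.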


First, we wanted to understand in how many language editions of Wikipedia there was a page dedicated to Hate Speech. To do so, we searched for “Hate Speech” on Wikipedia and went to the “Languages” section. We found that there are 38 existing pages, so for every page we went to the “View History - Page statistics” section (Wikipedia-XTools) and manually collected the number of edits made month by month. We chose not to use the yearly counts, since we wanted to go deeper and be more precise. The data about the length of the pages in characters were collected from the same page, in the “General statistics - Prose” section. All the data were collected and organized in an Excel file, which was later uploaded to RAWGraphs in order to display the data as a bar chart; the resulting SVG file was later refined with Adobe Illustrator.
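The monthly figures were copied by hand from the statistics page. As a rough sketch of the same tabulation, assuming a list of revision timestamps for one page is already available as ISO 8601 strings such as `2018-11-15T10:00:00Z`, the edits could be counted per month like this. The function name and the input format are assumptions of this sketch, not part of the protocol.

```python
from collections import Counter
from datetime import datetime


def edits_per_month(timestamps: list[str]) -> dict[str, int]:
    """Count edits per 'YYYY-MM', filling months without edits with 0."""
    if not timestamps:
        return {}
    parsed = [datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ") for ts in timestamps]
    counts = Counter((d.year, d.month) for d in parsed)
    first, last = min(counts), max(counts)

    result: dict[str, int] = {}
    year, month = first
    # Walk every month from the first edit to the last one,
    # so that months with no edits still appear with a count of 0.
    while (year, month) <= last:
        result[f"{year:04d}-{month:02d}"] = counts.get((year, month), 0)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return result
```

Keeping the months with zero edits matters here, because the visualization distinguishes between a page that exists without edits and a page that does not exist yet.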


Data Source: Wikipedia, Wikipedia-XTools
Timestamp: 11/15/2018