When the research query is refined, what kind of content do users prefer?

Description

To answer this question, we analysed the same devices used in the second question, simply changing our query from “North Korea” to “North Korean”. As the scatterplots show, there are few results from China, where the topic receives less coverage. However, the dots are more evenly distributed over time: Chinese media covered the topic for longer, with books as the favoured type of artifact. Western news coverage, on the other hand, appears to be strongly biased by the Google News algorithm; from a general perspective, the topic has become salient in the last few months, starting in April. The highest-rated content is video. Our first insight was that, on this topic, people prefer highly specific and detailed content such as books and videos over news, a more generic and “cold” source of information.

This quick overview of the whole situation led us to question the real position that the Defectors’ Chronicles occupy. Even though they do not have a strong voice in the debate, we noticed that they express another interesting side of North Korea, apart from the nuclear discussion. We redefined our query as “North Korean defectors” and again focused on the press, videos, and books to see how the content clusters on each device. What emerges from this analysis is a strong user interest in this topic, despite the unremarkable attention the media devote to it.

Protocol

We used the same devices as in the second question: Google News, YouTube and Goodreads for Western culture, and Baidu News, Bilibili and Books for China. We changed our query to “North Korean defectors” and scraped the first-page results for every device. We collected more than 400 results per device. We listed, labelled and cleaned the results, removing duplicates and non-pertinent content. We merged the individual datasets into a single Excel file in order to compare the devices. Because each device ranks its results differently, we divided each ranking by the number of results to obtain a comparable proportion across devices.
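The proportional ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the ranking as a 1-based position in the result list, and the function name `proportional_rank` and the sample numbers are hypothetical, not taken from the project's dataset.

```python
def proportional_rank(position: int, total_results: int) -> float:
    """Divide a ranking by the number of results collected on that device.

    Assumption: `position` is a 1-based rank, so the output lies in (0, 1]
    and lower values mean the item appeared closer to the top.
    """
    if total_results <= 0:
        raise ValueError("total_results must be positive")
    if not 1 <= position <= total_results:
        raise ValueError("position must be between 1 and total_results")
    return position / total_results


# Hypothetical example: two devices with different numbers of results.
rank_on_large_device = proportional_rank(40, 400)
rank_on_small_device = proportional_rank(5, 50)
```

With these made-up numbers, an item in 40th place out of 400 and an item in 5th place out of 50 receive the same proportional value, which is what makes rankings from different devices comparable.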

Data

Timestamp: 11/2016 - 11/2017

Data source: Google News, Baidu News, YouTube, Baidu Video, Goodreads, Books

Download data (4MB)

We started with one dataset per device; every dataset had the same parameters: title, link, ranking score, date and type of content. The second step was to merge all the datasets into a single dataset containing all the information. The ranking score parameter was modified to make it proportional.
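As a rough sketch of the merge step, the snippet below combines per-device lists of records that share the five parameters into one list. The field names, the extra `device` column, the rule of dropping repeated links within a device, and the sample rows are all assumptions made for illustration; the original work was done in an Excel file.

```python
FIELDS = ["title", "link", "ranking_score", "date", "content_type"]


def merge_datasets(datasets: dict) -> list:
    """Merge per-device record lists into a single list of records.

    Assumptions: every row is a dict holding the keys in FIELDS, a
    hypothetical `device` key is added to remember the origin, and a
    repeated link within the same device counts as a duplicate.
    """
    merged = []
    seen = set()
    for device, rows in datasets.items():
        for row in rows:
            key = (device, row["link"])
            if key in seen:
                continue  # skip duplicate results from the same device
            seen.add(key)
            record = {field: row[field] for field in FIELDS}
            record["device"] = device
            merged.append(record)
    return merged


# Hypothetical sample rows, not real data from the project.
sample = {
    "Google News": [
        {"title": "A", "link": "https://example.com/a", "ranking_score": 0.1,
         "date": "2017-04-01", "content_type": "news"},
        {"title": "A", "link": "https://example.com/a", "ranking_score": 0.2,
         "date": "2017-04-01", "content_type": "news"},
    ],
    "Goodreads": [
        {"title": "B", "link": "https://example.com/b", "ranking_score": 0.5,
         "date": "2016-12-01", "content_type": "book"},
    ],
}
merged_sample = merge_datasets(sample)
```

Keeping the originating device on each record is a design choice that lets the merged dataset still be split or coloured by device when plotting.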
