research question

When the research query is refined, what kind of content do users prefer?

[Figure: two scatterplots of content about defectors, first 100 results per device, grouped by device type (Books, News, Videos), 2016 to 2018. USA panel: Google News, YouTube, Goodreads. CHN panel: Baidu News, Baidu Video, Bilibili, Books. Vertical axis: ratio = ranking / amount, from 0.00 to 1.00. Horizontal axis: months.]

Description

To answer this question we analysed the devices already used in the second question, simply changing our query from “North Korea” to “North Korean defectors”. As the scatterplots show, there are few results from China, where the topic receives less coverage. However, the dots are more evenly distributed over time: Chinese media covered the topic for longer, with books as their preferred type of artifact. Western news coverage, on the other hand, appears to be strongly biased by the Google News algorithm; more generally, the topic has become salient in the last few months, starting from April. The highest-rated contents are videos. Our first insight was that, on this topic, people prefer highly specific and detailed content such as books and videos over news, a more generic and “cold” source of information.

The quick, superficial overview we had of the whole situation made us question the real position that the Defectors’ Chronicles occupy. Even if it does not have a strong voice in the debate, we noticed that it shows another interesting side of NK beyond the nuclear discussion. We redefined our query as “North Korean defectors” and again focused on the press, videos and books to see how the content clusters on each device. What emerges from this analysis is the strong interest of users in this topic, despite the unremarkable attention the media give it.

Protocol

[Protocol diagram. Question: if the research query is redefined, what emerges about the defectors debate? Query: the “North Korea” research query becomes “North Korean defectors” (U.S.) and 脱北 (China). Corpus definition: Google News, YouTube and Goodreads for the U.S.; Baidu News, Baidu Videos + Bilibili and Books for China; 100 results each. Excel: manual labeling and ratio calculation (ranking / amount). RStudio: scatterplot. Illustrator: refining the visualization.]

We used the same devices as in the second question: Google News, YouTube and Goodreads for Western culture, and Baidu News, Bilibili and Books for China. We changed our query to “North Korean defectors” and scraped the first page of results for every device. We collected more than 400 results per device. We listed, labelled and cleaned our results, removing duplicates and non-pertinent content. We merged the individual datasets into a single Excel file in order to compare the devices. Because each device ranks its results differently, we divided each item's ranking by the number of results, obtaining a proportion that is comparable across devices.
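The ratio calculation was done in Excel; the following Python sketch only illustrates the arithmetic. It assumes that the ranking is a 1-based position in the result list, and the function name `ranking_ratio` and the example counts are hypothetical, not taken from our dataset.

```python
def ranking_ratio(rank, total):
    """Divide a 1-based rank by the number of results, giving a value in (0, 1]."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


# Hypothetical result counts, chosen only to illustrate the idea.
print(ranking_ratio(25, 100))  # position 25 on a device with 100 results
print(ranking_ratio(20, 80))   # same relative position on a device with 80 results
```

Both calls give the same ratio, which is what allows items from devices with different numbers of results to share one vertical axis.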

Data

Timestamp: 11/2016 - 11/2017

Data source: Google News, Baidu News, YouTube, Baidu Video, Goodreads, Books

We started with one dataset per device. Every dataset had the same parameters: title, link, ranking score, date and type of content. The second step was to merge all the datasets into a single one containing all the information. The ranking score parameter was modified to make it proportional, so that it could be compared across devices.
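The merge itself was done by hand in Excel. As a rough sketch of the same two steps, the Python code below combines per-device datasets that share the five parameters and replaces each ranking with its proportional value. The field names, the `merge_datasets` function, the added `device` column and the sample rows are all assumptions made for illustration; the ranking is assumed to be a 1-based position.

```python
FIELDS = ("title", "link", "ranking", "date", "type")


def merge_datasets(datasets):
    """Merge {device name: list of rows} into one list of records.

    Each record keeps the shared fields, gains a 'device' field, and has
    its ranking divided by the number of results for that device.
    """
    merged = []
    for device, rows in datasets.items():
        total = len(rows)
        for row in rows:
            record = {field: row[field] for field in FIELDS}
            record["device"] = device
            record["ranking"] = row["ranking"] / total
            merged.append(record)
    return merged


# Made-up rows, not data from the study.
sample = {
    "Google News": [
        {"title": "Article A", "link": "https://example.com/a",
         "ranking": 1, "date": "2017-04", "type": "news"},
        {"title": "Article B", "link": "https://example.com/b",
         "ranking": 2, "date": "2017-06", "type": "news"},
    ],
    "Goodreads": [
        {"title": "Book C", "link": "https://example.com/c",
         "ranking": 1, "date": "2016-11", "type": "book"},
    ],
}

merged = merge_datasets(sample)
for record in merged:
    print(record["device"], record["title"], record["ranking"])
```

Keeping the device name on every record is a design choice that lets the merged file be filtered or coloured by device when plotting.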