While scraping the news we noticed common patterns that could tell us something about how the media treat the defector issue. In the in-depth articles in particular, those covering the defectors' journey and their life (listed as LIFE, HUMAN RIGHTS and HEALTH), the search for detail, even the most morbid, was a recurring feature. Words such as “hell”, “death” and “food” (which generally indicates food shortages or famine-related problems) are used in multiple articles to describe the defectors' situation. One interesting word is “parasite” (or “parasites”), which is linked to a single event reported by several sources: a soldier who was shot during his escape was found to have a 27 cm parasite in his stomach.
In conclusion, if we set aside the news about defectors that concerns politics, nuclear issues and war, what remains is a selection of articles that describe their lives in a very personal way, with ample room for detail.
We selected a sample of 30 articles from the English dataset, choosing them from the most representative categories. We did the same with the Chinese dataset, again retrieving 30 articles.
We manually organized them in a new Excel file and used Voyant Tools to analyze the individual articles and extract the recurring words per category. We then processed the entire corpus of articles to count the words used across all of them.
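The counting step described above can be approximated in a few lines of Python. This is only a minimal sketch and not the workflow we actually used, which relied on Voyant Tools and Excel: the sample articles, the stopword list and the function names (`tokenize`, `count_words`) are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample: (category, article text). Not taken from the real dataset.
ARTICLES = [
    ("LIFE", "Life in the camp was hell. Food was scarce and death was near."),
    ("HEALTH", "Doctors found a parasite. Food shortages explain the parasite."),
    ("LIFE", "She fled hunger; food was the reason."),
]

# Small illustrative stopword list (an assumption; Voyant Tools uses its own lists).
STOPWORDS = {"the", "a", "in", "was", "and", "of", "to", "she", "near"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def count_words(articles):
    """Return (word counts per category, word counts over the whole corpus)."""
    per_category = defaultdict(Counter)
    total = Counter()
    for category, text in articles:
        words = tokenize(text)
        per_category[category].update(words)
        total.update(words)
    return per_category, total


per_category, total = count_words(ARTICLES)

# Show the most frequent words for each category, then for the whole corpus.
for category, counts in per_category.items():
    print(category, counts.most_common(3))
print("TOTAL", total.most_common(3))
```

The regular expression only matches Latin letters, so this sketch would apply to the English articles; the Chinese articles would need a word segmentation step first.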
Time period: 11/2016 - 11/2017
Data source: Google News, Baidu News
Download data (4MB)
We used four datasets: two for the categories and two for the total corpus. The first two contain the relevant words split by category. The last two contain the total counts of relevant words, summed over every article we analyzed.