Research question

Goodreads and Books: what are the recurring topics?

[Figure: two bar charts, "Goodreads: number of books divided per topic" (USA) and "Books: number of books divided per topic" (CHN). Axes: topic, number of books. Topics: CONFLICT, DEFECTORS, ECONOMY, HABITS, NOTRELATED, NUCLEARISSUE, OTHERS, POLITICS, SOCIALISSUE.]

Description

We started by searching for a wide range of artifacts: movies, songs and books. We later decided to concentrate on books, as the other results were not relevant to our research. Reading a book requires time and interest in its topic, and gives the reader detailed information about one defined theme. We opted for a generic query in order to get more results from the last year.

Defectors and civilian social issues are the most popular topics for books written in English: readers seem to enjoy stories about defectors' personal lives and about living conditions in North Korea and in prison camps. These two kinds of books can be very specific and detailed, narrating escapes, torture and other kinds of issues.

In the Chinese case the picture changes: a large share of the books deal with society and culture (they can be recipe books or history books), and defectors' stories are absolutely marginal. Still, civilian social issues hold an important role in cultural production.

Protocol

[Protocol diagram: "Goodreads: what are the recurring topics?" Research query: "North Korea" (USA, goodreads.com) and "朝鲜" (China, Books). Steps: web scraper; corpus definition (ranked by: rankings on Goodreads and on Taiwan (Books), considering the first 100 results); Excel manual cleaning, English translation and manual topic labeling; Dataset 1 and Dataset 2; visualization in Tableau; bar charts refined in Illustrator.]

We started our research with a narrow query, “North Korean defectors”, but as we did not retrieve enough results we returned to our first, broader query, “North Korea”. We disregarded time as a criterion, giving priority to the ranking score. We scraped the results, creating a dataset containing ranking, title, link, author, average rating (decided by users), number of reviews, year of publication, edition and topic. We considered the first 100 results, manually retrieving the abstract of every book. We cleaned the dataset of non-relevant results.
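The counting step behind the bar charts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the workflow we actually used (the cleaning was done by hand in Excel and the charts in Tableau): the function names `top_ranked` and `books_per_topic`, the dictionary keys, and the sample rows are all hypothetical, while the topic labels are the ones shown in the charts.

```python
from collections import Counter

# Topic labels as they appear in the bar charts.
TOPICS = [
    "CONFLICT", "DEFECTORS", "ECONOMY", "HABITS", "NOTRELATED",
    "NUCLEARISSUE", "OTHERS", "POLITICS", "SOCIALISSUE",
]


def top_ranked(rows, limit=100):
    """Return the `limit` best-ranked rows (rank 1 is the best)."""
    return sorted(rows, key=lambda row: row["ranking"])[:limit]


def books_per_topic(rows):
    """Count books per topic, reporting 0 for topics with no books."""
    counts = Counter(row["topic"] for row in rows)
    return {topic: counts.get(topic, 0) for topic in TOPICS}


# Hypothetical placeholder rows, NOT taken from the real datasets.
sample = [
    {"ranking": 2, "title": "Book B", "topic": "DEFECTORS"},
    {"ranking": 1, "title": "Book A", "topic": "SOCIALISSUE"},
    {"ranking": 3, "title": "Book C", "topic": "DEFECTORS"},
    {"ranking": 4, "title": "Book D", "topic": "NOTRELATED"},
]

# Keep only the three best-ranked placeholder rows, then count by topic.
counts = books_per_topic(top_ranked(sample, limit=3))
```

With the default `limit=100` the same two calls would mirror the "first 100 results" rule; each dataset (Goodreads and Books) would be counted separately.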

Data

Timestamp: 11/2016 - 11/2017

Data source: Goodreads, Books

Two datasets: one for China and one for publications written in English.