Description
We focused on the most commented videos in order to uncover users’ different inclinations and their reactions to the video content.
For the Chinese platforms we considered all the results we collected for each category, since the overall amount was small. For YouTube, on the other hand, we had to reduce the number of comments in order to focus on them. We therefore took the five most commented videos and scraped them, analyzing only the comments with more than a hundred likes, so as to capture the most widely shared opinions.
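The YouTube selection rule (five most commented videos, then only comments with more than a hundred likes) can be sketched as follows. This is a minimal illustration, not the procedure actually run for the project: the function name and the record keys `video_id`, `text` and `likes` are hypothetical, and the real scraper export may be laid out differently.

```python
from collections import defaultdict

TOP_VIDEOS = 5    # the five most commented videos
MIN_LIKES = 100   # keep comments with more than a hundred likes


def select_comments(comments, top_videos=TOP_VIDEOS, min_likes=MIN_LIKES):
    """Return well-liked comments from the most commented videos.

    `comments` is a list of dicts with the hypothetical keys
    'video_id', 'text' and 'likes'.
    """
    # Group comments by video so videos can be ranked by comment count.
    by_video = defaultdict(list)
    for comment in comments:
        by_video[comment["video_id"]].append(comment)

    # Rank videos by number of comments, most commented first.
    ranked = sorted(by_video, key=lambda vid: len(by_video[vid]), reverse=True)

    # From the top videos, keep only comments above the like threshold.
    return [
        c
        for vid in ranked[:top_videos]
        for c in by_video[vid]
        if c["likes"] > min_likes
    ]
```

The threshold is strict (`>`), matching "more than a hundred likes"; a comment with exactly 100 likes is dropped.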
Protocol
We used the DMI tools to scrape all the YouTube comments, while for Baidu Video and Bilibili we used Webscraper when possible. After translating all the Chinese results, we analyzed the tone of each set separately to better understand how the two points of view differ.
To evaluate the tone of the comments we employed the Tone Analyzer tool, which performs a linguistic analysis of the text and detects joy, fear, sadness, anger, analytical, confident and tentative tones.
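As an illustration of how one tone result could be reduced to a flat record of these seven tones, here is a minimal sketch. The JSON layout is an assumption: it supposes each result carries a `document_tone.tones` list whose entries have `tone_id` and `score` fields, which may not match the output the tool actually produced for this project. The helper name `tone_row` is hypothetical.

```python
# The seven tones named in the protocol.
TONES = ["joy", "fear", "sadness", "anger", "analytical", "confident", "tentative"]


def tone_row(result):
    """Flatten one tone-analysis result into {tone: score}.

    Assumes (not confirmed by the project) a layout like
    {"document_tone": {"tones": [{"tone_id": "joy", "score": 0.8}, ...]}}.
    Tones absent from the result get a score of 0.0.
    """
    detected = result.get("document_tone", {}).get("tones", [])
    scores = {tone["tone_id"]: tone["score"] for tone in detected}
    return {tone: scores.get(tone, 0.0) for tone in TONES}
```

Filling missing tones with 0.0 keeps every row the same width, which makes the rows easy to paste into a spreadsheet; whether an absent tone should count as zero is a design choice, not something the tool dictates.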
Data
Timestamp: 11/2016 - 11/2017
Data source: YouTube, Baidu Video, Bilibili
Download data (125KB)
We organized the Excel file by filtering the most commented videos and, among them, the most liked comments. Since the comments considered were spread unevenly across the categories, we normalized them to make them comparable: for each category we calculated the ratio of the number of comments to the number of videos. We then ran the tone analysis on all the videos with the highest ratio. After converting the .json outputs of the tone analysis, we manually cleaned them and built another dataset.
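The normalization step amounts to a comments-per-video ratio for each category. A minimal sketch follows; the function names are hypothetical and the input is assumed to be a mapping from category name to a `(number of comments, number of videos)` pair, which is not necessarily how the Excel file is organized.

```python
def comments_per_video(counts):
    """Return {category: comments / videos} for each category.

    `counts` maps a category name to a (n_comments, n_videos) pair.
    Categories with no videos are skipped to avoid dividing by zero.
    """
    return {
        category: n_comments / n_videos
        for category, (n_comments, n_videos) in counts.items()
        if n_videos
    }


def rank_categories(counts):
    """Return category names ordered by ratio, highest first."""
    ratios = comments_per_video(counts)
    return sorted(ratios, key=ratios.get, reverse=True)
```

With made-up counts such as `{"a": (100, 4), "b": (90, 2)}`, category `b` ranks first even though it has fewer comments overall, because its 45 comments per video exceed the 25 of category `a`. This is the effect the ratio is meant to capture.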