After scraping the videos in the previous step, we noticed that video news is often spread through screenshots of the original video. This visualization connects those images to the full videos. It shows which parts of each video circulated most widely as still images, ordered by Google rank.
Each video was converted into images at one frame per second, so that we could find the time range that best matched the Google images. Afterwards, each video was compressed into a strip to give an overview of its content and duration.
Given the sensitivity of the content, we let users choose how to explore it by assigning the following labels:
- ‘no explicit violence’: the content is not clearly graphic, but the language or the subjects’ condition evokes violence;
- ‘danger’: the content is considered dangerous because it is clearly violent or offensive, and it corresponds to the minutes before the execution;
- ‘sensitive contents’: the images show graphic content, such as explicit violence against a person and blood.
In most cases, the main images used to report the video news on the web fell under the last two labels, and they often came from the final minutes of the videos.
Protocol
1. Query definition
We used Google Image Advanced Search to identify the queries that returned the relevant images.
We filtered the results by region (United States) in order to get only the pictures circulating in that country.
We wanted to analyze which video frames different websites used to report the news and, at the same time, where people's attention was focused.
2. Corpus definition
The corpus consisted of three parts: an image database, a video database and a URL database.
In particular, we downloaded 1400 pictures from Google using Downthemall, manually filtered them down to 371 results, and organized them into folders by similarity.
Then we used Google Image Search Url Extractor and, once the URLs were downloaded, created an Excel file recording the correspondence between each URL and picture name, along with the Google rank.
Finally, we extracted frames from the videos using After Effects, importing each video and exporting it at 1 fps. The frames were then matched to the pictures.
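The link between matched frames and positions in the video can be sketched as follows. This is only an illustration, not the After Effects workflow itself: it assumes frames are numbered from 0 in export order at 1 fps, and the function names and the sample frame numbers are hypothetical.

```python
def frame_to_timestamp(frame_index, fps=1):
    """Return the mm:ss position of a frame exported at `fps` frames per second."""
    seconds = frame_index // fps
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def matched_ranges(frame_indices):
    """Group matched frame indices into runs of consecutive frames."""
    ranges = []
    for index in sorted(set(frame_indices)):
        if ranges and index == ranges[-1][1] + 1:
            # Extend the current run by one frame.
            ranges[-1] = (ranges[-1][0], index)
        else:
            # Start a new run.
            ranges.append((index, index))
    return ranges


# Hypothetical frame numbers that were matched to Google images.
matches = [125, 126, 127, 301, 302]
for start, end in matched_ranges(matches):
    print(frame_to_timestamp(start), "-", frame_to_timestamp(end))
```

Grouping consecutive frames into runs gives a time range per cluster of matched images rather than a list of isolated seconds.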
The last step consisted of converting the videos into strips using Movie Barcode Generator, in order to start the visualization process.
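The idea behind such a strip can be sketched in a few lines. This is an assumed approach (each frame is collapsed into a one-pixel-wide column by averaging every row), not necessarily what Movie Barcode Generator does internally. Frames are represented here as plain lists of (r, g, b) tuples, and all names are hypothetical.

```python
def frame_to_column(frame):
    """Collapse one frame (rows of (r, g, b) pixels) into a single column
    by averaging the pixels of each row."""
    column = []
    for row in frame:
        n = len(row)
        column.append(tuple(sum(pixel[c] for pixel in row) // n for c in range(3)))
    return column


def frames_to_strip(frames):
    """Place one column per frame side by side; returns rows of pixels."""
    columns = [frame_to_column(frame) for frame in frames]
    height = len(columns[0])
    return [[column[y] for column in columns] for y in range(height)]


# Two tiny hypothetical 2x2 frames: one red, one mostly black.
red = [[(255, 0, 0), (255, 0, 0)], [(255, 0, 0), (255, 0, 0)]]
mixed = [[(0, 0, 0), (255, 255, 255)], [(0, 0, 0), (0, 0, 0)]]
strip = frames_to_strip([red, mixed])
```

With one frame per second, the width of the resulting strip in pixels equals the duration of the video in seconds, which is why the strip also conveys length.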
The data consist of two folders and an .xls file.
The Excel file contains 7 sheets, one for each group of images related to a video. Each sheet is organized in 5 columns: ascending rank, image domain, image URL, source domain and source URL.
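A row with these five columns could be assembled as in the sketch below. It is an illustration only: the original file was compiled by hand in Excel, the column keys and function names are hypothetical, and the URLs are placeholders.

```python
from urllib.parse import urlparse

COLUMNS = ["rank", "image domain", "image url", "source domain", "source url"]


def build_row(rank, image_url, source_url):
    """Build one row; the two domain columns are derived from the URLs."""
    return {
        "rank": rank,
        "image domain": urlparse(image_url).netloc,
        "image url": image_url,
        "source domain": urlparse(source_url).netloc,
        "source url": source_url,
    }


def build_sheet(records):
    """Return the rows of one sheet, sorted by ascending Google rank."""
    rows = [build_row(*record) for record in records]
    return sorted(rows, key=lambda row: row["rank"])


# Placeholder records: (rank, image URL, URL of the page showing the image).
records = [
    (2, "https://img.example.org/b.jpg", "https://news.example.org/story"),
    (1, "https://cdn.example.com/a.jpg", "https://www.example.com/article"),
]
sheet = build_sheet(records)
```

Keeping the image domain separate from the source domain makes it possible to tell where an image is hosted from where it is published.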
The images folder is organized in 7 sub-folders, one for each event, containing pictures renamed so that the Google rank number comes first.
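When reading such a folder programmatically, the rank prefix has to be compared as a number, otherwise "10" sorts before "2". A minimal sketch, assuming each file name starts with the rank digits (the exact naming pattern and the sample names are hypothetical):

```python
import re


def rank_of(filename):
    """Extract the leading Google rank number from a file name."""
    match = re.match(r"(\d+)", filename)
    if match is None:
        raise ValueError(f"no rank prefix in {filename!r}")
    return int(match.group(1))


def sort_by_rank(filenames):
    """Sort file names by numeric rank rather than alphabetically."""
    return sorted(filenames, key=rank_of)


# Hypothetical file names following the rank-first convention.
names = ["10_frame.jpg", "2_frame.jpg", "1_frame.jpg"]
ordered = sort_by_rank(names)
```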
The videos folder contains the full video of each execution. Warning: these folders contain graphic images and videos that may upset some viewers. If you are sure you want to download these contents, you need to use this password: gruattro.