Protocol
1. We chose 6 queries:
immigration, immigrant, migration, migrant, emigration, emigrant.
2. Five platforms were selected based on two requirements: they had to support keyword search and have a clear ranking criterion. We decided to use:
google.com, wikipedia.org, imdb.com, themoviedb.com, allmovie.com.
The queries were adapted to take advantage of the specific characteristics of each platform. The process followed on each platform was:
Google:
a. When a query contains the word “movies”, Google returns a slider with the covers of the movies, ranked by a “most frequently asked” criterion.
b. We searched Google.com (English pages only) in an incognito window.
c. Each of the six queries was combined with the word “movies”: “immigration movies”, “immigrant movies”, “emigration movies”, “emigrant movies”, “migration movies”, “migrant movies”.
d. Only two queries returned results: “immigration movies” and “immigrant movies”.
e. The results were scraped manually.
f. This produced two lists, one for “immigration movies” and one for “immigrant movies”, each with 51 results.
Wikipedia:
a. Wikipedia already has a category, “Films about Immigration”, which lists 160 movies.
b. We used stats.grok.se to rank them by popularity, based on page views over the last 90 days.
c. We manually compiled a dataset with the titles, the view counts and the position of each movie, from most to least viewed.
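The ranking in steps b and c can be sketched as follows. This is only a minimal illustration, assuming the page-view counts have already been collected by hand into a dictionary; the function name, titles and view counts are hypothetical placeholders, not data from the study.

```python
def rank_by_views(views):
    """Return (position, title, views) tuples, most viewed first."""
    # Sort by view count, descending; ties are broken alphabetically
    # so that the result is deterministic.
    ordered = sorted(views.items(), key=lambda item: (-item[1], item[0]))
    return [(pos, title, count)
            for pos, (title, count) in enumerate(ordered, start=1)]


# Hypothetical page-view counts (placeholders, not real data).
example_views = {"Film A": 1200, "Film B": 5400, "Film C": 300}
ranking = rank_by_views(example_views)
```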
Allmovie:
a. We used the “Movie themes” filter in the “Advanced search” section. Typing “migr” returns the filter “Immigrant life”, whose movies are displayed by popularity.
b. The results were scraped with Kimono, yielding 728 movies.
Themoviedb:
a. The “Discover” section allows browsing through the movies. We used the filters “Year: none” and “Sort by: Popularity” with the keywords: “immigration”, “immigrant”, “migration”, “migrant”, “emigration”, “emigrant”.
b. The query “migrant” returned no results.
c. Each result set was scraped with Kimono. The resulting lists were: “immigration” with 63 movies, “immigrant” with 71 movies, “emigration” with 17 movies, “emigrant” with 5 movies and “migration” with 12 movies.
Imdb:
a. We typed “migr” in the search field with “Keywords” selected in the drop-down menu, then selected all lists for these queries: “immigration”, “immigrant”, “migration”, “migrant”, “emigration”, “emigrant”. The results were sorted by popularity.
b. The lists were scraped with Kimono, with the following results: “immigration” list with 867 films, “immigrant” list with 1259 films, “emigration” list with 220 films, “emigrant” list with 111 films, “migration” list with 236 films and “migrant” list with 56 films.
3. All lists were reorganized in Excel with the title, year and position of each movie.
4. A contingency table was created listing all 3140 titles in the rows and the 15 lists in the columns. The value shown at each intersection is the position of the movie in the list.
5. The occurrences of each movie across the lists were calculated using the Excel function “Count”.
6. The movies with at least three occurrences formed the basis of our corpus.
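Steps 4 to 6 can be sketched in a few lines of Python. This is an illustrative sketch rather than the procedure actually used, which was carried out in Excel: it assumes each list is already available as an ordered sequence of titles and that identical titles refer to the same movie. The function names, list names and titles are hypothetical. Counting the filled cells of a row mirrors what Excel's “Count” does when applied to a row of numeric positions.

```python
from collections import defaultdict


def build_contingency_table(lists):
    """Map each title to {list_name: position} for every list it appears in."""
    table = defaultdict(dict)
    for list_name, titles in lists.items():
        for position, title in enumerate(titles, start=1):
            # If a title repeats within one list, keep its first position.
            table[title].setdefault(list_name, position)
    return dict(table)


def count_occurrences(table):
    """Number of lists in which each title appears (filled cells in its row)."""
    return {title: len(cells) for title, cells in table.items()}


def select_corpus(table, min_occurrences=3):
    """Titles that appear in at least `min_occurrences` lists."""
    counts = count_occurrences(table)
    return sorted(t for t, n in counts.items() if n >= min_occurrences)


# Hypothetical ranked lists (placeholders, not data from the study).
example_lists = {
    "google_immigration": ["Film A", "Film B", "Film C"],
    "wikipedia": ["Film B", "Film A"],
    "imdb_immigrant": ["Film A", "Film D", "Film B"],
    "allmovie": ["Film C", "Film D"],
}
table = build_contingency_table(example_lists)
corpus = select_corpus(table)
```

In practice, matching on the title alone can merge different movies that share a name; keying the table on a (title, year) pair would be one way to reduce that risk.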
7. The plots of all movies were read to verify their connection with our topic. Movies were excluded if they narrate a fictional migration, take place in the future, or treat the topic only marginally. Thirteen movies were left out.
8. The remaining 120 movies were reordered according to their occurrences and their popularity based on the Imdb ranking.
To establish the popularity ranking among our 120 films, we created an Imdb Pro account. All movies were collected in a personal list, then sorted by popularity. A Kimono scraping API produced a dataset with the internal popularity ranking.
9. The movies with the most occurrences were placed at the top and those with the fewest at the bottom. Movies with the same occurrence value were ordered by Imdb popularity.
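The final ordering amounts to a two-key sort, which might look like the sketch below. It assumes that a lower popularity rank means a more popular movie (rank 1 being the most popular); the function name and all values are hypothetical placeholders, not data from the study.

```python
def order_corpus(occurrences, popularity_rank):
    """Order titles by occurrences (descending), then popularity rank (ascending)."""
    return sorted(
        occurrences,
        key=lambda title: (-occurrences[title], popularity_rank[title]),
    )


# Hypothetical values (placeholders, not data from the study).
example_occurrences = {"Film A": 5, "Film B": 3, "Film C": 5, "Film D": 3}
# Assumption: 1 = most popular, larger numbers = less popular.
example_popularity = {"Film A": 40, "Film B": 7, "Film C": 12, "Film D": 90}
final_order = order_corpus(example_occurrences, example_popularity)
```

Negating the occurrence count lets a single ascending sort place the most frequent movies first while still ranking ties from most to least popular.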