On The New York Times website, in the Health section, the articles returned by the search query “milk” are sorted by relevance and filtered by date, keeping those published after 01/01/2005. The first 300 results are downloaded with Kimono, and those concerning human milk and recipes are removed manually, in order to reach a list of 100 pertinent results. The full text of each of these articles is then downloaded with Blockspring (Extract Text from URL).
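The screening step can be illustrated with a minimal Python sketch. It is not the procedure actually used, since the exclusion was done by hand. It only shows the logic of keeping relevance order, applying the date filter, dropping flagged topics, and stopping at 100 items. The record layout (title, publication date), the title-based matching, the function name `screen_results`, and the choice to keep articles dated exactly 01/01/2005 are all assumptions.

```python
from datetime import date

# Assumed screening terms, matched against the title only (hypothetical choice).
EXCLUDED_TERMS = ("human milk", "recipe")
CUTOFF = date(2005, 1, 1)

def screen_results(results, limit=100):
    """Filter (title, published) pairs that are already sorted by relevance."""
    kept = []
    for title, published in results:
        if published < CUTOFF:
            continue  # too old; the cutoff day itself is kept (assumption)
        if any(term in title.lower() for term in EXCLUDED_TERMS):
            continue  # topic excluded from the corpus
        kept.append((title, published))
        if len(kept) == limit:
            break  # enough pertinent results
    return kept
```

A keyword screen like this can only pre-sort candidates. Deciding whether an article is really about human milk or a recipe still needs a manual check.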
In the Health section of PubMed, the papers returned by the search query “milk” are sorted by relevance, and those on human milk are excluded manually. The titles and abstracts of the first 100 pertinent results are downloaded, and the ISS code, copyright notices, authors’ information, French and Spanish translations, links, and label words such as “abstract” and “keywords” are deleted.
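Part of this cleanup can be sketched in Python. The example below covers only two of the deletions listed above, the label words and the copyright lines. The patterns used to recognise them are assumptions, as is the function name `clean_record`. Author information, translations, and links would need their own rules or manual removal.

```python
import re

# Label words named in the text; the optional trailing colon is an assumption.
LABELS = re.compile(r"\b(?:abstract|keywords)\b:?", flags=re.IGNORECASE)
# Assumed shape of a copyright line: it starts with "©" or the word "Copyright".
COPYRIGHT_LINE = re.compile(
    r"^\s*(?:©|copyright\b).*$", flags=re.IGNORECASE | re.MULTILINE
)

def clean_record(text):
    """Strip copyright lines and label words, then normalise whitespace."""
    text = COPYRIGHT_LINE.sub("", text)
    text = LABELS.sub("", text)
    return re.sub(r"\s+", " ", text).strip()
```

Note that this removes every occurrence of the label words, including any that appear inside a sentence, which matches the deletion described above but may be stricter than intended.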
Keyword Density Analyzer is used to find the most frequent single words and word pairs in the texts of both The New York Times and PubMed. The final four lists of 50 entries each are assembled manually, omitting articles, conjunctions, verbs, some adjectives that carry no relevance on their own, plurals of words already present in the singular, inflected forms of the same term, and “milk” itself, because it is the search term. Some medical abbreviations are merged with their full expressions.
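The counting behind such lists can be approximated with a short Python sketch. This is not how Keyword Density Analyzer works internally, and the manual curation described above is not reproduced. The tiny function-word list, the sentence splitting, and the function name `top_terms` are assumptions. Whether pairs containing “milk” were kept is not stated, so this sketch keeps them.

```python
import re
from collections import Counter

# Assumed, deliberately small list of function words to skip.
FUNCTION_WORDS = {"the", "a", "an", "and", "or", "of", "in", "to", "is"}
SEARCH_TERM = "milk"  # dropped from the single-word list only

def top_terms(text, n=50):
    """Return the n most frequent single words and adjacent word pairs."""
    words, pairs = Counter(), Counter()
    # Count sentence by sentence so pairs never span a sentence boundary.
    for sentence in re.split(r"[.!?;:]+", text.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        words.update(
            t for t in tokens if t not in FUNCTION_WORDS and t != SEARCH_TERM
        )
        pairs.update(
            f"{a} {b}"
            for a, b in zip(tokens, tokens[1:])
            if a not in FUNCTION_WORDS and b not in FUNCTION_WORDS
        )
    return words.most_common(n), pairs.most_common(n)
```

Calling `top_terms` once on the combined New York Times texts and once on the combined PubMed texts would give four raw lists, which would then be curated by hand as described.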
The words from the four lists are categorized according to their definitions in One Look Dictionary Search; the same categories are used for the single words and the word pairs.