Chapter 2 - Is drinking milk healthy?

Inside the queries


Having defined our controversy as whether or not drinking milk is healthy, we considered all the aspects of it that emerged from our initial research on Google. We translated these terms into queries in order to examine separately the reasons behind the 3 main positions.


The health-related motivations that emerged in the first chapter are Nutrients, Diseases, Intolerance, Hormones, Chemical Agents and Fitness. They formed the basis for 5 corresponding queries:

  • Nutrients: "milk" "nutrients" effects
  • Diseases: "milk" consumption effects
  • Intolerance: "milk" intolerance natural humans
  • External agents: hormones antibiotics pesticides "milk" "quality"
  • Body good: "milk" "body good"

The 5 queries were typed in the incognito mode of Google Chrome. In the search settings, Google Instant was disabled and the number of results per page was increased to 100. The first 100 links for each query were exported with Moz, an extension for Google Chrome.

First, the links were cleaned with Harvester, deleting those that matched any of the following terms:
wikipedia, youtube, vimeo, dailymotion, facebook, twitter, translate, google, baby, babies, child, children, tripadvisor, kid, mom
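The filtering step can be sketched in a few lines of Python. This is only an illustration, not a description of how Harvester works internally: matching each term as a substring of the link's host and path is an assumption, and the sample links are hypothetical.

```python
from urllib.parse import urlparse

# Terms copied from the list above. Treating them as substrings of the
# host and path is an assumption about how the filter was applied.
BLOCKED_TERMS = [
    "wikipedia", "youtube", "vimeo", "dailymotion", "facebook", "twitter",
    "translate", "google", "baby", "babies", "child", "children",
    "tripadvisor", "kid", "mom",
]


def is_blocked(url: str) -> bool:
    """Return True if any blocked term appears in the URL's host or path."""
    parsed = urlparse(url.lower())
    haystack = parsed.netloc + parsed.path
    return any(term in haystack for term in BLOCKED_TERMS)


def filter_links(urls):
    """Keep only the links that match none of the blocked terms."""
    return [u for u in urls if not is_blocked(u)]


# Hypothetical links, for illustration only.
sample = [
    "https://en.wikipedia.org/wiki/Milk",
    "https://www.example.org/milk-nutrients-effects",
    "https://www.example.com/is-milk-good-for-kids",
]
kept = filter_links(sample)
print(kept)
```

Plain substring matching is deliberately crude: a term such as "kid" would also discard a link containing "kidney", so results filtered this way would still need a manual check.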

Then the articles about Human Milk were deleted, because they belong to a different controversy, as were those about Milk Alternatives, which fall outside the focus on Health. A further comparison in Text Compare between the outputs of Harvester and Moz made it possible to keep the original titles of the remaining links.

The result is a dataset that records the profession of each author, their position on the controversy, the emerging topic of each website and its type, with the values made uniform in OpenRefine.
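As a rough sketch of how such a dataset could be represented and counted before being drawn, the Python below tallies positions per author category for one query. The field names, the `count_positions` helper and all the sample rows are hypothetical; they are not taken from the actual dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Field names are illustrative assumptions, not the dataset's real columns.
    query: str        # e.g. "Nutrients"
    author_type: str  # e.g. "Experts", "MDs", "Journalists"
    position: str     # e.g. "pro", "cons"
    topic: str        # emerging topic of the website
    site_type: str    # typology of the website


def count_positions(articles, query):
    """Count (author_type, position) pairs among the articles of one query."""
    return Counter(
        (a.author_type, a.position) for a in articles if a.query == query
    )


# Made-up rows, for illustration only.
rows = [
    Article("Nutrients", "Experts", "pro", "placeholder topic", "placeholder type"),
    Article("Nutrients", "Experts", "pro", "placeholder topic", "placeholder type"),
    Article("Nutrients", "MDs", "cons", "placeholder topic", "placeholder type"),
    Article("Diseases", "Journalists", "cons", "placeholder topic", "placeholder type"),
]
nutrient_counts = count_positions(rows, "Nutrients")
print(nutrient_counts)
```

Each count corresponds to the weight of one link between a position and an author category in a graph for a single query.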

How to read it

All the graphs are organized in the same way: on the left side you can see the positions, distinguished by color, and on the right side the authors of each article. The authors identified are:

  • Academics: people who work on studies about milk and write official papers.
  • Experts: people who are very knowledgeable about or skilful in this particular area (e.g. nutritionists, dietitians…).
  • MDs (Medicinae Doctor), or doctors of medicine: people with an official medical degree.
  • Journalists: people identified as such because of their studies or past experience.
  • Writers: people who describe themselves as writers.
  • Undefined: all the articles where the author is not specified.
  • Organizations: articles that are not signed, but published on a website owned by a specific organization.
  • Producers: producers of milk and dairy products.

Each visualization concerns one query. They are all visible in miniature; clicking on one of them shows it at a larger size, while the others remain visible at the sides as an overview.

The table summarizes the arguments of each position in detail.


Among the five graphs, the first one, for the nutrients query, shows the only topic in which the pro position surpasses the others. It also shows that only experts and MDs take all the sides of the controversy; the other authors are mostly on the green side.

The diseases graph shows that both the positions and the authors are more or less balanced: all five positions are present, and at least the three main sides are debated by each type of author.

The intolerance visualization is unusual because the pro and the cons with criticism positions are not present at all, while the actors, excluding MDs and organizations, discuss all the positions that are present.

The External Agents graph makes clear how strong the cons side is, both on this issue and in the controversy as a whole.

In the last graph the cons side is strong again and, looking at the authors, only a few have a relevant voice; the others do not really take part in this discussion.

In the graph below, the red component is more prominent than the others in almost all the queries.


Timestamp: 21/11/2015

Data source: Google

Download data (30KB)