THE ONION CORE (GITWEB’S TOR ANALYSIS)

Protocol

Introduction

The aim of this protocol is to understand where and when the Tor browser was created. We analyzed the Tor repository on Gitweb (the site where the creators of Tor work publicly). The creators of the Tor browser are continuously updating the software, making it faster, easier to use and more stable, but the code produced so far can already be used to understand who is behind the project.

First step

We used an internet crawler to download from https://gitweb.torproject.org/ every commit made to the Tor repository (the main repository, where the browser code is built): who made the commit, when, and how many files he or she added, deleted or modified. We grouped the commits by day (at first they were divided by minute). We then analyzed the data in Microsoft Excel to find the most productive coders, using an Index of Activity defined as the sum of the files modified, added or deleted in a single day. Having seen that the top three people had produced far more code than anyone else, we visualized their Index of Activity on a timeline, looking for possible spikes of productivity. Finally, we read the comments from each spike day to look for the cause of the spike.
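The Index of Activity described above was computed in Microsoft Excel; the following is only a minimal Python sketch of the same arithmetic, not the procedure actually used. The record layout, the author names, the file counts and the helper names (daily_activity, top_authors, spike_days) are all hypothetical. The spike rule (a day above twice the author's mean) is also an assumption made for illustration, since the spikes in the analysis were identified by looking at the timeline.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical commit records, one per commit, with minute-level timestamps.
# Authors and counts are invented placeholders, not data from the repository.
commits = [
    {"author": "alice", "timestamp": "2014-11-28 10:15", "added": 2, "deleted": 0, "modified": 5},
    {"author": "alice", "timestamp": "2014-11-28 16:40", "added": 1, "deleted": 1, "modified": 3},
    {"author": "alice", "timestamp": "2014-11-29 09:05", "added": 0, "deleted": 0, "modified": 1},
    {"author": "alice", "timestamp": "2014-11-30 11:30", "added": 0, "deleted": 0, "modified": 2},
    {"author": "bob", "timestamp": "2014-11-28 12:00", "added": 1, "deleted": 0, "modified": 1},
]


def daily_activity(commits):
    """Group commits by (author, day) and sum files added, deleted and modified."""
    index = defaultdict(int)
    for c in commits:
        day = datetime.strptime(c["timestamp"], "%Y-%m-%d %H:%M").date()
        index[(c["author"], day)] += c["added"] + c["deleted"] + c["modified"]
    return dict(index)


def top_authors(activity, n=3):
    """Return the n authors with the highest total Index of Activity."""
    totals = defaultdict(int)
    for (author, _day), value in activity.items():
        totals[author] += value
    return sorted(totals, key=totals.get, reverse=True)[:n]


def spike_days(activity, author, factor=2.0):
    """Return the days on which an author's index exceeds factor times their mean.

    The threshold is an illustrative assumption, not part of the protocol.
    """
    series = {day: v for (a, day), v in activity.items() if a == author}
    if not series:
        return []
    mean = sum(series.values()) / len(series)
    return sorted(day for day, v in series.items() if v > factor * mean)


activity = daily_activity(commits)
for author in top_authors(activity):
    print(author, [d.isoformat() for d in spike_days(activity, author)])
```

The days returned by spike_days are the ones whose comments would then be read by hand to find the cause of the spike.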

Second step

Most of the people who made commits to the repository used a nickname or a personal name, so it was easy to find out, using the Google search engine, where they were coding from. Using Microsoft Excel we grouped the activity by nation and visualized it with Cartodb. Finally, using Cartodb, we crossed this data with the number of Tor exit nodes in each nation on a given day (comparing datasets taken at different times and on different days, we saw that on average there was no difference in the location of the exit nodes).
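The grouping by nation and the crossing with exit nodes were done with Microsoft Excel and Cartodb; the sketch below only illustrates the logic in Python. The author-to-country lookup, the activity totals, the exit node counts and the helper names (activity_by_country, cross_with_exit_nodes) are invented placeholders, not values from the analysis.

```python
from collections import Counter

# Hypothetical inputs: a manually researched author-to-country lookup,
# per-author activity totals, and exit node counts per country for one day.
# All values are placeholders for illustration only.
author_country = {"alice": "US", "bob": "DE", "carol": "US"}
activity_by_author = {"alice": 15, "bob": 2, "carol": 4, "dave": 7}
exit_nodes = {"US": 200, "DE": 150, "FR": 80}


def activity_by_country(activity_by_author, author_country):
    """Sum the activity of all authors whose country is known."""
    totals = Counter()
    for author, value in activity_by_author.items():
        country = author_country.get(author)
        if country is None:
            continue  # location could not be found; leave the author out
        totals[country] += value
    return dict(totals)


def cross_with_exit_nodes(country_activity, exit_nodes):
    """Build one row per country holding both activity and exit node count."""
    countries = sorted(set(country_activity) | set(exit_nodes))
    return [
        {
            "country": c,
            "activity": country_activity.get(c, 0),
            "exit_nodes": exit_nodes.get(c, 0),
        }
        for c in countries
    ]


rows = cross_with_exit_nodes(
    activity_by_country(activity_by_author, author_country), exit_nodes
)
for row in rows:
    print(row)
```

Keeping countries that appear in only one of the two datasets (with a zero for the missing value) makes both kinds of mismatch visible on a map: nations with coders but no exit nodes, and nations with exit nodes but no coders.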

Metadata

Timestamp:
28/11/2014 - 02/12/2014

Data source:
Gitweb.torproject, Google, Dan.me.uk

Tools:
Kimono, Microsoft Excel, Cartodb