mapping collaborations

  • organization
    Scholars@Duke, Duke University
  • role
    original project
  • awards
    1st place, Duke Research Computing Symposium, Data Visualization Challenge


Does proximity influence collaboration? A visualization exploring the geographic spread of academic collaborations across Duke University.

This was an original piece submitted to Scholars@Duke for the Duke Research Computing Data Visualization Challenge.


Scholars@Duke released a dataset comprising over 3k individuals at Duke University, and over 75k academic publications. They issued an open call to take these data and derive an innovative visualization. The only criterion was that the final piece should speak to the breadth and vitality of scholarly research taking place at Duke.


I used this challenge as an opportunity to explore ideas first put forth in the 1970s by MIT professor Thomas Allen, who reported that the frequency of communication between two engineers decreased sharply as the distance between their desks grew. What insights could be gleaned by mapping how collaborations within an institution are geographically distributed?

The primary hurdle was mapping each individual to a specific location on campus. The provided datasets offered the name and ID of each individual, but not much else. I began by using the Duke directory to link each individual to a department. Next, I obtained access to a database from the Duke Plant Accounting office that listed every building owned by Duke, along with its address and latitude/longitude coordinates. A second database showed which departments owned offices in each building. With these pieces of information, I was able to link each individual to a department, each department to a building, and each building to a specific location.
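The chain of links described above can be sketched as a series of lookups. This is only an illustration: the table names, IDs, and coordinates below are made up, and it assumes each department maps to a single building, which the real data may not guarantee.

```python
# Hypothetical toy tables; the real directory and Plant Accounting schemas
# are not described here, so every name and value below is an assumption.
person_to_department = {"p1": "Biology", "p2": "Physics", "p3": "Classics"}
department_to_building = {"Biology": "B01", "Physics": "B02"}
building_to_coords = {"B01": (36.0014, -78.9382), "B02": (36.0031, -78.9420)}


def locate(person_id):
    """Return (building_id, (lat, lon)) for a person, or None if any link is missing."""
    department = person_to_department.get(person_id)
    building = department_to_building.get(department)
    coords = building_to_coords.get(building)
    if coords is None:
        return None
    return building, coords


# People whose chain breaks (here "p3", whose department has no building) map to None.
located = {person_id: locate(person_id) for person_id in person_to_department}
```

Returning None for a broken chain keeps unmatched individuals visible, so they can be counted or resolved by hand rather than silently dropped.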

Next, I analyzed the publication dataset to explore where collaborations (defined as two individuals co-authoring one or more publications) were occurring. Approximately 48% of all collaborations took place between individuals associated with different buildings on campus. Because collaborations spanned buildings all over campus, putting the data on a map was an obvious design choice.
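A minimal sketch of this step, assuming each publication is available as a list of author IDs and each located author has a building ID. The sample data and function names are hypothetical, and the toy result is not the 48% figure reported above.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each publication is a list of author IDs.
publications = [
    ["p1", "p2"],
    ["p1", "p2", "p3"],
    ["p3", "p4"],
]
person_to_building = {"p1": "B01", "p2": "B01", "p3": "B02", "p4": "B02"}


def count_collaborations(publications):
    """Count how many publications each unordered pair of authors shares."""
    pair_counts = Counter()
    for authors in publications:
        # sorted(set(...)) removes duplicate authors and fixes the pair order.
        for a, b in combinations(sorted(set(authors)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def cross_building_share(pair_counts, person_to_building):
    """Fraction of located collaborating pairs whose members sit in different buildings."""
    located = [
        (a, b)
        for a, b in pair_counts
        if a in person_to_building and b in person_to_building
    ]
    if not located:
        return 0.0
    cross = sum(1 for a, b in located if person_to_building[a] != person_to_building[b])
    return cross / len(located)


pair_counts = count_collaborations(publications)
share = cross_building_share(pair_counts, person_to_building)
```

Here a collaboration is one unique pair of co-authors, matching the definition above; the per-pair publication count is kept so it can later drive the color categories.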

I made a basemap on which each campus building was outlined in bold, and plotted the collaborations as a series of Bézier curves connecting the buildings. I added a random offset to the height of each curve, both to convey the density of collaborations between buildings and to minimize overlap. For an additional layer of meaning, the collaborations were grouped into three distinct categories and assigned colors based on the number of times two individuals worked together.
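One way to draw such a curve is a quadratic Bézier whose control point is pushed off the midpoint of the two buildings by a randomized height. The sketch below assumes planar (x, y) map coordinates; the function names and the `base_height` and `jitter` values are illustrative and not taken from the original piece.

```python
import math
import random


def quadratic_bezier(p0, p1, p2, steps=20):
    """Sample points along the quadratic Bezier curve with control points p0, p1, p2."""
    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
        y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
        points.append((x, y))
    return points


def collaboration_curve(start, end, base_height=0.25, jitter=0.15, rng=random):
    """Arc from start to end whose height is randomly offset to reduce overlap."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return [start]
    # Unit vector perpendicular to the straight line between the two buildings.
    nx, ny = -dy / length, dx / length
    # Height scales with distance; the random term separates stacked curves.
    height = length * (base_height + rng.uniform(-jitter, jitter))
    mid = ((start[0] + end[0]) / 2, (start[1] + end[1]) / 2)
    control = (mid[0] + nx * height, mid[1] + ny * height)
    return quadratic_bezier(start, control, end)


curve = collaboration_curve((0, 0), (10, 0), rng=random.Random(0))
```

Passing a seeded `random.Random` makes the layout reproducible between renders, which helps when fine-tuning the final composition.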

With the main map in place, the final task was arranging the layout of text and additional visual elements. I included a chord diagram to show the number of collaborations between each building and every other building, as well as an additional figure showing how the probability of collaboration decreased as the travel duration between buildings increased.
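A chord diagram is typically driven by a square matrix of counts between groups. The following sketch builds such a building-by-building matrix from collaborating pairs; the input shapes and names are assumptions, and the rendering itself is left to whichever plotting tool is used.

```python
def building_matrix(pair_counts, person_to_building):
    """Build a symmetric matrix of collaboration counts between buildings."""
    buildings = sorted(set(person_to_building.values()))
    index = {building: i for i, building in enumerate(buildings)}
    matrix = [[0] * len(buildings) for _ in buildings]
    for a, b in pair_counts:
        building_a = person_to_building.get(a)
        building_b = person_to_building.get(b)
        if building_a is None or building_b is None:
            continue  # skip pairs that could not be located on campus
        i, j = index[building_a], index[building_b]
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror so the matrix stays symmetric
    return buildings, matrix


# Hypothetical input: pairs of collaborators mapped to shared publication counts.
example_pairs = {("p1", "p2"): 2, ("p1", "p3"): 1, ("p2", "p3"): 1}
example_buildings = {"p1": "B01", "p2": "B01", "p3": "B02"}
labels, matrix = building_matrix(example_pairs, example_buildings)
```

Each cell counts unique collaborating pairs, with within-building pairs on the diagonal; weighting by shared publications instead would mean adding the pair's count rather than 1.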

The final piece was awarded first place.