Data Sets

This page collects all the data sets described and analysed in the textbook:

"Complex Networks: Principles, Methods and Applications",
V. Latora, V. Nicosia, G. Russo (Cambridge University Press, 2017)

For each data set, you will find below a brief description and a list of salient properties (number of nodes, number of edges, etc.), together with a link to download it.

All data sets

All the data sets of the textbook are available for download in a single compressed file:

All the data sets in the book (zip)

The archive contains one folder for each data set. The file README.txt in each folder contains some relevant information about the corresponding data set.
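Once a data set has been extracted, the node and edge counts listed on this page can be checked with a few lines of code. The following is a minimal sketch, not the authors' tooling. It assumes that a network file is a plain-text edge list with one edge per line, written as two whitespace-separated node labels and possibly followed by extra columns such as a weight. This format is an assumption, so consult the README.txt of each data set first. The function name `count_nodes_and_edges` is hypothetical.

```python
from typing import Iterable, Tuple


def count_nodes_and_edges(lines: Iterable[str], directed: bool = False) -> Tuple[int, int]:
    """Return (N, K) for an edge list given as an iterable of text lines.

    Assumed format (not confirmed by this page): one edge per line, as two
    whitespace-separated node labels, optionally followed by more columns.
    Blank lines and lines starting with '#' are skipped.
    """
    nodes = set()
    edges = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        u, v = fields[0], fields[1]
        nodes.update((u, v))
        # For undirected graphs, store each pair in a canonical order so
        # that "a b" and "b a" are counted as the same edge.
        edges.add((u, v) if directed else tuple(sorted((u, v))))
    return len(nodes), len(edges)


if __name__ == "__main__":
    # Toy edge list, not taken from any of the data sets.
    sample = ["# toy edge list", "1 2", "2 3", "3 1", "1 3"]
    print(count_nodes_and_edges(sample))
    print(count_nodes_and_edges(sample, directed=True))
```

To apply it to a file, pass an open file object, for example `count_nodes_and_edges(open(path))`, where `path` is the network file of interest. Isolated nodes do not appear in an edge list, so N computed this way can be smaller than the listed value if a data set stores them separately.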

Chapter 1

Elisa's Kindergarten network
This data set (see Box 1.2 on page 32 of the book) contains the declared friendship relationships among a group of children between 3 and 5 years old in a kindergarten.
  • Nodes (N): 16
  • Edges (K): 57
  • Directed: yes
  • Weighted: no

Elisa's kindergarten network (zip)



Chapter 2

Movie actors network
This data set (see Box 2.2 on page 81 of the book) contains the co-starring network first studied by Duncan Watts and Steven Strogatz in their seminal paper "Collective dynamics of 'small-world' networks". Each node is an actor, and a link exists between two actors if they have acted together in at least one movie.
  • Nodes (N): 248243
  • Edges (K): 8302734
  • Directed: no
  • Weighted: no

Movie actors network (zip)

Florentine families
This data set contains the network of marital relationships between Florentine families in the 15th century. Each node is a family, and an edge exists between two families if a member of one has married a member of the other.
  • Nodes (N): 16
  • Edges (K): 20
  • Directed: no
  • Weighted: no

Florentine families network (zip)

Primates interaction network
This data set contains recorded interactions within a group of primates, where an edge exists between two primates if they have been seen together at the same river a sufficient number of times.
  • Nodes (N): 17
  • Edges (K): 31
  • Directed: no
  • Weighted: no

Primates network (zip)

Terrorists interaction network
This data set contains recorded interactions between the terrorists who participated in the September 2001 terrorist attacks on the US.
  • Nodes (N): 34
  • Edges (K): 93
  • Directed: no
  • Weighted: no

Terrorists network (zip)

Chapter 3

Coauthorship networks
These data sets contain several coauthorship networks, in which nodes are scientists and an edge connects two scientists if they have co-authored a paper.
  • Data set: Arxiv physics coauthorship
  • Nodes (N): 52909
  • Edges (K): 245300
  • Directed: no
  • Weighted: no

Arxiv physics coauthorship (zip)

  • Data set: Astrophysics coauthorship
  • Nodes (N): 16706
  • Edges (K): 121251
  • Directed: no
  • Weighted: no

Astrophysics coauthorship (zip)

  • Data set: Condensed matter coauthorship
  • Nodes (N): 16726
  • Edges (K): 47954
  • Directed: no
  • Weighted: no

Condensed matter coauthorship (zip)

  • Data set: High-energy physics coauthorship
  • Nodes (N): 8361
  • Edges (K): 47594
  • Directed: no
  • Weighted: no

High-energy physics coauthorship (zip)

  • Data set: Medline coauthorship
  • Nodes (N): 1520252
  • Edges (K): 11803060
  • Directed: no
  • Weighted: no

Medline coauthorship (zip)

  • Data set: NCSTRL coauthorship
  • Nodes (N): 11994
  • Edges (K): 20395
  • Directed: no
  • Weighted: no

NCSTRL coauthorship (zip)

  • Data set: SPIRES coauthorship
  • Nodes (N): 56627
  • Edges (K): 4573868
  • Directed: no
  • Weighted: no

SPIRES coauthorship (zip)

All coauthorship data sets (zip)

Chapter 4

C.elegans neural network
This data set contains the graph of interconnections among the neurons in the C.elegans nematode.
  • Nodes (N): 279
  • Edges (K): 2287
  • Directed: no
  • Weighted: no

C.elegans neural network (zip)

Chapter 5

World Wide Web networks
This data set contains four samples of the World Wide Web, corresponding respectively to the University of Notre Dame (web-NotreDame.net), Stanford University (web-Stanford.net), Berkeley and Stanford universities together (web-BertStan.net), and a sample released by Google (web-Google.net). Each node is a web page, and a link exists between two nodes if there is a hyperlink between the two pages.

  • Data set: WWW NotreDame
  • Nodes (N): 325729
  • Edges (K): 1469680
  • Directed: yes
  • Weighted: no

  • Data set: WWW Stanford
  • Nodes (N): 281903
  • Edges (K): 2312497
  • Directed: yes
  • Weighted: no

  • Data set: WWW Berkeley-Stanford
  • Nodes (N): 685230
  • Edges (K): 7600595
  • Directed: yes
  • Weighted: no

  • Data set: WWW Google
  • Nodes (N): 875713
  • Edges (K): 5105039
  • Directed: yes
  • Weighted: no

WWW samples (zip)

Chapter 6

Scientometrics citation network
This data set contains the citations between 1656 papers published in Scientometrics. Each node is an article, and a link from node i to node j indicates that i has cited j.
  • Nodes (N): 1656
  • Edges (K): 4123
  • Directed: yes
  • Weighted: no
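Because a link from i to j means that i cites j, the number of citations an article receives is its in-degree. The sketch below illustrates this direction convention on a toy list of (citing, cited) pairs. The function name `in_degrees` and the labels are hypothetical, and no values are taken from the actual data set.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def in_degrees(edges: Iterable[Tuple[Hashable, Hashable]]) -> Counter:
    """Count incoming links per node for directed edges (i, j), read as 'i cites j'."""
    # Only the target j of each edge gains an incoming link.
    return Counter(j for _, j in edges)


if __name__ == "__main__":
    # Toy citation pairs (citing, cited), purely illustrative.
    toy = [("p1", "p3"), ("p2", "p3"), ("p3", "p4")]
    print(in_degrees(toy).most_common())
```

Articles that are never cited do not appear in the returned counter. Looking them up returns 0, which is the default behaviour of `Counter`.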

Scientometrics citation network (zip)

Chapter 7

Internet AS network
This data set contains 15 snapshots of the Internet at the level of autonomous systems (AS) between 1997 and 2001. See the file README.txt inside the zip archive for further details.
  • Nodes (N): 3015 (first snapshot) to 11174 (last snapshot)
  • Edges (K): 5156 (first snapshot) to 23409 (last snapshot)
  • Directed: no
  • Weighted: no

Internet AS network (zip)

Chapter 8

Urban street networks
This data set contains 20 urban street networks, corresponding to 1-square-mile maps of 20 cities around the world. See the file README.txt inside the archive for further details.

Urban street networks (zip)

E.coli transcription regulation network
This data set contains the transcription regulation network of E.coli.
  • Nodes (N): 424
  • Edges (K): 519
  • Directed: yes
  • Weighted: no

E.coli transcription regulation network (zip)

Chapter 9

Zachary's karate club network
This is the social network of Zachary's karate club. A link exists between two nodes if the corresponding individuals were seen together outside the normal club activities.
  • Nodes (N): 34
  • Edges (K): 78
  • Directed: no
  • Weighted: both the weighted and unweighted networks are available

Zachary's karate club network (zip)

Chapter 10

US air transportation network
This data set contains the network of travel connections among the 500 US airports with the largest amount of traffic. An edge exists between two nodes if there is a direct flight between the corresponding airports. Each edge has an associated integer weight, equal to the total number of seats available on all the direct routes between the two endpoints within a year.
  • Nodes (N): 500
  • Edges (K): 2980
  • Directed: no
  • Weighted: yes
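One simple way to use the edge weights is to sum them at each node, which here gives the total number of seats on direct routes touching an airport. The following is a minimal sketch on toy data. It assumes the edges are available as (u, v, weight) triples, which is an assumption about how the file is parsed. The function name `node_strengths` and the labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def node_strengths(weighted_edges: Iterable[Tuple[Hashable, Hashable, int]]) -> Dict[Hashable, int]:
    """Sum the weights of all edges incident to each node of an undirected graph."""
    strength: Dict[Hashable, int] = defaultdict(int)
    for u, v, w in weighted_edges:
        # An undirected edge contributes its weight to both endpoints.
        strength[u] += w
        strength[v] += w
    return dict(strength)


if __name__ == "__main__":
    # Toy triples (airport, airport, seats); the values are made up.
    toy = [("A", "B", 100), ("A", "C", 50), ("B", "C", 25)]
    print(node_strengths(toy))
```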

US air transport network (zip)

Financial stocks network
This data set contains the network obtained from the analysis of temporal correlations among the time series of the log-returns of 62 stocks on the New York Stock Exchange between January 2012 and December 2014. There are three versions of the weighted network, where the weights are, respectively, the Pearson correlation coefficient (stocks_62_pearson.net), the distance induced by the Pearson correlation coefficient (stocks_62_distance.net), and the weight obtained as the inverse of the distance (stocks_62_weight.net).
  • Nodes (N): 62
  • Edges (K): 1891
  • Directed: no
  • Weighted: yes
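Note that K = 1891 equals 62 × 61 / 2, so every pair of stocks is connected and the network is a complete graph. The sketch below shows how the three kinds of weights relate to each other for a single pair of price series. The page does not state which formula is used for the distance. The code assumes the commonly used form d = sqrt(2 (1 − ρ)), where ρ is the Pearson correlation coefficient. That formula, the function names, and the toy prices are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def log_returns(prices: Sequence[float]) -> List[float]:
    """Log-returns ln(p[t+1] / p[t]) of a price series."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(rho: float) -> float:
    """Assumed distance d = sqrt(2 * (1 - rho)); not confirmed by this page."""
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))


if __name__ == "__main__":
    # A complete graph on 62 nodes has 62 * 61 / 2 edges.
    assert 62 * 61 // 2 == 1891

    # Toy price series, not taken from the data set.
    a = [100.0, 101.0, 100.5, 102.0, 103.0]
    b = [50.0, 50.6, 50.2, 51.1, 51.4]
    rho = pearson(log_returns(a), log_returns(b))
    d = correlation_distance(rho)
    weight = 1.0 / d  # inverse of the distance; undefined when d == 0
    print(rho, d, weight)
```

With this choice of distance, perfectly correlated series are at distance 0 and perfectly anti-correlated series are at distance 2, so the inverse-distance weight is larger for more strongly correlated stocks.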

Financial stocks network (zip)