7.13. Calculating the clustering coefficient of a network

Figure 7.11. Graph


In this example, adapted from Niko Gamulins blog post on Neo4j for Social Network Analysis, the graph in question is showing the 2-hop relationships of a sample person as nodes with KNOWS relationships.

The clustering coefficient of a selected node is defined as the probability that two randomly selected neighbors are connected to each other. With the number of neighbors as n and the number of mutual connections between the neighbors r the calculation is:

The number of possible connections between two neighbors is n!/(2!(n-2)!) = 4!/(2!(4-2)!) = 24/4 = 6, where n is the number of neighbors n = 4 and the actual number r of connections is 1. Therefore the clustering coefficient of node 1 is 1/6.

n and r are quite simple to retrieve via the following query:

Query. 

START a = node(*)
MATCH (a)--(b)
WITH a, count(distinct b) as n
MATCH (a)--()-[r]-()--(a)
WHERE a.name! = "startnode"
RETURN n, count(distinct r) as r

This returns n and r for the above calculations.

Result

nr
1 row

4

1


Try this query live. (1) {"name":"startnode"} (2) {} (3) {} (4) {} (5) {} (6) {} (7) {} (1)-[:KNOWS]->(2) {} (1)-[:KNOWS]->(3) {} (1)-[:KNOWS]->(4) {} (1)-[:KNOWS]->(5) {} (2)-[:KNOWS]->(6) {} (2)-[:KNOWS]->(7) {} (3)-[:KNOWS]->(4) {} START a = node(*) MATCH (a)--(b) WITH a, count(distinct b) as n MATCH (a)--()-[r]-()--(a) WHERE a.name! = "startnode" RETURN n, count(distinct r) as r