16.7. Match


16.7.1. Introduction

Pattern matching is one of the pillars of Cypher. The pattern describes the shape of the data we are looking for. Cypher then tries to find that pattern in the graph; the results are called matching subgraphs.

The description of the pattern is made up of one or more paths, separated by commas. A path is a sequence of nodes and relationships that always starts and ends with a node. An example path would be: (a)-->(b)

Paths can be of arbitrary length, and the same node may appear in multiple places in the path. Node identifiers can be used with or without surrounding parentheses. The following two match clauses are semantically identical; the difference is purely aesthetic.

MATCH (a)-->(b)

and

MATCH a-->b

Patterns have bound points, or start points. They are the parts of the pattern that are already “bound” to a set of graph nodes or relationships. All parts of the pattern must be directly or indirectly connected to a start point — a pattern where parts of the pattern are not reachable from any start point will be rejected.
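
As a rough sketch of this rule, the first query below keeps every part of the pattern connected to the bound node me, while the second contains a path that cannot be reached from any start point and, per the rule above, should be rejected. The identifiers foaf, stranger and other are hypothetical names chosen for illustration.

```cypher
// Connected: friend and foaf (hypothetical identifiers) are both
// reachable from the bound node me.
START me=node(1)
MATCH me-->friend, friend-->foaf
RETURN friend, foaf
```

```cypher
// Disconnected: stranger and other (hypothetical identifiers) are not
// reachable from me, so this pattern is expected to be rejected.
START me=node(1)
MATCH me-->friend, stranger-->other
RETURN friend, other
```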

An optional relationship describes a part of the pattern that can evaluate to null if it cannot be matched in the graph. It is the equivalent of a SQL outer join: if Cypher finds one or more matches, they are returned; if no matches are found, Cypher returns null. Only relationships can be marked as optional, and this is done with a question mark.

Optional relationships of the pattern are used to answer queries like this:

START me=node(1)
MATCH me-->friend-[?:parent_of]->children
RETURN friend, children

The query above says “give me all my friends, and their children, if they have any.”

Optionality is transitive: if a part of the pattern can only be reached from a bound point through an optional relationship, that part is also optional. In the pattern above, the only bound point is me. Since the relationship between friend and children is optional, children is an optional part of the pattern.

Named paths that contain optional parts are also optional: if any part of the path is null, the whole path is null.

In these examples, b and p are optional and can be null:

START a=node(1)
MATCH p = a-[?]->b
RETURN b

START a=node(1)
MATCH p = a-[?*]->b
RETURN b

START a=node(1)
MATCH p = a-[?]->x-->b
RETURN b

START a=node(1), x=node(2)
MATCH p = shortestPath( a-[?*]->x )
RETURN p

As a simple example, let’s take the following query, executed on the graph pictured below.

Query

START me=node(1)
MATCH me-->friend-[?:parent_of]->children
RETURN friend, children

This returns a friend node and no children, since there are no such relationships in the graph.

Result

friend	children
1 row, 1 ms
`Node[3]{name->"Anders"}`	`<null>`

For the examples given in the sections below, the following graph is the base:

Graph

16.7.2. Related nodes

The symbol -- means related to, without regard to type or direction.

Query

START n=node(3)
MATCH (n)--(x)
RETURN x

All nodes related to A (Anders) are returned.

Result

x
3 rows, 0 ms
`Node[4]{name->"Bossman"}`
`Node[1]{name->"David"}`
`Node[5]{name->"Cesar"}`

16.7.3. Outgoing relationships

When the direction of a relationship is interesting, it is shown by using --> or <--, like this:

Query

START n=node(3)
MATCH (n)-->(x)
RETURN x

All nodes that A has outgoing relationships to.

Result

x
2 rows, 1 ms
`Node[4]{name->"Bossman"}`
`Node[5]{name->"Cesar"}`
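
The opposite direction can be sketched the same way. This variation is not one of the original examples; it uses <-- to ask for nodes that have relationships pointing at A.

```cypher
// Incoming direction: x is any node with a relationship pointing to n.
START n=node(3)
MATCH (n)<--(x)
RETURN x
```

Judging by the shortest path results later in this section, which include a KNOWS relationship from node 1 to node 3, David would be expected among the results.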

16.7.4. Directed relationships and identifier

If an identifier is needed, either for filtering on properties of the relationship, or to return the relationship, this is how you introduce the identifier.

Query

START n=node(3)
MATCH (n)-[r]->()
RETURN r

All outgoing relationships from node A.

Result

r
2 rows, 0 ms
`:KNOWS[0] {}`
`:BLOCKS[1] {}`
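
Once the relationship is bound to an identifier, it can be used in other clauses. As a minimal sketch, assuming the type() function is available in your Cypher version, the following variation filters on the relationship type in WHERE rather than in the pattern, and returns the type name alongside the end node.

```cypher
// r is bound in MATCH and then inspected in WHERE and RETURN.
// Assumes type(r) yields the relationship type name as a string.
START n=node(3)
MATCH (n)-[r]->(x)
WHERE type(r) = "KNOWS"
RETURN type(r), x
```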

16.7.5. Match by relationship type

When you know the relationship type you want to match on, you can specify it by using a colon.

Query

START n=node(3)
MATCH (n)-[:BLOCKS]->(x)
RETURN x

All nodes that are BLOCKed by A.

Result

x
1 row, 0 ms
`Node[5]{name->"Cesar"}`

16.7.6. Match by multiple relationship types

If multiple types are acceptable, you can specify this by chaining them with the pipe symbol (|).

Query

START n=node(3)
MATCH (n)-[:BLOCKS|KNOWS]->(x)
RETURN x

All nodes that A has an outgoing BLOCKS or KNOWS relationship to.

Result

x
2 rows, 0 ms
`Node[5]{name->"Cesar"}`
`Node[4]{name->"Bossman"}`

16.7.7. Match by relationship type and use an identifier

If you both want to introduce an identifier to hold the relationship, and specify the relationship type you want, just add them both, like this.

Query

START n=node(3)
MATCH (n)-[r:BLOCKS]->()
RETURN r

All BLOCKS relationships going out from A.

Result

r
1 row, 1 ms
`:BLOCKS[1] {}`

16.7.8. Relationship types with uncommon characters

Sometimes your database will have types with non-letter characters, or with spaces in them. Use backticks (`) to escape these.

Query

START n=node(3)
MATCH (n)-[r:`TYPE WITH SPACE IN IT`]->()
RETURN r

This returns a relationship of a type with spaces in it.

Result

r
1 row, 0 ms
`:TYPE WITH SPACE IN IT[6] {}`

16.7.9. Multiple relationships

Relationships can be expressed by using multiple statements in the form of ()--(), or they can be strung together, like this:

Query

START a=node(3)
MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c)
RETURN a,b,c

The three nodes in the path.

Result

a	b	c
1 row, 0 ms
`Node[3]{name->"Anders"}`	`Node[4]{name->"Bossman"}`	`Node[2]{name->"Emil"}`
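
For comparison, here is a sketch of the same pattern written as multiple comma-separated statements instead of one chained path. The shared identifier b is what ties the two parts together, so this form should match the same three nodes.

```cypher
// Same pattern as above, split into two paths joined on b.
START a=node(3)
MATCH (a)-[:KNOWS]->(b), (b)-[:KNOWS]->(c)
RETURN a,b,c
```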

16.7.10. Variable length relationships

Nodes that are a variable number of relationship→node hops away can be found using -[:TYPE*minHops..maxHops]->.

Query

START a=node(3), x=node(2, 4)
MATCH a-[:KNOWS*1..3]->x
RETURN a,x

Returns the start and end points if there is a path between them that is 1 to 3 relationships long.

Result

a	x
2 rows, 1 ms
`Node[3]{name->"Anders"}`	`Node[2]{name->"Emil"}`
`Node[3]{name->"Anders"}`	`Node[4]{name->"Bossman"}`
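
To see how many hops each match actually used, one option is to name the path and return its length. This is a sketch that combines the query above with the length() function used elsewhere in this section; the path identifier p is introduced here for illustration.

```cypher
// p names the variable length path, and length(p) counts its relationships.
START a=node(3), x=node(2, 4)
MATCH p = a-[:KNOWS*1..3]->x
RETURN x, length(p)
```

On the example graph this would be expected to report 2 for Emil and 1 for Bossman, in line with the relationship lists returned in section 16.7.11.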

16.7.11. Relationship identifier in variable length relationships

When the connection between two nodes is of variable length, a relationship identifier becomes an iterable of relationships.

Query

START a=node(3), x=node(2, 4)
MATCH a-[r:KNOWS*1..3]->x
RETURN r

Returns the relationships of each path that is 1 to 3 relationships long.

Result

r
2 rows, 1 ms
`[:KNOWS[0] {},:KNOWS[3] {}]`
`[:KNOWS[0] {}]`

16.7.12. Zero length paths

When using variable length paths that have the lower bound zero, it means that two identifiers can point to the same node. If the distance between two nodes is zero, they are, by definition, the same node.

Query

START a=node(3)
MATCH p1=a-[:KNOWS*0..1]->b, p2=b-[:BLOCKS*0..1]->c
RETURN a,b,c, length(p1), length(p2)

This query will return four paths, some of them with length zero.

Result

a	b	c	length(p1)	length(p2)
4 rows, 1 ms
`Node[3]{name->"Anders"}`	`Node[3]{name->"Anders"}`	`Node[3]{name->"Anders"}`	`0`	`0`
`Node[3]{name->"Anders"}`	`Node[3]{name->"Anders"}`	`Node[5]{name->"Cesar"}`	`0`	`1`
`Node[3]{name->"Anders"}`	`Node[4]{name->"Bossman"}`	`Node[4]{name->"Bossman"}`	`1`	`0`
`Node[3]{name->"Anders"}`	`Node[4]{name->"Bossman"}`	`Node[1]{name->"David"}`	`1`	`1`

16.7.13. Optional relationship

If a relationship is optional, it can be marked with a question mark. This is similar to how a SQL outer join works. If the relationship is there, it is returned. If it is not, null is returned in its place. Remember that anything hanging off an optional relationship is in turn optional, unless it is connected to a bound node through some other path.

Query

START a=node(2)
MATCH a-[?]->x
RETURN a,x

A node, and null, since the node has no outgoing relationships.

Result

a	x
1 row, 1 ms
`Node[2]{name->"Emil"}`	`<null>`
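
For contrast, here is a sketch of the same query with the question mark removed. Since the node has no outgoing relationships, the pattern cannot be matched at all, so this version would be expected to return no rows instead of a row containing null.

```cypher
// Non-optional version: with no outgoing relationships from a,
// nothing matches and no row is produced.
START a=node(2)
MATCH a-->x
RETURN a,x
```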

16.7.14. Optional typed and named relationship

Just as with a normal relationship, you can decide which identifier it goes into, and what relationship type you need.

Query

START a=node(3)
MATCH a-[r?:LOVES]->()
RETURN a,r

A node, and null, since the node has no outgoing LOVES relationships.

Result

a	r
1 row, 0 ms
`Node[3]{name->"Anders"}`	`<null>`

16.7.15. Properties on optional elements

Returning a property from an optional element that is null will also return null.

Query

START a=node(2)
MATCH a-[?]->x
RETURN x, x.name

The element x (null in this query), and null as its name.

Result

x	x.name
1 row, 1 ms
`<null>`	`<null>`

16.7.16. Complex matching

Using Cypher, you can also express more complex patterns to match on, like a diamond shape pattern.

Query

START a=node(3)
MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c), (a)-[:BLOCKS]-(d)-[:KNOWS]-(c)
RETURN a,b,c,d

The four nodes in the paths.

Result

a	b	c	d
1 row, 0 ms
`Node[3]{name->"Anders"}`	`Node[4]{name->"Bossman"}`	`Node[2]{name->"Emil"}`	`Node[5]{name->"Cesar"}`

16.7.17. Shortest path

Finding a single shortest path between two nodes is as easy as using the shortestPath function, like this:

Query

START d=node(1), e=node(2)
MATCH p = shortestPath( d-[*..15]->e )
RETURN p

This means: find a single shortest path between two nodes, as long as the path is at most 15 relationships long. Inside the parentheses you write a single link of a path: the starting node, the connecting relationship and the end node. Characteristics describing the relationship, like relationship type, max hops and direction, are all used when finding the shortest path. You can also mark the path as optional.

Result

p
1 row, 1 ms
`(1)--[KNOWS,2]-->(3)--[KNOWS,0]-->(4)--[KNOWS,3]-->(2)`
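
The two variations below sketch how those characteristics can be changed. The first restricts the search to KNOWS relationships. The second marks the path as optional, using the same question mark syntax shown in the introduction, so that p should be null rather than missing when no path is found.

```cypher
// Only follow KNOWS relationships when searching for the shortest path.
START d=node(1), e=node(2)
MATCH p = shortestPath( d-[:KNOWS*..15]->e )
RETURN p
```

```cypher
// Optional shortest path: p is expected to be null if no path
// of at most 15 relationships exists.
START d=node(1), e=node(2)
MATCH p = shortestPath( d-[?*..15]->e )
RETURN p
```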

16.7.18. All shortest paths

Finds all the shortest paths between two nodes.

Query

START d=node(1), e=node(2)
MATCH p = allShortestPaths( d-[*..15]->e )
RETURN p

This will find the two directed paths between David and Emil.

Result

p
2 rows, 0 ms
`(1)--[KNOWS,2]-->(3)--[KNOWS,0]-->(4)--[KNOWS,3]-->(2)`
`(1)--[KNOWS,2]-->(3)--[BLOCKS,1]-->(5)--[KNOWS,4]-->(2)`

16.7.19. Named path

If you want to return or filter on a path in your pattern graph, you can introduce a named path.

Query

START a=node(3)
MATCH p = a-->b
RETURN p

The two paths starting from the first node.

Result

p
2 rows, 1 ms
`[Node[3]{name->"Anders"},:KNOWS[0] {},Node[4]{name->"Bossman"}]`
`[Node[3]{name->"Anders"},:BLOCKS[1] {},Node[5]{name->"Cesar"}]`
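
The example above returns the path. As a sketch of filtering on it instead, the following variation matches paths of one or two relationships and keeps only those of length two, using the length() function from the zero length paths example.

```cypher
// Filter on the named path: keep only paths with exactly two relationships.
START a=node(3)
MATCH p = a-[*1..2]->b
WHERE length(p) = 2
RETURN p
```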

16.7.20. Matching on a bound relationship

When your pattern contains a bound relationship, and that relationship pattern does not specify direction, Cypher will try to match the relationship where the connected nodes switch sides.

Query

START a=node(3), b=node(2)
MATCH a-[?:KNOWS]-x-[?:KNOWS]-b
RETURN x

This returns the two connected nodes, once as the start node, and once as the end node.

Result

x
3 rows, 2 ms
`Node[4]{name->"Bossman"}`
`Node[1]{name->"David"}`
`Node[5]{name->"Cesar"}`