Pattern matching is one of the pillars of Cypher. A pattern describes the shape of the data that we are looking for. Cypher then tries to find that pattern in the graph; the results are called matching subgraphs.
The description of the pattern is made up of one or more paths, separated by commas. A path is a sequence of nodes and relationships that always starts and ends in a node. An example path would be:
(a)-->(b)
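Since a pattern can consist of more than one path, a minimal sketch of a two-path pattern might look like the following. It uses the same START clause as the other queries on this page; the node id 1 and the identifiers a, b and c are placeholders and do not refer to any particular graph.

```cypher
// Two paths separated by a comma, both anchored at the bound node a.
// Node id and identifiers are illustrative assumptions.
START a=node(1)
MATCH (a)-->(b), (a)-->(c)
RETURN b, c
```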
Paths can be of arbitrary length, and the same node may appear in multiple places in the path. Node identifiers can be used with or without surrounding parentheses. The following two match clauses are semantically identical; the difference is purely aesthetic.
MATCH (a)-->(b)
and
MATCH a-->b
Patterns have bound points, or start points. They are the parts of the pattern that are already “bound” to a set of graph nodes or relationships. All parts of the pattern must be directly or indirectly connected to a start point — a pattern where parts of the pattern are not reachable from any start point will be rejected.
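To illustrate the connectivity rule, the sketch below contrasts a pattern in which every part is reachable from the bound node with one that contains a disconnected path. The node id and all identifiers are placeholders, and the two queries are meant to be read separately.

```cypher
// Every part of the pattern is reachable from the start point a.
START a=node(1)
MATCH (a)-->(b)-->(c)
RETURN c

// Expected to be rejected under the rule above: the path (x)-->(y)
// is not reachable from any start point.
START a=node(1)
MATCH (a)-->(b), (x)-->(y)
RETURN y
```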
An optional relationship describes a part of the pattern that evaluates to null if it cannot be matched to the graph. It is the equivalent of an SQL outer join: if Cypher finds one or more matches, they are returned. If no matches are found, Cypher returns null. Only relationships can be marked as optional, and this is done with a question mark.
Optional relationships of the pattern are used to answer queries like this:
START me=node(1) MATCH me-->friend-[?:parent_of]->children RETURN friend, children
The query above says “give me all my friends, and their children, if they have any.”
Optionality is transitive: if a part of the pattern can only be reached from a bound point through an optional relationship, that part is also optional. In the pattern above, the only bound point is me. Since the relationship between friend and children is optional, children is an optional part of the graph.
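As a sketch of this transitivity, the query below extends the pattern by one more hop. The relationship type attends and the identifier school are hypothetical and serve only as an illustration.

```cypher
// school can only be reached from me through the optional parent_of
// relationship, so school is optional as well and may be null.
// The attends type and the school identifier are assumptions.
START me=node(1)
MATCH me-->friend-[?:parent_of]->children-[:attends]->school
RETURN friend, children, school
```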
Named paths that contain optional parts are also optional: if any part of the path is null, the whole path is null.
In these examples, b and p are optional and can be null:
START a=node(1) MATCH p = a-[?]->b RETURN b
START a=node(1) MATCH p = a-[?*]->b RETURN b
START a=node(1) MATCH p = a-[?]->x-->b RETURN b
START a=node(1), x=node(2) MATCH p = shortestPath( a-[?*]->x ) RETURN p
As a simple example, let’s take the following query, executed on the graph pictured below.
Query
START me=node(1) MATCH me-->friend-[?:parent_of]->children RETURN friend, children
This returns a friend node and no children, since there are no such relationships in the graph.
For the examples given in the sections below, the following graph is the base:
Graph
The symbol -- means related to, without regard to type or direction.
Query
START n=node(3) MATCH (n)--(x) RETURN x
All nodes related to A (Anders) are returned.
When the direction of a relationship is interesting, it is shown by using --> or <--, like this:
Query
START n=node(3) MATCH (n)-->(x) RETURN x
All nodes that A has outgoing relationships to.
If an identifier is needed, either for filtering on properties of the relationship, or to return the relationship, this is how you introduce the identifier.
Query
START n=node(3) MATCH (n)-[r]->() RETURN r
All outgoing relationships from node A.
When you know the relationship type you want to match on, you can specify it by using a colon.
Query
START n=node(3) MATCH (n)-[:BLOCKS]->(x) RETURN x
All nodes that are BLOCKed by A.
If multiple types are acceptable, you can specify this by chaining them with the pipe symbol |.
Query
START n=node(3) MATCH (n)-[:BLOCKS|KNOWS]->(x) RETURN x
All nodes that A has an outgoing BLOCKS or KNOWS relationship to.
If you both want to introduce an identifier to hold the relationship, and specify the relationship type you want, just add them both, like this.
Query
START n=node(3) MATCH (n)-[r:BLOCKS]->() RETURN r
All BLOCKS relationships going out from A.
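An identifier can be combined with several acceptable types as well. The following is a sketch only; it assumes the type function is used to report which relationship type was matched for each row.

```cypher
// Bind the relationship to r while accepting either type,
// then return the matched type next to the end node.
START n=node(3)
MATCH (n)-[r:BLOCKS|KNOWS]->(x)
RETURN type(r), x
```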
Sometimes your database will have types with non-letter characters, or with spaces in them. Use backticks (`) to escape these.
Query
START n=node(3) MATCH (n)-[r:`TYPE WITH SPACE IN IT`]->() RETURN r
This returns a relationship of a type with spaces in it.
Relationships can be expressed by using multiple statements in the form of ()--(), or they can be strung together, like this:
Query
START a=node(3) MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c) RETURN a,b,c
The three nodes in the path.
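The multiple-statement form mentioned above could be sketched as follows. It splits the same chain at b into two comma-separated paths and is intended to describe the same pattern as the strung-together query.

```cypher
// The chain (a)->(b)->(c) written as two separate statements
// that share the identifier b.
START a=node(3)
MATCH (a)-[:KNOWS]->(b), (b)-[:KNOWS]->(c)
RETURN a,b,c
```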
Nodes that are a variable number of relationship→node hops away can be found using -[:TYPE*minHops..maxHops]->.
Query
START a=node(3), x=node(2, 4) MATCH a-[:KNOWS*1..3]->x RETURN a,x
Returns the start and end nodes if there is a path of 1 to 3 relationships between them.
Result
a | x |
---|---|
2 rows, 1 ms | |
When the connection between two nodes is of variable length, a relationship identifier becomes an iterable of relationships.
Query
START a=node(3), x=node(2, 4) MATCH a-[r:KNOWS*1..3]->x RETURN r
Returns the relationships if there is a path of 1 to 3 relationships between the two nodes.
When a variable length path has a lower bound of zero, two identifiers can point to the same node. If the distance between two nodes is zero, they are, by definition, the same node.
Query
START a=node(3) MATCH p1=a-[:KNOWS*0..1]->b, p2=b-[:BLOCKS*0..1]->c RETURN a,b,c, length(p1), length(p2)
This query will return four paths, some of them with length zero.
Result
a | b | c | length(p1) | length(p2) |
---|---|---|---|---|
4 rows, 1 ms | ||||
If a relationship is optional, it can be marked with a question mark. This is similar to how an SQL outer join works: if the relationship is there, it is returned; if it is not, null is returned in its place. Remember that anything hanging off an optional relationship is in turn optional, unless it is connected to a bound node through some other path.
Query
START a=node(2) MATCH a-[?]->x RETURN a,x
A node, and null, since the node has no outgoing relationships.
Just as with a normal relationship, you can decide which identifier it goes into, and what relationship type you need.
Query
START a=node(3) MATCH a-[r?:LOVES]->() RETURN a,r
A node, and null, since the node has no outgoing LOVES relationships.
Returning a property from an optional element that is null will also return null.
Query
START a=node(2) MATCH a-[?]->x RETURN x, x.name
The element x (null in this query), and null as its name.
Using Cypher, you can also express more complex patterns to match on, like a diamond shape pattern.
Query
START a=node(3) MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c), (a)-[:BLOCKS]-(d)-[:KNOWS]-(c) RETURN a,b,c,d
The four nodes in the paths.
Result
a | b | c | d |
---|---|---|---|
1 row, 0 ms | |||
Finding a single shortest path between two nodes is as easy as using the shortestPath function, like this:
Query
START d=node(1), e=node(2) MATCH p = shortestPath( d-[*..15]->e ) RETURN p
This means: find a single shortest path between two nodes, as long as the path is at most 15 relationships long. Inside the parentheses you write a single link of a path: the starting node, the connecting relationship and the end node. Characteristics describing the relationship, such as relationship type, max hops and direction, are all used when finding the shortest path. You can also mark the path as optional.
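For instance, restricting the search to one relationship type, and marking the path as optional, might be sketched as follows. The KNOWS type is borrowed from the other examples on this page; whether such a path exists between these two nodes is not implied. The two queries are meant to be read separately.

```cypher
// Only follow KNOWS relationships, at most 15 hops.
START d=node(1), e=node(2)
MATCH p = shortestPath( d-[:KNOWS*..15]->e )
RETURN p

// Optional variant: p is expected to be null when no such path exists.
START d=node(1), e=node(2)
MATCH p = shortestPath( d-[?:KNOWS*..15]->e )
RETURN p
```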
Finds all the shortest paths between two nodes.
Query
START d=node(1), e=node(2) MATCH p = allShortestPaths( d-[*..15]->e ) RETURN p
This will find the two directed paths between David and Emil.
Result
p |
---|
2 rows, 0 ms |
If you want to return or filter on a path in your pattern graph, you can introduce a named path.
Query
START a=node(3) MATCH p = a-->b RETURN p
The two paths starting from the first node.
Result
p |
---|
2 rows, 1 ms |
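A named path can also be used for filtering. The sketch below reuses the length function seen in an earlier query on this page; the KNOWS type and the hop bounds are illustrative assumptions, and no particular result is implied.

```cypher
// Name the path p, keep only paths longer than one relationship,
// and return each path together with its length.
START a=node(3)
MATCH p = a-[:KNOWS*1..3]->b
WHERE length(p) > 1
RETURN p, length(p)
```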
When your pattern contains a bound relationship, and that relationship pattern doesn't specify direction, Cypher will try to match the relationship where the connected nodes switch sides.
Query
START a=node(3), b=node(2) MATCH a-[?:KNOWS]-x-[?:KNOWS]-b RETURN x
This returns the two connected nodes, once as the start node and once as the end node.
Copyright © 2012 Neo Technology