Chapter 16. Cypher Query Language

Table of Contents

16.1. Parameters
16.2. Identifiers
16.3. Start
16.4. Match
16.5. Where
16.6. Return
16.7. Aggregation
16.8. Order by
16.9. Skip
16.10. Limit
16.11. Functions
16.12. Cypher Cookbook

A new query language, code-named “Cypher”, has been added to Neo4j. It allows for expressive and efficient querying of the graph store without having to write traversers in code. Cypher is still growing and maturing, and that means that there probably will be breaking syntax changes. It also means that it has not undergone the same rigorous performance testing as the other components.

Cypher is designed to be a humane query language, suitable for both developers and (importantly, we think) operations professionals who want to make ad-hoc queries on the database. Its constructs are based on English prose and neat iconography, which helps to make it (somewhat) self-explanatory.

Cypher is inspired by a number of different approaches and builds upon established practices for expressive querying. Most of the keywords like WHERE and ORDER BY are inspired by SQL. Pattern matching borrows expression approaches from SPARQL. Regular expression matching is implemented using the Scala programming language.

Cypher is a declarative language. It focuses on the clarity of expressing what to retrieve from a graph, not how to do it, in contrast to imperative languages like Java and scripting languages like Gremlin (supported via the Gremlin plugin, see Section 18.14, “Gremlin Plugin”) and the JRuby Neo4j bindings. This makes query optimization an implementation detail that is not exposed to the user.

The query language consists of several distinct parts, such as START, MATCH, WHERE and RETURN.

Let’s see three of them (START, MATCH and RETURN) in action:

Imagine an example graph like the one in Figure 16.1:

Figure 16.1. Example Graph


For example, here is a query which finds a user called John in an index and then traverses the graph looking for friends of John's friends (though not his direct friends) before returning both John and any friends-of-friends that are found.

START john=node:node_auto_index(name = 'John')
MATCH john-[:friend]->()-[:friend]->fof
RETURN john, fof

Resulting in

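+-----------------------+------------------------+
| john                  | fof                    |
+-----------------------+------------------------+
| Node[4]{name->"John"} | Node[2]{name->"Maria"} |
| Node[4]{name->"John"} | Node[3]{name->"Steve"} |
+-----------------------+------------------------+
2 rows, 2 ms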
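To illustrate what the declarative query above spares you from writing, here is a minimal imperative sketch of the same friends-of-friends lookup in plain Java. It does not use the Neo4j API. The `FRIENDS` map is a hypothetical in-memory stand-in for the example graph, and its edges are an assumption chosen to be consistent with the results shown in this chapter, since the figure is not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FriendsOfFriends {
    // Hypothetical stand-in for the example graph: each name maps to the
    // targets of its outgoing "friend" relationships. The edges are assumed.
    static final Map<String, List<String>> FRIENDS = Map.of(
            "John", List.of("Sara", "Joe"),
            "Sara", List.of("Maria"),
            "Joe", List.of("Steve"));

    // Follows exactly two outgoing "friend" hops from the start node,
    // mirroring the pattern john-[:friend]->()-[:friend]->fof.
    static List<String> friendsOfFriends(String start) {
        List<String> result = new ArrayList<>();
        for (String friend : FRIENDS.getOrDefault(start, List.of())) {
            for (String fof : FRIENDS.getOrDefault(friend, List.of())) {
                result.add(fof);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(friendsOfFriends("John")); // prints [Maria, Steve]
    }
}
```

The nested loops spell out how to walk the graph, whereas the Cypher pattern only states the shape of the path to find.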

Next up we will add filtering to set all four parts in motion:

In this next example, we take a list of users (by node ID) and follow their outgoing friend relationships, returning only those followed users whose name property starts with S.

START user=node(5,4,1,2,3)
MATCH user-[:friend]->follower
WHERE follower.name =~ /S.*/
RETURN user, follower.name

Resulting in

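+-----------------------+---------------+
| user                  | follower.name |
+-----------------------+---------------+
| Node[5]{name->"Joe"}  | Steve         |
| Node[4]{name->"John"} | Sara          |
+-----------------------+---------------+
2 rows, 1 ms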
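The effect of the regular expression in the WHERE clause above can be sketched in plain Java with `java.util.regex`. This is an illustration only, not how Cypher evaluates the filter internally. It assumes the pattern must match the whole name, and the `NameFilter` class and its input list are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class NameFilter {
    // Same pattern as in the query: an "S" followed by any characters.
    static final Pattern STARTS_WITH_S = Pattern.compile("S.*");

    // Keeps only the names that the pattern matches in full (assumed semantics).
    static List<String> startingWithS(List<String> names) {
        return names.stream()
                .filter(name -> STARTS_WITH_S.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = List.of("Sara", "Maria", "Steve", "John", "Joe");
        System.out.println(startingWithS(names)); // prints [Sara, Steve]
    }
}
```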

In Java, using the query language looks something like this:

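// Create an execution engine for the graph database db and run a query.
ExecutionEngine engine = new ExecutionEngine( db );
ExecutionResult result = engine.execute( "start n=node(0) where 1=1 return n" );

// The result has a column named "n" that contains the node with ID 0.
assertThat( result.columns(), hasItem( "n" ) );
Iterator<Node> n_column = result.columnAs( "n" );
assertThat( asIterable( n_column ), hasItem( db.getNodeById( 0 ) ) );
// The textual dump of the result also mentions that node.
assertThat( result.toString(), containsString( "Node[0]" ) );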