19.5. Traversals

19.5.1. Basic traversals
19.5.2. Traversal results
19.5.3. Uniqueness
19.5.4. Ordering
19.5.5. Evaluators - advanced filtering

Traversing is a high-performance way to search your database. The traversal API used here is essentially the same as the one used in the Java API, with a few modifications.

Traversals start at a given node and uses a set of rules to move through the graph and to decide what parts of the graph to return.

You may also want to look at Chapter 15, Cypher Query Language, which is a high level query language for your graph.

19.5.1. Basic traversals

Following a relationship

The most basic traversals simply follow certain relationship types, and return everything they encounter. By default, each node is visited only once, so there is no risk of infinite loops.

traverser = db.traversal()\
    .relationships('related_to')\
    .traverse(start_node)

# The graph is traversed as
# you loop through the result.
for node in traverser.nodes:
    pass

Following a relationship in a specific direction

You can tell the traverser to only follow relationships in some specific direction.

from neo4j import OUTGOING, INCOMING, ANY

traverser = db.traversal()\
    .relationships('related_to', OUTGOING)\
    .traverse(start_node)

Following multiple relationship types

You can specify an arbitrary number of relationship types and directions to follow.

from neo4j import OUTGOING, INCOMING, ANY

traverser = db.traversal()\
    .relationships('related_to', INCOMING)\
    .relationships('likes')\
    .traverse(start_node)

19.5.2. Traversal results

A traversal can give you one of three different result types: nodes, relationships or paths.

Traversals are performed lazily, which means that the graph is traversed as you loop through the result.

traverser = db.traversal()\
    .relationships('related_to')\
    .traverse(start_node)

# Get each possible path
for path in traverser:
    pass

# Get each node
for node in traverser.nodes:
    pass

# Get each relationship
for relationship in traverser.relationships:
    pass

19.5.3. Uniqueness

To avoid infinite loops, it’s important to define what parts of the graph can be re-visited during a traversal. By default, uniqueness is set to NODE_GLOBAL, which means that each node is only visited once.

Here are the other options that are available.

from neo4j import Uniqueness

# Available options are:

Uniqueness.NONE
# Any position in the graph may be revisited.

Uniqueness.NODE_GLOBAL
# Default option
# No node in the entire graph may be visited
# more than once. This could potentially
# consume a lot of memory since it requires
# keeping an in-memory data structure
# remembering all the visited nodes.

Uniqueness.RELATIONSHIP_GLOBAL
# No relationship in the entire graph may be
# visited more than once. For the same
# reasons as NODE_GLOBAL uniqueness, this
# could use up a lot of memory. But since
# graphs typically have a larger number of
# relationships than nodes, the memory
# overhead of this uniqueness level could
# grow even quicker.

Uniqueness.NODE_PATH
# A node may not occur previously in the
# path reaching up to it.

Uniqueness.RELATIONSHIP_PATH
# A relationship may not occur previously in
# the path reaching up to it.

Uniqueness.NODE_RECENT
# Similar to NODE_GLOBAL uniqueness in that
# there is a global collection of visited
# nodes each position is checked against.
# This uniqueness level does however have a
# cap on how much memory it may consume in
# the form of a collection that only
# contains the most recently visited nodes.
# The size of this collection can be
# specified by providing a number as the
# second argument to the
# uniqueness()-method along with the
# uniqueness level.

Uniqueness.RELATIONSHIP_RECENT
# works like NODE_RECENT uniqueness, but
# with relationships instead of nodes.


traverser = db.traversal()\
    .uniqueness(Uniqueness.NODE_PATH)\
    .traverse(start_node)

19.5.4. Ordering

You can traverse either depth first, or breadth first. Depth first is the default, because it has lower memory overhead.

# Depth first traversal, this
# is the default.
traverser = db.traversal()\
    .depthFirst()\
    .traverse(self.source)

# Breadth first traversal
traverser = db.traversal()\
    .breadthFirst()\
    .traverse(start_node)

19.5.5. Evaluators - advanced filtering

In order to traverse based on other critera, such as node properties, or more complex things like neighboring nodes or patterns, we use evaluators. An evaluator is a normal Python method that takes a path as an argument, and returns a description of what to do next.

The path argument is the current position the traverser is at, and the description of what to do can be one of four things, as seen in the example below.

from neo4j import Evaluation

# Evaluation contains the four
# options that an evaluator can
# return. They are:

Evaluation.INCLUDE_AND_CONTINUE
# Include this node in the result and
# continue the traversal

Evaluation.INCLUDE_AND_PRUNE
# Include this node in the result, but don't
# continue the traversal

Evaluation.EXCLUDE_AND_CONTINUE
# Exclude this node from the result, but
# continue the traversal

Evaluation.EXCLUDE_AND_PRUNE
# Exclude this node from the result and
# don't continue the traversal

# An evaluator
def my_evaluator(path):
    # Filter on end node property
    if path.end['message'] == 'world':
        return Evaluation.INCLUDE_AND_CONTINUE

    # Filter on last relationship type
    if path.last_relationship.type.name() == 'related_to':
        return Evaluation.INCLUDE_AND_PRUNE

    # You can do even more complex things here, like subtraversals.

    return Evaluation.EXCLUDE_AND_CONTINUE

# Use the evaluator
traverser = db.traversal()\
    .evaluator(my_evaluator)\
    .traverse(start_node)