User-defined procedures

This section describes how to write, test, and deploy a user-defined procedure for Neo4j.

A user-defined procedure is a mechanism that enables you to extend Neo4j by writing customized code, which can be invoked directly from Cypher. Procedures can take arguments, perform operations on the database, and return results. For a comparison between user-defined procedures, functions, and aggregation functions, see Neo4j customized code.

Call a procedure

To call a user-defined procedure, use a Cypher CALL clause. The procedure name must be fully qualified, so a procedure named findDenseNodes defined in the package org.neo4j.examples could be called using:

CALL org.neo4j.examples.findDenseNodes(1000)

A CALL may be the only clause within a Cypher statement or may be combined with other clauses. Arguments can be supplied directly within the query or taken from the associated parameter set. For full details, see the documentation in Cypher Manual → CALL procedure.
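As a rough illustration of how the fully qualified name in the example above is put together, the following plain-Java sketch joins a package name and a method name and builds the corresponding CALL statement. It is not Neo4j's own name resolution code: the class ProcedureNames and the helpers qualifiedName and callStatement are hypothetical names introduced only for this illustration.

```java
public class ProcedureNames
{
    // Hypothetical helper: joins the package of the procedure class and the
    // name of the procedure method, as described for findDenseNodes above.
    public static String qualifiedName( String packageName, String methodName )
    {
        return packageName.isEmpty() ? methodName : packageName + "." + methodName;
    }

    // Hypothetical helper: builds a CALL statement with one numeric argument.
    public static String callStatement( String packageName, String methodName, long argument )
    {
        return "CALL " + qualifiedName( packageName, methodName ) + "(" + argument + ")";
    }

    public static void main( String[] args )
    {
        // Prints the statement shown in the example above
        System.out.println( callStatement( "org.neo4j.examples", "findDenseNodes", 1000 ) );
    }
}
```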

Create a procedure

Make sure you have read and followed the preparatory setup instructions in Setting up a plugin project.

The example discussed below is available as a repository on GitHub. To get started quickly you can fork the repository and work with the code as you follow along in the guide below.

  1. Create integration tests.

  2. Define the procedure.

Integration tests

The test dependencies include Neo4j Harness and JUnit. These can be used to write integration tests for procedures.

First, decide what the procedure should do, then write a test that proves it does so correctly. Finally, write a procedure that passes the test.

Below is a template for testing a procedure that accesses Neo4j’s full-text indexes from Cypher.

package example;

import org.junit.Rule;
import org.junit.Test;
import org.neo4j.driver.v1.*;
import org.neo4j.graphdb.factory.GraphDatabaseSettings;
import org.neo4j.harness.junit.Neo4jRule;

import static org.hamcrest.core.IsEqual.equalTo;
import static org.junit.Assert.assertThat;
import static org.neo4j.driver.v1.Values.parameters;

public class ManualFullTextIndexTest
{
    // This rule starts a Neo4j instance
    @Rule
    public Neo4jRule neo4j = new Neo4jRule()

            // This is the Procedure to test
            .withProcedure( FullTextIndex.class );

    @Test
    public void shouldAllowIndexingAndFindingANode() throws Throwable
    {
        // In a try-block, to make sure you close the driver after the test
        try( Driver driver = GraphDatabase.driver( neo4j.boltURI() , Config.build().withoutEncryption().toConfig() ) )
        {

            // Given I've started Neo4j with the FullTextIndex procedure class
            //       which my 'neo4j' rule above does.
            Session session = driver.session();

            // And given I have a node in the database
            long nodeId = session.run( "CREATE (p:User {name:'Brookreson'}) RETURN id(p)" )
                    .single()
                    .get( 0 ).asLong();

            // When I use the index procedure to index a node
            session.run( "CALL example.index($id, ['name'])", parameters( "id", nodeId ) );

            // Then I can search for that node with lucene query syntax
            StatementResult result = session.run( "CALL example.search('User', 'name:Brook*')" );
            assertThat( result.single().get( "nodeId" ).asLong(), equalTo( nodeId ) );
        }
    }
}

Define a procedure

With the test in place, write a procedure that fulfills the expectations of the test. The full example is available in the Neo4j Procedure Template repository.

Particular things to note:

  • All procedures are annotated @Procedure.

  • The procedure annotation can take three optional arguments: name, mode, and eager.

    • name specifies a name for the procedure other than the default generated one, which is class.path.nameOfMethod. If mode is specified then name must be specified as well.

    • mode is used to declare the types of interactions that the procedure will perform. A procedure will fail if it attempts to execute database operations that violate its mode. The default mode is READ. The following modes are available:

      • READ This procedure will only perform read operations against the graph.

      • WRITE This procedure will perform read and write operations against the graph.

      • SCHEMA This procedure will perform operations against the schema, i.e. create and drop indexes and constraints. A procedure with this mode is able to read graph data, but not write.

      • DBMS This procedure will perform system operations such as user management and query management. A procedure with this mode is not able to read or write graph data.

    • eager is a boolean setting defaulting to false. If it is set to true then the Cypher planner will plan an extra eager operation before calling the procedure. This is useful in cases where the procedure makes changes to the database in a way that could interact with the operations preceding the procedure. For example:

      MATCH (n)
      WHERE n.key = 'value'
      WITH n
      CALL deleteNeighbours(n, 'FOLLOWS')

      The procedure in this query could delete some of the nodes that the MATCH clause would otherwise match, and the n.key lookup on those nodes would then fail. Marking this procedure as eager will prevent this from causing an error in Cypher code. However, it is still possible for the procedure to interfere with itself by trying to read entities it has previously deleted. It is the responsibility of the procedure author to handle that case.

  • Each resource that the procedure wants to use, which together make up the context of the procedure, is declared as a field annotated @Context.
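The effect of the eager setting described in the list above can be pictured with a plain-Java analogy. This is only a sketch and does not use or reproduce the Cypher planner: the toy graph, the class EagerAnalogy, and the methods deleteNeighbour, runLazily, and runEagerly are all hypothetical. In the non-eager variant each row is matched and handed to the stand-in procedure before the next row is read, so a later property lookup can hit a deleted node. In the eager variant all rows are matched first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EagerAnalogy
{
    // Hypothetical candidate node ids scanned by the "query"
    private static final int[] CANDIDATE_IDS = {1, 2, 3};

    // Toy graph: node id -> value of the 'key' property
    public static Map<Integer,String> newGraph()
    {
        Map<Integer,String> graph = new LinkedHashMap<>();
        for ( int id : CANDIDATE_IDS )
        {
            graph.put( id, "value" );
        }
        return graph;
    }

    // Stand-in for the procedure: deletes the node with the next id
    static void deleteNeighbour( Map<Integer,String> graph, int id )
    {
        graph.remove( id + 1 );
    }

    // Non-eager: match one row, call the procedure, then read the next row
    public static List<Integer> runLazily( Map<Integer,String> graph )
    {
        List<Integer> processed = new ArrayList<>();
        for ( int id : CANDIDATE_IDS )
        {
            String key = graph.get( id );
            if ( key == null )
            {
                throw new IllegalStateException( "node " + id + " was deleted before its key was read" );
            }
            if ( key.equals( "value" ) )
            {
                deleteNeighbour( graph, id );
                processed.add( id );
            }
        }
        return processed;
    }

    // Eager: match every row first, then call the procedure for each match
    public static List<Integer> runEagerly( Map<Integer,String> graph )
    {
        List<Integer> matched = new ArrayList<>();
        for ( int id : CANDIDATE_IDS )
        {
            if ( "value".equals( graph.get( id ) ) )
            {
                matched.add( id );
            }
        }
        for ( int id : matched )
        {
            // The procedure may still receive a node it deleted earlier
            deleteNeighbour( graph, id );
        }
        return matched;
    }

    public static void main( String[] args )
    {
        System.out.println( "eager:  " + runEagerly( newGraph() ) );
        try
        {
            System.out.println( "lazy:   " + runLazily( newGraph() ) );
        }
        catch ( IllegalStateException e )
        {
            System.out.println( "lazy failed: " + e.getMessage() );
        }
    }
}
```

In the eager variant the stand-in procedure is still called for node 2 after node 2 has been deleted, which mirrors the remaining case that the procedure author has to handle.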

The correct way to signal an error from within a procedure is to throw a RuntimeException.

package example;

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Stream;

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.index.IndexManager;
import org.neo4j.logging.Log;
import org.neo4j.procedure.Context;
import org.neo4j.procedure.Name;
import org.neo4j.procedure.PerformsWrites;
import org.neo4j.procedure.Procedure;

import static org.neo4j.helpers.collection.MapUtil.stringMap;
import static org.neo4j.procedure.Procedure.Mode.SCHEMA;
import static org.neo4j.procedure.Procedure.Mode.WRITE;

/**
 * This is an example showing how you could expose Neo4j's full-text indexes as
 * two procedures - one for updating indexes, and one for querying by label and
 * the lucene query language.
 */
public class FullTextIndex
{
    // Only static fields and @Context-annotated fields are allowed in
    // Procedure classes. This static field is the configuration we use
    // to create full-text indexes.
    private static final Map<String,String> FULL_TEXT =
            stringMap( IndexManager.PROVIDER, "lucene", "type", "fulltext" );

    // This field declares that we need a GraphDatabaseService
    // as context when any procedure in this class is invoked
    @Context
    public GraphDatabaseService db;

    // This gives us a log instance that outputs messages to the
    // standard log, `neo4j.log`
    @Context
    public Log log;

    /**
     * This declares the first of two procedures in this class - a
     * procedure that performs queries in a manual index.
     *
     * It returns a Stream of Records, where records are
     * specified per procedure. This particular procedure returns
     * a stream of {@link SearchHit} records.
     *
     * The arguments to this procedure are annotated with the
     * {@link Name} annotation and define the position, name
     * and type of arguments required to invoke this procedure.
     * There is a limited set of types you can use for arguments,
     * these are as follows:
     *
     * <ul>
     *     <li>{@link String}</li>
     *     <li>{@link Long} or {@code long}</li>
     *     <li>{@link Double} or {@code double}</li>
     *     <li>{@link Number}</li>
     *     <li>{@link Boolean} or {@code boolean}</li>
     *     <li>{@link org.neo4j.graphdb.Node}</li>
     *     <li>{@link org.neo4j.graphdb.Relationship}</li>
     *     <li>{@link org.neo4j.graphdb.Path}</li>
     *     <li>{@link java.util.Map} with key {@link String} and value of any type in this list, including {@link java.util.Map}</li>
     *     <li>{@link java.util.List} of elements of any valid field type, including {@link java.util.List}</li>
     *     <li>{@link Object}, meaning any of the types above</li>
     * </ul>
     *
     * @param label the label name to query by
     * @param query the lucene query, for instance `name:Brook*` to
     *              search by property `name` and find any value starting
     *              with `Brook`. Please refer to the Lucene Query Parser
     *              documentation for full available syntax.
     * @return the nodes found by the query
     */
    @Procedure( name = "example.search", mode = WRITE )
    public Stream<SearchHit> search( @Name("label") String label,
                                     @Name("query") String query )
    {
        String index = indexName( label );

        // Avoid creating the index, if it is not there we will not be
        // finding anything anyway!
        if( !db.index().existsForNodes( index ))
        {
            // Just to show how you would do logging
            log.debug( "Skipping index query since index does not exist: `%s`", index );
            return Stream.empty();
        }

        // If there is an index, do a lookup and convert the result
        // to our output record.
        return db.index()
                .forNodes( index )
                .query( query )
                .stream()
                .map( SearchHit::new );
    }

    /**
     * This is the second procedure defined in this class, it is used to update the
     * index with nodes that should be queryable. You can send the same node multiple
     * times, if it already exists in the index the index will be updated to match
     * the current state of the node.
     *
     * This procedure works largely the same as {@link #search(String, String)},
     * with three notable differences. One, it is annotated with `mode = SCHEMA`,
     * which is <i>required</i> if you want to perform updates to indexes in your
     * procedure.
     *
     * Two, it returns {@code void} rather than a stream. This is a short-hand
     * for saying our procedure always returns an empty stream of empty records.
     *
     * Three, it uses a default value for the property list, in this way you can call
     * the procedure by invoking {@code CALL example.index(nodeId)}. Default values
     * are provided as the Cypher string representation of the given type, e.g.
     * {@code {default: true}}, {@code null}, or {@code -1}.
     *
     * @param nodeId the id of the node to index
     * @param propKeys a list of property keys to index, only the ones the node
     *                 actually contains will be added
     */
    @Procedure( name = "example.index", mode = SCHEMA )
    public void index( @Name("nodeId") long nodeId,
                       @Name(value = "properties", defaultValue = "[]") List<String> propKeys )
    {
        Node node = db.getNodeById( nodeId );

        // Load all properties for the node once and in bulk,
        // the resulting set will only contain those properties in `propKeys`
        // that the node actually contains.
        Set<Map.Entry<String,Object>> properties =
                node.getProperties( propKeys.toArray( new String[0] ) ).entrySet();

        // Index every label (this is just as an example, we could filter which labels to index)
        for ( Label label : node.getLabels() )
        {
            Index<Node> index = db.index().forNodes( indexName( label.name() ), FULL_TEXT );

            // In case the node is indexed before, remove all occurrences of it so
            // we do not get old or duplicated data
            index.remove( node );

            // And then index all the properties
            for ( Map.Entry<String,Object> property : properties )
            {
                index.add( node, property.getKey(), property.getValue() );
            }
        }
    }


    /**
     * This is the output record for our search procedure. All procedures
     * that return results return them as a Stream of Records, where the
     * records are defined like this one - customized to fit what the procedure
     * is returning.
     *
     * The fields must be one of the following types:
     *
     * <ul>
     *     <li>{@link String}</li>
     *     <li>{@link Long} or {@code long}</li>
     *     <li>{@link Double} or {@code double}</li>
     *     <li>{@link Number}</li>
     *     <li>{@link Boolean} or {@code boolean}</li>
     *     <li>{@link org.neo4j.graphdb.Node}</li>
     *     <li>{@link org.neo4j.graphdb.Relationship}</li>
     *     <li>{@link org.neo4j.graphdb.Path}</li>
     *     <li>{@link java.util.Map} with key {@link String} and value {@link Object}</li>
     *     <li>{@link java.util.List} of elements of any valid field type, including {@link java.util.List}</li>
     *     <li>{@link Object}, meaning any of the valid field types</li>
     * </ul>
     */
    public static class SearchHit
    {
        // This record contains a single field named 'nodeId'
        public long nodeId;

        public SearchHit( Node node )
        {
            this.nodeId = node.getId();
        }
    }

    private String indexName( String label )
    {
        return "label-" + label;
    }
}
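The output-record pattern used by SearchHit can also be looked at in isolation. The following is a minimal sketch in plain Java, with the Neo4j annotations and types left out so that it runs on its own. The class OutputRecordSketch, the record NameHit, and the method search are hypothetical names, and a list of strings stands in for the index. As with nodeId in the test above, each public field of the record corresponds to one named column in the result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class OutputRecordSketch
{
    // Hypothetical output record: one public field per result column
    public static class NameHit
    {
        public String name;
        public long length;

        public NameHit( String name )
        {
            this.name = name;
            this.length = name.length();
        }
    }

    // Stand-in for a procedure body: filters the source items and maps
    // each remaining item to an output record, returning them as a Stream
    public static Stream<NameHit> search( List<String> names, String prefix )
    {
        return names.stream()
                .filter( name -> name.startsWith( prefix ) )
                .map( NameHit::new );
    }

    public static void main( String[] args )
    {
        List<String> names = Arrays.asList( "Brookreson", "Brook", "Smith" );
        search( names, "Brook" )
                .forEach( hit -> System.out.println( hit.name + " " + hit.length ) );
    }
}
```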

Injectable resources

When writing procedures, some resources can be injected into the procedure from the database. To inject these, use the @Context annotation. The classes that can be injected are:

  • Log

  • TerminationGuard

  • GraphDatabaseService

All of the above classes are considered safe and future-proof, and will not compromise the security of the database. There are also several classes that can be injected that are unsupported (restricted) and can be changed with little or no notice. Procedures written to use these restricted APIs will not be loaded by default, and it will be necessary to use the dbms.security.procedures.unrestricted configuration setting to load unsafe procedures. Read more about this config setting in Operations Manual → Securing extensions.