23.4. High Availability setup tutorial

23.4.1. Background
23.4.2. Setup and start the Coordinator cluster
23.4.3. Start the Neo4j Servers in HA mode
23.4.4. Start Neo4j Embedded in HA mode

This is a guide to set up a Neo4j HA cluster and run embedded Neo4j or Neo4j Server instances participating as cluster nodes.

23.4.1. Background

The members of the HA cluster (see Chapter 23, High Availability) use a Coordinator cluster to manage themselves and coordinate lifecycle activity like electing a master. When running an Neo4j HA cluster, a Coordinator cluster is used for cluster collaboration and must be installed and configured before working with the Neo4j database HA instances.

[Tip]Tip

Neo4j Server (see Chapter 18, Neo4j Server) and Neo4j Embedded (see Section 22.1, “Introduction”) can both be used as nodes in the same HA cluster. This opens for scenarios where one application can insert and update data via a Java or JVM language based application, and other instances can run Neo4j Server and expose the data via the REST API (Chapter 19, REST API).

Below, there will be 3 coordinator instances set up on one local machine.

Download and unpack Neoj4 Enterprise

Download and unpack three installations of Neo4j Enterprise (called $NEO4J_HOME1, $NEO4J_HOME2, $NEO4J_HOME3) from the Neo4j download site.

23.4.2. Setup and start the Coordinator cluster

Now, in the NEO4J_HOME1/conf/coord.cfg file, adjust the coordinator clientPort and let the coordinator search for other coordinator cluster members at the localhost port ranges:

#$NEO4J_HOME1/conf/coord.cfg

server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

clientPort=2181

The other two config files in $NEO4J_HOME2 and $NEO4J_HOME3 will have a different clienttPort set but the other parameters identical to the first one:

#$NEO4J_HOME2/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2182
#$NEO4J_HOME2/conf/coord.cfg
...
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
...
clientPort=2183

Next we need to create a file in each data directory called "myid" that contains an id for each server equal to the number in server.1, server.2 and server.3 from the configuration files.

neo4j_home1$ echo '1' > data/coordinator/myid
neo4j_home2$ echo '2' > data/coordinator/myid
neo4j_home3$ echo '3' > data/coordinator/myid

We are now ready to start the Coordinator instances:

neo4j_home1$ ./bin/neo4j-coordinator start
neo4j_home2$ ./bin/neo4j-coordinator start
neo4j_home3$ ./bin/neo4j-coordinator start

23.4.3. Start the Neo4j Servers in HA mode

In your conf/neo4j.properties file, enable HA by setting the necessary parameters for all 3 installations, adjusting the ha.server_id for all instances:

#$NEO4J_HOME1/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
#$NEO4J_HOME2/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6002

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183
#$NEO4J_HOME3/conf/neo4j.properties
#unique server id for this graph database
#can not be negative id and must be unique
ha.server_id = 3

#ip and port for this instance to bind to
ha.server = localhost:6003

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

To avoid port clashes when starting the servers, adjust the ports for the REST endpoints in all instances under conf/neo4j-server.properties and enable HA mode:

#$NEO4J_HOME1/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
#$NEO4J_HOME2/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA
#$NEO4J_HOME3/conf/neo4j-server.properties
...
# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7476
...
# Allowed values:
# HA - High Availability
# SINGLE - Single mode, default.
# To run in High Availability mode, configure the coord.cfg file, and the
# neo4j.properties config file, then uncomment this line:
org.neo4j.server.database.mode=HA

To avoid JMX port clashes adjust the assigned ports for all instances under conf/neo4j-wrapper.properties:

#$NEO4J_HOME1/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3637
...
#$NEO4J_HOME2/conf/neo4j-wrapper.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3638
...
#$NEO4J_HOME3/conf/neo4j-server.properties
...
# Remote JMX monitoring, adjust the following lines if needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
wrapper.java.additional.4=-Dcom.sun.management.jmxremote.port=3639
...

Now, start all three server instances.

neo4j_home1$ ./bin/neo4j start
neo4j_home2$ ./bin/neo4j start
neo4j_home3$ ./bin/neo4j start

Now, you should be able to access the 3 servers (the first one being elected as master since it was started first) at http://localhost:7474/webadmin/#/info/org.neo4j/High%20Availability/, http://localhost:7475/webadmin/#/info/org.neo4j/High%20Availability/ and http://localhost:7476/webadmin/#/info/org.neo4j/High%20Availability/ and check the status of the HA configuration. Alternatively, the REST API is exposing JMX, so you can check the HA JMX bean with e.g.

curl -H "Content-Type:application/json" -d '["org.neo4j:*"]' http://localhost:7474/db/manage/server/jmx/query

And find in the response

"description" : "Information about all instances in this cluster",
    "name" : "InstancesInCluster",
    "value" : [ {
      "description" : "org.neo4j.management.InstanceInfo",
      "value" : [ {
        "description" : "address",
        "name" : "address"
      }, {
        "description" : "instanceId",
        "name" : "instanceId"
      }, {
        "description" : "lastCommittedTransactionId",
        "name" : "lastCommittedTransactionId",
        "value" : 1
      }, {
        "description" : "machineId",
        "name" : "machineId",
        "value" : 1
      }, {
        "description" : "master",
        "name" : "master",
        "value" : true
      } ],
      "type" : "org.neo4j.management.InstanceInfo"
    }

23.4.4. Start Neo4j Embedded in HA mode

If you are using Maven and Neo4j Embedded, simply add the following dependency to your project:

<dependency>
   <groupId>org.neo4j</groupId>
   <artifactId>neo4j-ha</artifactId>
   <version>${neo4j-version}</version>
</dependency>

Where ${neo4j-version} is the Neo4j version used.

If you prefer to download the jar files manually, they are included in the Neo4j distribution.

The difference in code when using Neo4j-HA is the creation of the graph database service.

GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );

The configuration can contain the standard configuration parameters (provided as part of the config above or in neo4j.properties but will also have to contain:

#HA instance1
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 1

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1331

First we need to create a database that can be used for replication. This is easiest done by just starting a normal embedded graph database, pointing out a path and shutdown.

Map<String,String> config = HighlyAvailableGraphDatabase.loadConfigurations( configFile );
GraphDatabaseService db = new HighlyAvailableGraphDatabase( path, config );

We created a config file with machine id=1 and enabled remote shell. The main method will expect the path to the db as first parameter and the configuration file as the second parameter.

It should now be possible to connect to the instance using Chapter 28, Neo4j Shell:

neo4j_home1$ ./bin/neo4j-shell -port 1331
NOTE: Remote Neo4j graph database service 'shell' at port 1331
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently master
Connected slaves:

Since it is the first instance to join the cluster it is elected master. Starting another instance would require a second configuration and another path to the db.

#HA instance2
#unique machine id for this graph database
#can not be negative id and must be unique
ha.server_id = 2

#ip and port for this instance to bind to
ha.server = localhost:6001

#connection information to the coordinator cluster client ports
ha.coordinators = localhost:2181,localhost:2182,localhost:2183

enable_remote_shell = port=1332

Now start the shell connecting to port 1332:

neo4j_home1$ ./bin/neo4j-shell -port 1332
NOTE: Remote Neo4j graph database service 'shell' at port 1332
Welcome to the Neo4j Shell! Enter 'help' for a list of commands

neo4j-sh (0)$ hainfo
I'm currently slave