26.6. High Availability setup tutorial

26.6.1. Background
26.6.2. Getting started: Setting up a production cluster
26.6.3. Alternative setup: Creating a local cluster for testing

This guide will help you understand the ways to configure and deploy a Neo4j High Availability cluster. Two scenarios will be considered: The first will be configuring 3 instances to be deployed on 3 separate machines, in a setting similar to what might be encountered in a production environment. The second will modify the former to make it possible to run a cluster of 3 instances on the same physical machine, a setup most useful during development of applications on top of Neo4j.

26.6.1. Background

A Neo4j HA cluster consists of a set of running Neo4j Enterprise instances. Each instance must be assigned an integer ID, which serves as its identifier in the cluster.

At startup, a Neo4j instance contacts the other instances specified in the ha.initial_hosts configuration option. When it establishes a connection to any one of these, it determines the current state of the cluster and ensures that it is eligible to join the cluster. To be eligible the Neo4j instance must be hosting the same database store as other members of the cluster (although it is allowed to be in an older state), or be a new deployment without a database store.

First, let’s explain what settings exist and what values they accept.

ha.server_id

ha.server_id is the cluster identification for each instance. It must be a positive integer, unique among all Neo4j instances in the cluster.

ha.cluster_server

ha.cluster_server is an address/port setting that specifies where the Neo4j instance will listen for cluster communications. This is related with the ha.initial_hosts setting. The default is 0.0.0.0:5001, which is suitable for most deployments.

ha.server

ha.server is an additional address/port setting that specifies where the Neo4j instance will listen for content. It must be different from ha.cluster_server, typically with a different port. The default is 0.0.0.0:6361, which is suitable for most deployments.

ha.initial_hosts

ha.initial_hosts is a comma separated list of hostname/port pairs, which specify how to reach other Neo4j instances in the cluster (as configured via their ha.cluster_server option). These hostname/ports will be used when the Neo4j instances starts, to allow it up to find and join the cluster. Note the specifying the instances own address is permitted.

[Tip]Address/port format

The options ha.cluster_server and ha.server are specified as <IP address>:<port>. The IP address MUST be the address assigned to one of the servers network interfaces, or the value 0.0.0.0, which will cause Neo4j to listen on every network interface.

Either the address or the port can be omitted, in which case the default for that part will be used. If the address is omitted, then the port must be preceeded with a colon (eg. :5001).

The port can also be configured as a range, like so: <hostname>:<first port>[-<second port>]. In this case, Neo4j will test each port in sequence, and select the first that is unused. Note that this usage is not permitted when the hostname is specified as 0.0.0.0 (the "all interfaces" address).

26.6.2. Getting started: Setting up a production cluster

Download and unpack Neo4j Enterprise

Download Neo4j Enterprise from the Neo4j download site, and unpack on 3 separate machines.

Configure HA related settings for each installation

The following settings should be configured for each Neo4j installation. Note that all 3 installations have the same configuration, except for the ha.server_id property.

Neo4j instance #1 — neo4j-01.local

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 1

# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties

# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Neo4j instance #2 — neo4j-02.local

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 2

# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties

# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Neo4j instance #3 — neo4j-03.local

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 3

# List of other known instances in this cluster
ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001
# Alternatively, use IP addresses:
#ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001

conf/neo4j-server.properties

# Let the webserver only listen on the specified IP.
org.neo4j.server.webserver.address=0.0.0.0

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Start the Neo4j Servers

Start the Neo4j servers as normal. Note the startup order does not matter.

neo4j-01$ ./bin/neo4j start
neo4j-02$ ./bin/neo4j start
neo4j-03$ ./bin/neo4j start
[Tip]Startup Time

When running in HA mode, the startup script returns immediately instead of waiting for the server to become available. This is because the instance does not accept any requests until a cluster has been formed. In the example above this happens when you startup the second instance. To keep track of the startup state you can follow the messages in console.log - the path to that is printed before the startup script returns.

Now, you should be able to access the 3 servers and check their HA status:

http://neo4j-01.local:7474/webadmin/#/info/org.neo4j/High%20Availability/

http://neo4j-02.local:7474/webadmin/#/info/org.neo4j/High%20Availability/

http://neo4j-03.local:7474/webadmin/#/info/org.neo4j/High%20Availability/

[Tip]Tip

You can replace database #3 with an arbiter instance, see Arbiter Instances.

That’s it! You now have a Neo4j HA cluster of 3 instances running. You can start by making a change on any instance and those changes will be propagated between them. For more HA related configuration options take a look at HA Configuration.

26.6.3. Alternative setup: Creating a local cluster for testing

If you want to start a cluster similar to the one described above, but for development and testing purposes, it is convenient to run all Neo4j instances on the same machine. This is easy to achieve, although it requires some additional configuration as the defaults will conflict with each other.

Download and unpack Neo4j Enterprise

Download Neo4j Enterprise from the Neo4j download site, and unpack into 3 separate directories on your test machine.

Configure HA related settings for each installation

The following settings should be configured for each Neo4j installation.

Neo4j instance #1 — ~/neo4j-01

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 1

# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6363
online_backup_server = 127.0.0.1:6366

# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5001

# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003

conf/neo4j-server.properties

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7474

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7484

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Neo4j instance #2 — ~/neo4j-02

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 2

# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6364
online_backup_server = 127.0.0.1:6367

# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5002

# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003

conf/neo4j-server.properties

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7475

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7485

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Neo4j instance #3 — ~/neo4j-03

conf/neo4j.properties:

# Unique server id for this Neo4j instance
# can not be negative id and must be unique
ha.server_id = 3

# IP and port for this instance to bind to for communicating data with the
# other neo4j instances in the cluster.
ha.server = 127.0.0.1:6365
online_backup_server = 127.0.0.1:6368

# IP and port for this instance to bind to for communicating cluster information
# with the other neo4j instances in the cluster.
ha.cluster_server = 127.0.0.1:5003

# List of other known instances in this cluster
ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003

conf/neo4j-server.properties

# http port (for all data, administrative, and UI access)
org.neo4j.server.webserver.port=7476

# https port (for all data, administrative, and UI access)
org.neo4j.server.webserver.https.port=7486

# HA - High Availability
# SINGLE - Single mode, default.
org.neo4j.server.database.mode=HA

Start the Neo4j Servers

Start the Neo4j servers as normal. Note the startup order does not matter.

localhost:~/neo4j-01$ ./bin/neo4j start
localhost:~/neo4j-02$ ./bin/neo4j start
localhost:~/neo4j-03$ ./bin/neo4j start

Now, you should be able to access the 3 servers and check their HA status:

http://127.0.0.1:7474/webadmin/#/info/org.neo4j/High%20Availability/

http://127.0.0.1:7475/webadmin/#/info/org.neo4j/High%20Availability/

http://127.0.0.1:7476/webadmin/#/info/org.neo4j/High%20Availability/