This guide will help you understand how to configure and deploy a Neo4j High Availability cluster. Two scenarios will be considered:
Each instance in a Neo4j HA cluster must be assigned an integer ID, which serves as its unique identifier. At startup, a Neo4j
instance contacts the other instances specified in the ha.initial_hosts
configuration option.
When an instance establishes a connection to any other, it determines the current state of the cluster and ensures that it is eligible to join. To be eligible the Neo4j instance must host the same database store as other members of the cluster (although it is allowed to be in an older state), or be a new deployment without a database store.
Explicitly configure IP Addresses/Hostnames for a cluster | |
---|---|
Neo4j will attempt to configure IP addresses for itself in the absence of explicit configuration. However in typical operational environments where machines have multiple network cards and support IPv4 and IPv6 it is strongly recommended that the operator explicitly sets the IP address/hostname configuration for each machine in the cluster. |
Let’s examine the available settings and the values they accept.
ha.server_id
is the cluster identifier for each instance. It must be a positive integer and must be unique among
all Neo4j instances in the cluster.
For example, ha.server_id=1
.
ha.cluster_server
is an address/port setting that specifies where the Neo4j instance will listen for cluster
communications (like hearbeat messages). The default port is 5001
. In the absence of a specified IP address, Neo4j
will attempt to find a valid interface for binding. While this behavior typically results in a well-behaved server, it
is strongly recommended that users explicitly choose an IP address bound to the network interface of their choosing
to ensure a coherent cluster deployment.
For example, ha.cluster_server=192.168.33.22:5001
will listen for cluster communications on the network interface
bound to the 192.168.33.0 subnet on port 5001.
ha.initial_hosts
is a comma separated list of address/port pairs, which specify how to reach other Neo4j instances
in the cluster (as configured via their ha.cluster_server
option). These hostname/ports will be used when the Neo4j
instances starts, to allow it up to find and join the cluster. Specifying an instance’s own address is permitted.
Warning | |
---|---|
Do not use any whitespace in this configuration option. |
For example, ha.initial_hosts=192.168.33.22:5001,192.168.33.21:5001
will attempt to reach Neo4j instances listening on
192.168.33.22 on port 5001 and 192.168.33.21 on port 5001 on the 192.168.33.0 subnet.
ha.server
is an address/port setting that specifies where the Neo4j instance will listen for transactions
(changes to the graph data) from the cluster master. The default port is 6001
. In the absence of a specified IP address, Neo4j will attempt
to find a valid interface for binding. While this behavior typically results in a well-behaved server, it is strongly recommended that
users explicitly choose an IP address bound to the network interface of their choosing to ensure a coherent cluster topology.
ha.server
must user a different port to ha.cluster_server
.
For example, ha.server=192.168.33.22:6001
will listen for cluster communications on the network interface
bound to the 192.168.33.0 subnet on port 6001.
Address/port format | |
---|---|
The For For Either the address or the port can be omitted, in which case the default for that part will be used. If the address
is omitted, then the port must be preceded with a colon (eg. The syntax for setting the port range is: |
Download Neo4j Enterprise from the Neo4j download site, and unpack on 3 separate machines.
The following settings should be configured for each Neo4j installation.
Note that all 3 installations have the same configuration, except for the ha.server_id
property.
Neo4j instance #1 — neo4j-01.local
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 1 # List of other known instances in this cluster ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001 # Alternatively, use IP addresses: #ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001
conf/neo4j-server.properties
# Let the webserver only listen on the specified IP. org.neo4j.server.webserver.address=0.0.0.0 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Neo4j instance #2 — neo4j-02.local
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 2 # List of other known instances in this cluster ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001 # Alternatively, use IP addresses: #ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001
conf/neo4j-server.properties
# Let the webserver only listen on the specified IP. org.neo4j.server.webserver.address=0.0.0.0 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Neo4j instance #3 — neo4j-03.local
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 3 # List of other known instances in this cluster ha.initial_hosts = neo4j-01.local:5001,neo4j-02.local:5001,neo4j-03.local:5001 # Alternatively, use IP addresses: #ha.initial_hosts = 192.168.0.20:5001,192.168.0.21:5001,192.168.0.22:5001
conf/neo4j-server.properties
# Let the webserver only listen on the specified IP. org.neo4j.server.webserver.address=0.0.0.0 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Start the Neo4j servers as normal. Note the startup order does not matter.
neo4j-01$ ./bin/neo4j start
neo4j-02$ ./bin/neo4j start
neo4j-03$ ./bin/neo4j start
Startup Time | |
---|---|
When running in HA mode, the startup script returns immediately instead of waiting for the server to become available. This is because the instance does not accept any requests until a cluster has been formed. In the example above this happens when you startup the second instance. To keep track of the startup state you can follow the messages in console.log - the path to that is printed before the startup script returns. |
Now, you should be able to access the 3 servers and check their HA status:
http://neo4j-01.local:7474/webadmin/#/info/org.neo4j/High%20Availability/
http://neo4j-02.local:7474/webadmin/#/info/org.neo4j/High%20Availability/
http://neo4j-03.local:7474/webadmin/#/info/org.neo4j/High%20Availability/
Tip | |
---|---|
You can replace database #3 with an arbiter instance, see Arbiter Instances. |
That’s it! You now have a Neo4j HA cluster of 3 instances running. You can start by making a change on any instance and those changes will be propagated between them. For more HA related configuration options take a look at HA Configuration.
If you want to start a cluster similar to the one described above, but for development and testing purposes, it is convenient to run all Neo4j instances on the same machine. This is easy to achieve, although it requires some additional configuration as the defaults will conflict with each other.
Download Neo4j Enterprise from the Neo4j download site, and unpack into 3 separate directories on your test machine.
The following settings should be configured for each Neo4j installation.
Neo4j instance #1 — ~/neo4j-01
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 1 # IP and port for this instance to bind to for communicating data with the # other neo4j instances in the cluster. ha.server = 127.0.0.1:6363 online_backup_server = 127.0.0.1:6366 # IP and port for this instance to bind to for communicating cluster information # with the other neo4j instances in the cluster. ha.cluster_server = 127.0.0.1:5001 # List of other known instances in this cluster ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
conf/neo4j-server.properties
# http port (for all data, administrative, and UI access) org.neo4j.server.webserver.port=7474 # https port (for all data, administrative, and UI access) org.neo4j.server.webserver.https.port=7484 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Neo4j instance #2 — ~/neo4j-02
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 2 # IP and port for this instance to bind to for communicating data with the # other neo4j instances in the cluster. ha.server = 127.0.0.1:6364 online_backup_server = 127.0.0.1:6367 # IP and port for this instance to bind to for communicating cluster information # with the other neo4j instances in the cluster. ha.cluster_server = 127.0.0.1:5002 # List of other known instances in this cluster ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
conf/neo4j-server.properties
# http port (for all data, administrative, and UI access) org.neo4j.server.webserver.port=7475 # https port (for all data, administrative, and UI access) org.neo4j.server.webserver.https.port=7485 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Neo4j instance #3 — ~/neo4j-03
conf/neo4j.properties
:
# Unique server id for this Neo4j instance # can not be negative id and must be unique ha.server_id = 3 # IP and port for this instance to bind to for communicating data with the # other neo4j instances in the cluster. ha.server = 127.0.0.1:6365 online_backup_server = 127.0.0.1:6368 # IP and port for this instance to bind to for communicating cluster information # with the other neo4j instances in the cluster. ha.cluster_server = 127.0.0.1:5003 # List of other known instances in this cluster ha.initial_hosts = 127.0.0.1:5001,127.0.0.1:5002,127.0.0.1:5003
conf/neo4j-server.properties
# http port (for all data, administrative, and UI access) org.neo4j.server.webserver.port=7476 # https port (for all data, administrative, and UI access) org.neo4j.server.webserver.https.port=7486 # HA - High Availability # SINGLE - Single mode, default. org.neo4j.server.database.mode=HA
Start the Neo4j servers as normal. Note the startup order does not matter.
localhost:~/neo4j-01$ ./bin/neo4j start
localhost:~/neo4j-02$ ./bin/neo4j start
localhost:~/neo4j-03$ ./bin/neo4j start
Now, you should be able to access the 3 servers and check their HA status:
http://127.0.0.1:7474/webadmin/#/info/org.neo4j/High%20Availability/
http://127.0.0.1:7475/webadmin/#/info/org.neo4j/High%20Availability/
http://127.0.0.1:7476/webadmin/#/info/org.neo4j/High%20Availability/
Copyright © 2014 Neo Technology