2.1. Caches in Neo4j

Prev		Next

2.1.1. File buffer cache
2.1.2. Object cache

Neo4j utilizes two different types of caches. A file buffer cache and an object cache. The file buffer cache caches the storage file data in the same format as it is stored on the durable storage media. The object cache caches the nodes, relationships and properties in a format that is optimized for high traversal speeds and transactional mutation.

2.1.1. File buffer cache

The file buffer cache caches the Neo4j data in the same format as it is represented on the durable storage media. The purpose of this cache layer is to improve both read and write performance. The file buffer cache improves write performance by writing to the cache and deferring durable write until the logical log is rotated. This behavior is safe since all transactions are always durably written to the logical log, which can be used to recover the store files in the event of a crash.

Since the operation of the cache is tightly related to the data it stores, a short description of the Neo4j durable representation format is necessary background. Neo4j stores data in multiple files and relies on the underlying file system to handle this efficiently. Each Neo4j storage file contains uniform fixed size records of a particular type:

Store file	Record size	Contents
nodestore	`9 B`	Nodes
relstore	`33 B`	Relationships
propstore	`25 B`	Properties for nodes and relationships
stringstore	`133 B`	Values of string properties
arraystore	`133 B`	Values of array properties

For strings and arrays, where data can be of variable length, data is stored in one or more 120B chunks, with 13B record overhead. The sizes of these blocks can actually be configured when the store is created using the string_block_size and array_block_size parameters. The size of each record type can also be used to calculate the storage requirements of a Neo4j graph or the appropriate cache size for each file buffer cache.

Neo4j uses multiple file buffer caches, one for each different storage file. Each file buffer cache divides its storage file into a number of equally sized windows. Each cache window contains an even number of storage records. The cache holds the most active cache windows in memory and tracks hit vs. miss ratio for the windows. When the hit ratio of an uncached window gets higher than the miss ratio of a cached window, the cached window gets evicted and the previously uncached window is cached instead.

Configuration

Parameter	Possible values	Effect
`use_memory_mapped_buffers`	`true` or `false`	If set to `true` Neo4j will use the operating systems memory mapping functionality for the file buffer cache windows. If set to `false` Neo4j will use its own buffer implementation. In this case the buffers will reside in the JVM heap which needs to be increased accordingly. The default value for this parameter is `true`, except on Windows.
`neostore.nodestore.db.mapped_memory`	The maximum amount of memory to use for memory mapped buffers for this file buffer cache. The default unit is `MiB`, for other units use any of the following suffixes: `B`, `k`, `M` or `G`.	The maximum amount of memory to use for the file buffer cache of the node storage file.
`neostore.relationshipstore.db.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the relationship store file.
`neostore.propertystore.db.index.keys.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the something-something file.
`neostore.propertystore.db.index.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the something-something file.
`neostore.propertystore.db.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the property storage file.
`neostore.propertystore.db.strings.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the string property storage file.
`neostore.propertystore.db.arrays.mapped_memory`		The maximum amount of memory to use for the file buffer cache of the array property storage file.
`string_block_size`	The number of bytes per block.	Specifies the block size for storing strings. This parameter is only honored when the store is created, otherwise it is ignored. Note that each character in a string occupies two bytes, meaning that a block size of 120 (the default size) will hold a 60 character long string before overflowing into a second block. Also note that each block carries an overhead of 13 bytes. This means that if the block size is 120, the size of the stored records will be 133 bytes.
`array_block_size`	The number of bytes per block.	Specifies the block size for storing arrays. This parameter is only honored when the store is created, otherwise it is ignored. The default block size is 120 bytes, and the overhead of each block is the same as for string blocks, i.e., 13 bytes.
`dump_configuration`	`true` or `false`	If set to `true` the current configuration settings will be written to the default system output, mostly the console or the logfiles.

When memory mapped buffers are used (use_memory_mapped_buffers = true) the heap size of the JVM must be smaller than the total available memory of the computer, minus the total amount of memory used for the buffers. When heap buffers are used (use_memory_mapped_buffers = false) the heap size of the JVM must be large enough to contain all the buffers, plus the runtime heap memory requirements of the application and the object cache.

When reading the configuration parameters on startup Neo4j will automatically configure the parameters that are not specified. The cache sizes will be configured based on the available memory on the computer, how much is used by the JVM heap, and how large the storage files are.

2.1.2. Object cache

The object cache caches individual nodes and relationships and their properties in a form that is optimized for fast traversal of the graph. The content of this cache are objects with a representation geared towards supporting the Neo4j object API and graph traversals. Reading from this cache is 5 to 10 times faster than reading from the file buffer cache. This cache is contained in the heap of the JVM and the size is adapted to the current amount of available heap memory.

Nodes and relationships are added to the object cache as soon as they are accessed. The cached objects are however populated lazily. The properties for a node or relationship are not loaded until properties are accessed for that node or relationship. String (and array) properties are not loaded until that particular property is accessed. The relationships for a particular node is also not loaded until the relationships are accessed for that node. Eviction from the cache happens in an LRU manner when the memory is needed.

Configuration

The main configuration parameter for the object cache is the cache_type parameter. This specifies which cache implementation to use for the object cache. The available cache types are:

`cache_type`	Description
`none`	Do not use a high level cache. No objects will be cached.
`soft`	Provides optimal utilization of the available memory. Suitable for high performance traversal. May run into GC issues under high load if the frequently accessed parts of the graph does not fit in the cache. This is the default cache implementation.
`weak`	Provides short life span for cached objects. Suitable for high throughput applications where a larger portion of the graph than what can fit into memory is frequently accessed.
`strong`	This cache will cache all data in the entire graph. It will never release memory held by the cache. Provides optimal performance if your graph is small enough to fit in memory.

Heap memory usage

This table can be used to calculate how much memory the data in the object cache will occupy on a 64bit JVM:

Object	Size	Comment
Node	`344 B`	Size for each node (not counting its relationships or properties).
	`48 B`	Object overhead.
	`136 B`	Property storage (ArrayMap `48B`, HashMap `88B`).
	`136 B`	Relationship storage (ArrayMap `48B`, HashMap `88B`).
	`24 B`	Location of first / next set of relationships.
Relationship	`208 B`	Size for each relationship (not counting its properties).
	`48 B`	Object overhead.
	`136 B`	Property storage (ArrayMap `48B`, HashMap `88B`).
Property	`116 B`	Size for each property of a node or relationship.
	`32 B`	Data element - allows for transactional modification and keeps track of on disk location.
	`48 B`	Entry in the hash table where it is stored.
	`12 B`	Space used in hash table, accounts for normal fill ratio.
	`24 B`	Property key index.
Relationships	`108 B`	Size for each relationship type for a node that has a relationship of that type.
	`48 B`	Collection of the relationships of this type.
	`48 B`	Entry in the hash table where it is stored.
	`12 B`	Space used in hash table, accounts for normal fill ratio.
Relationships	`8 B`	Space used by each relationship related to a particular node (both incoming and outgoing).
Primitive	`24 B`	Size of a primitive property value.
String	`64+B`	Size of a string property value. `64 + 2len(string) B` (64 bytes, plus two bytes for each character in the string).*