15.2. Data Integrity

To keep data consistent, there need to be mechanisms and structures that guarantee the integrity of all stored data. In Neo4j, data integrity is maintained for the core graph engine as well as for any other data sources combined with it, as described below.

Core Graph Engine

In Neo4j, the whole data model is stored as a graph on disk and persisted as part of every committed transaction. In the storage layer, relationships, nodes, and properties have direct pointers to each other. Because the records reference each other directly, integrity is maintained without duplicating data between the different backend store files.
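As a rough illustration of records that point at each other directly, the following sketch models three in-memory "stores" whose records hold the ids of related records. It is a simplified model and not Neo4j's actual on-disk layout: the names NodeRecord, RelRecord, PropRecord, read_props, and outgoing are hypothetical, and each node keeps only a single chain of outgoing relationships.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical, simplified records. Each field ending in an id is a
# direct pointer (a list index) into one of the stores below.
@dataclass
class NodeRecord:
    first_rel: Optional[int] = None    # first relationship in this node's chain
    first_prop: Optional[int] = None   # first property in this node's chain


@dataclass
class RelRecord:
    start_node: int
    end_node: int
    next_rel: Optional[int] = None     # next relationship of the start node
    first_prop: Optional[int] = None


@dataclass
class PropRecord:
    key: str
    value: Any
    next_prop: Optional[int] = None    # next property in the same chain


# Three separate stores; no value is copied from one store into another.
props = [
    PropRecord("name", "Alice"),
    PropRecord("name", "Bob"),
    PropRecord("since", 2010),
]
rels = [RelRecord(start_node=0, end_node=1, first_prop=2)]
nodes = [NodeRecord(first_rel=0, first_prop=0), NodeRecord(first_prop=1)]


def read_props(first_prop):
    """Follow the property chain starting at first_prop and collect it."""
    result = {}
    prop_id = first_prop
    while prop_id is not None:
        record = props[prop_id]
        result[record.key] = record.value
        prop_id = record.next_prop
    return result


def outgoing(node_id):
    """Follow a node's relationship chain; return (end node id, properties)."""
    found = []
    rel_id = nodes[node_id].first_rel
    while rel_id is not None:
        rel = rels[rel_id]
        found.append((rel.end_node, read_props(rel.first_prop)))
        rel_id = rel.next_rel
    return found


alice_props = read_props(nodes[0].first_prop)
alice_out = outgoing(0)
```

In this model a property value lives in exactly one record and is reached by following ids, so there is no second copy that could drift out of sync.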

Different Data Sources

In a number of scenarios, the core graph engine is combined with other systems to achieve optimal performance for non-graph lookups. For example, Apache Lucene is frequently used as an additional index system for text queries that would otherwise be very processing-intensive in the graph layer.

To keep these external systems synchronized with the graph, Neo4j provides full two-phase commit transaction management, with rollback support across all data sources. A failed index insertion into Lucene can therefore be transparently rolled back in all data sources, so that the data stays consistent.
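The general idea of a two-phase commit across several data sources can be sketched as follows. This is a minimal, generic illustration and not Neo4j's transaction manager: the Participant class, its fail_on_prepare flag, and the two_phase_commit function are hypothetical names introduced for the example.

```python
class Participant:
    """Hypothetical data source taking part in a two-phase commit."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulates a failing source
        self.committed = {}   # state visible after a successful commit
        self.pending = None   # changes staged by prepare()

    def prepare(self, changes):
        # Phase 1: stage the changes and vote. Raising means "no".
        if self.fail_on_prepare:
            raise RuntimeError(f"{self.name}: prepare failed")
        self.pending = dict(changes)

    def commit(self):
        # Phase 2: make the staged changes visible.
        self.committed.update(self.pending)
        self.pending = None

    def rollback(self):
        # Discard staged changes; committed state is left untouched.
        self.pending = None


def two_phase_commit(work):
    """work is a list of (participant, changes). Returns True if committed."""
    prepared = []
    try:
        for participant, changes in work:
            participant.prepare(changes)
            prepared.append(participant)
    except RuntimeError:
        # One source voted "no": roll back every source that had prepared.
        for participant in prepared:
            participant.rollback()
        return False
    # Every source voted "yes": commit all of them.
    for participant, _ in work:
        participant.commit()
    return True


graph = Participant("graph")
broken_index = Participant("index", fail_on_prepare=True)
failed = two_phase_commit([
    (graph, {"node:1": "Alice"}),
    (broken_index, {"name:Alice": 1}),
])

working_index = Participant("index")
succeeded = two_phase_commit([
    (graph, {"node:1": "Alice"}),
    (working_index, {"name:Alice": 1}),
])
```

In the first run the index refuses to prepare, so the change already staged in the graph source is discarded and neither source changes. In the second run both sources prepare successfully and both commit.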