Lucene supports smart indexing of numbers, querying for ranges and sorting such results, and so does its backend for Neo4j. To mark a value so that it is indexed as a numeric value, we can make use of the ValueContext class, like this:
movies.add( theMatrix, "year-numeric", new ValueContext( 1999L ).indexNumeric() );
movies.add( theMatrixReloaded, "year-numeric", new ValueContext( 2003L ).indexNumeric() );

// Query for range
long startYear = 1997;
long endYear = 2001;
hits = movies.query( NumericRangeQuery.newLongRange( "year-numeric", startYear, endYear, true, true ) );
Note: Values that are indexed numerically must be queried using NumericRangeQuery.
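To see why a year is better indexed as a number than as text, consider the following self-contained sketch. It does not use the Neo4j or Lucene APIs: the `NumericRangeSketch` class and its in-memory `TreeMap` are hypothetical stand-ins that mimic an inclusive range lookup and contrast it with comparing the same values as strings. It says nothing about how Lucene encodes numbers internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a numeric index; not part of Neo4j or Lucene.
class NumericRangeSketch {
    // Maps an indexed numeric value to the titles stored under it.
    private final NavigableMap<Long, List<String>> byYear = new TreeMap<>();

    void add(String title, long year) {
        byYear.computeIfAbsent(year, y -> new ArrayList<>()).add(title);
    }

    // Inclusive on both ends, like passing true, true to a range query.
    List<String> range(long min, long max) {
        List<String> hits = new ArrayList<>();
        for (List<String> titles : byYear.subMap(min, true, max, true).values()) {
            hits.addAll(titles);
        }
        return hits;
    }

    // Compares two numbers by their text form, as a plain string ordering would.
    static boolean lessThanAsText(long a, long b) {
        return Long.toString(a).compareTo(Long.toString(b)) < 0;
    }

    public static void main(String[] args) {
        NumericRangeSketch movies = new NumericRangeSketch();
        movies.add("The Matrix", 1999L);
        movies.add("The Matrix Reloaded", 2003L);
        // Only values between 1997 and 2001 (inclusive) are returned.
        System.out.println(movies.range(1997L, 2001L));
        // As text, "999" does not sort before "1999", although 999 < 1999.
        System.out.println(lessThanAsText(999L, 1999L));
    }
}
```

The second print statement shows the mismatch: text ordering compares character by character, so it disagrees with numeric ordering as soon as the values differ in length.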
Lucene performs sorting very well, and that is also exposed in the index backend, through the QueryContext class:
hits = movies.query( "title", new QueryContext( "*" ).sort( "title" ) );
for ( Node hit : hits )
{
    // all movies with a title in the index, ordered by title
}
// or
hits = movies.query( new QueryContext( "title:*" ).sort( "year", "title" ) );
for ( Node hit : hits )
{
    // all movies with a title in the index, ordered by year, then title
}
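The precedence of several sort keys can be pictured with a plain Java comparator. This is only a sketch of the resulting order: the `SortOrderSketch` and `Movie` classes and the sample data are hypothetical, and the way the real index compares the values of each key is not modeled here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of multi-key ordering; it does not call the index API.
class SortOrderSketch {
    static final class Movie {
        final String title;
        final int year;

        Movie(String title, int year) {
            this.title = title;
            this.year = year;
        }
    }

    // Orders by year first and uses the title only to break ties.
    static List<String> titlesByYearThenTitle(List<Movie> movies) {
        List<Movie> sorted = new ArrayList<>(movies);
        sorted.sort(Comparator.comparingInt((Movie m) -> m.year)
                .thenComparing(m -> m.title));
        List<String> titles = new ArrayList<>();
        for (Movie m : sorted) {
            titles.add(m.title);
        }
        return titles;
    }

    public static void main(String[] args) {
        List<Movie> movies = new ArrayList<>();
        movies.add(new Movie("The Matrix Revolutions", 2003));
        movies.add(new Movie("The Matrix", 1999));
        movies.add(new Movie("The Matrix Reloaded", 2003));
        System.out.println(titlesByYearThenTitle(movies));
    }
}
```

The first key decides the order wherever it differs; later keys only matter among hits that tie on all earlier keys.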
We sort the results by relevance (score) like this:
hits = movies.query( "title", new QueryContext( "The*" ).sortByScore() );
for ( Node movie : hits )
{
    // hits sorted by relevance (score)
}
Instead of passing in queries written in Lucene query syntax, you can instantiate such queries programmatically and pass them in as arguments, for example:
// a TermQuery will give exact matches
Node actor = actors.query( new TermQuery( new Term( "name", "Keanu Reeves" ) ) ).getSingle();
Note that the TermQuery is basically the same thing as using the get method on the index.
This is how to perform wildcard searches using Lucene Query Objects:
hits = movies.query( new WildcardQuery( new Term( "title", "The Matrix*" ) ) );
for ( Node movie : hits )
{
    System.out.println( movie.getProperty( "title" ) );
}
Note that this allows for whitespace in the search string.
Lucene supports querying for multiple terms in the same query, like so:
hits = movies.query( "title:*Matrix* AND year:1999" );
Caution: Compound queries can't search across committed index entries and entries that have not yet been committed at the same time.
The default operator (that is, whether AND or OR is used between different terms) in a query is OR. Changing that behavior is also done via the QueryContext class:
QueryContext query = new QueryContext( "title:*Matrix* year:1999" ).defaultOperator( Operator.AND );
hits = movies.query( query );
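The difference between the two operators can be pictured as set operations on the hits of each term. The sketch below is not Lucene code: the `DefaultOperatorSketch` class, its own small `Operator` enum, and the sample titles are hypothetical and only model the outcome, with OR as a union and AND as an intersection.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of how two term clauses combine; not the Lucene implementation.
class DefaultOperatorSketch {
    // Local enum for illustration only; unrelated to Lucene's own Operator type.
    enum Operator { AND, OR }

    // Combines the hits of two terms: AND keeps common hits, OR keeps all hits.
    static Set<String> combine(Set<String> left, Set<String> right, Operator op) {
        Set<String> result = new TreeSet<>(left);
        if (op == Operator.AND) {
            result.retainAll(right);
        } else {
            result.addAll(right);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample hits for each term of the query.
        Set<String> titleHits = new TreeSet<>(Arrays.asList("The Matrix", "The Matrix Reloaded"));
        Set<String> yearHits = new TreeSet<>(Arrays.asList("The Matrix", "Some other 1999 movie"));
        System.out.println(combine(titleHits, yearHits, Operator.OR));
        System.out.println(combine(titleHits, yearHits, Operator.AND));
    }
}
```

With OR as the default, a query with two terms can only grow the result; switching the default to AND narrows it to hits that satisfy every term.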
If your index lookups become a performance bottleneck, caching can be enabled for certain keys in certain indices (key locations) to speed up get requests. The caching is implemented with an LRU cache so that only the most recently accessed results are cached (with "results" meaning the query result of a get request, not a single entity). You can control the size of the cache (the maximum number of results) per index key.
Index<Node> index = graphDb.index().forNodes( "actors" );
( (LuceneIndex<Node>) index ).setCacheCapacity( "name", 300000 );
Caution: This setting is not persisted when the database is shut down. Set the value again after each startup of the database if you want to keep it.
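The eviction behavior of an LRU cache with a fixed capacity can be sketched with the standard `LinkedHashMap`. This is an illustration of the general technique, not the index's actual cache: the `LruResultCacheSketch` class is hypothetical, and it stores whole result lists per looked-up value to mirror the meaning of "results" above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LRU cache of get results for one index key; not Neo4j's implementation.
class LruResultCacheSketch<V> {
    private final Map<String, List<V>> cache;

    LruResultCacheSketch(final int capacity) {
        // accessOrder = true moves an entry to the most-recent end on each get or put.
        this.cache = new LinkedHashMap<String, List<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<V>> eldest) {
                // Evict the least recently used result once the capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    void put(String value, List<V> result) {
        cache.put(value, result);
    }

    // Returns the cached result, or null if the value is not cached.
    List<V> get(String value) {
        return cache.get(value);
    }

    // Checks for presence without affecting the access order.
    boolean contains(String value) {
        return cache.containsKey(value);
    }

    public static void main(String[] args) {
        LruResultCacheSketch<String> cache = new LruResultCacheSketch<>(2);
        cache.put("Keanu Reeves", Arrays.asList("node-1"));
        cache.put("Carrie-Anne Moss", Arrays.asList("node-2"));
        cache.get("Keanu Reeves");                        // refreshes this entry
        cache.put("Hugo Weaving", Arrays.asList("node-3")); // exceeds the capacity
        System.out.println(cache.contains("Carrie-Anne Moss"));
    }
}
```

Because the capacity counts cached results rather than entities, a single cached get that returns many entities still occupies one slot in this model.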
Copyright © 2011 Neo Technology