Class AnalyzerProvider

java.lang.Object
org.neo4j.graphdb.schema.AnalyzerProvider
All Implemented Interfaces:
NamedService

public abstract class AnalyzerProvider extends Object implements NamedService
This is the base class for all service-loadable factory classes that build the Lucene Analyzer instances available to the fulltext schema index. The analyzer factory is referenced in the index configuration via the analyzerName and alternativeNames that are passed to the constructor of this base class. Sub-classes must have a public no-arg constructor so that they can be service-loaded.

Here is an example that implements an analyzer provider for the SwedishAnalyzer that comes built into Lucene:


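 import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.analysis.sv.SwedishAnalyzer;
 import org.neo4j.graphdb.schema.AnalyzerProvider;

 public class Swedish extends AnalyzerProvider
 {
     public Swedish()
     {
         // The name by which the index configuration refers to this analyzer.
         super( "swedish" );
     }

     @Override
     public Analyzer createAnalyzer()
     {
         return new SwedishAnalyzer();
     }
 }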
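
The no-arg constructor requirement and the lookup by name can be illustrated with a minimal, dependency-free sketch. Everything in it is a hypothetical stand-in: NamedProvider plays the role of AnalyzerProvider, create() returns a plain Object instead of a Lucene Analyzer, and the reflective instantiate() helper only approximates what a service loader does. It is not how Neo4j itself loads providers.

```java
import java.util.List;
import java.util.Optional;

final class ProviderLookupSketch
{
    // Hypothetical stand-in for AnalyzerProvider: it only carries a name.
    abstract static class NamedProvider
    {
        private final String name;

        NamedProvider( String name )
        {
            this.name = name;
        }

        String getName()
        {
            return name;
        }

        abstract Object create();
    }

    // Mirrors the Swedish example: a public no-arg constructor that fixes the name.
    static final class SwedishProvider extends NamedProvider
    {
        public SwedishProvider()
        {
            super( "swedish" );
        }

        @Override
        Object create()
        {
            return new StringBuilder( "stand-in for a Swedish analyzer" );
        }
    }

    // Reflective instantiation only succeeds when a no-arg constructor exists.
    static NamedProvider instantiate( Class<? extends NamedProvider> type )
    {
        try
        {
            return type.getDeclaredConstructor().newInstance();
        }
        catch ( ReflectiveOperationException e )
        {
            throw new IllegalStateException( "No usable no-arg constructor on " + type, e );
        }
    }

    // Assumption: names are matched exactly; the real lookup rules may differ.
    static Optional<NamedProvider> findByName( List<NamedProvider> providers, String name )
    {
        return providers.stream().filter( p -> p.getName().equals( name ) ).findFirst();
    }

    public static void main( String[] args )
    {
        NamedProvider provider = instantiate( SwedishProvider.class );
        System.out.println( findByName( List.of( provider ), "swedish" ).isPresent() );
    }
}
```

Because the loader only knows the class, not which arguments it would need, the name has to be fixed inside the no-arg constructor, as the Swedish example does with super( "swedish" ).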
 

  • Method Summary

    Modifier and Type
    Method
    Description
    static org.apache.lucene.analysis.CharArraySet
    cleanStopWordSet(org.apache.lucene.analysis.CharArraySet stopSet)
    Produce a new stop-word set similar to the given set, but where unclean elements have been removed.
    abstract org.apache.lucene.analysis.Analyzer
    createAnalyzer()
    Returns a newly constructed Analyzer instance.
  • Method Details

    • getName

      public String getName()
    • createAnalyzer

      public abstract org.apache.lucene.analysis.Analyzer createAnalyzer()
      Returns:
      A newly constructed Analyzer instance.
    • description

      public String description()
      Returns:
      A description of this analyzer.
    • stopwords

      public List<String> stopwords()
    • cleanStopWordSet

      public static org.apache.lucene.analysis.CharArraySet cleanStopWordSet(org.apache.lucene.analysis.CharArraySet stopSet)
      Produce a new stop-word set similar to the given set, but where unclean elements have been removed. Stop-word list files often contain comments, blank lines, excess whitespace, and similar artifacts. When these files are parsed, such artifacts can end up in the stop-word set even though they are not stop words. The passed-in stop-word set is not changed.
      Parameters:
      stopSet - The stop-word set to clean up.
      Returns:
      The cleaned-up stop-word set.
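
The kind of clean-up described above can be sketched without Lucene on the classpath. This is an illustration of the idea only, not the actual implementation of cleanStopWordSet: it uses a plain Set of String instead of CharArraySet, the class and method names are hypothetical, and treating '#' as the comment marker is an assumption, since the exact rules the real method applies are not spelled out here.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class StopWordCleaningSketch
{
    // Builds and returns a new set; the input is only read, never modified.
    static Set<String> clean( Iterable<String> rawEntries )
    {
        Set<String> cleaned = new LinkedHashSet<>();
        for ( String entry : rawEntries )
        {
            String word = entry;
            // Assumption: '#' starts a comment that runs to the end of the entry.
            int comment = word.indexOf( '#' );
            if ( comment >= 0 )
            {
                word = word.substring( 0, comment );
            }
            // Drop surrounding whitespace, then skip entries that end up empty.
            word = word.trim();
            if ( !word.isEmpty() )
            {
                cleaned.add( word );
            }
        }
        return cleaned;
    }

    public static void main( String[] args )
    {
        List<String> raw = List.of( "och", "  det ", "", "# a comment line", "att # trailing comment" );
        System.out.println( clean( raw ) );
    }
}
```

With the real API, a provider would presumably pass the CharArraySet it loaded from a stop-word file through cleanStopWordSet before handing it to the analyzer it constructs.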