Distributed Lucene Facets
Note
This feature is available for NCache Enterprise edition only.
Distributed Lucene supports all of the facet types supported by Lucene. These facets are categorized into the following three types:
- Taxonomy Based Facets: These facets are defined using a hierarchy of categories, known as taxonomy.
- Range Based Facets: These facets are based on ranges into which a Numeric or DateTime field falls.
- Sorted Set Facets: These facets do not require a separate taxonomy index and compute counts based on sorted set doc values fields instead.
These facet types are explained below with respective code examples.
Prerequisites
- Make sure that you have created and started a Lucene cache through the NCache Web Manager or PowerShell.
- Install the Lucene.Net.Facet.NCache NuGet package in your application by executing the following command in the Package Manager Console:
Install-Package Lucene.Net.Facet.NCache
- Make sure that the cache is already running.
- Make sure that your application is not using any native Lucene DLL/Reference.
- To ensure the operation is fail-safe, it is recommended to handle any potential exceptions within your application, as explained in Handling Failures.
- To handle any unforeseen exceptions, refer to the Troubleshooting section.
Taxonomy Based Facets
The DirectoryTaxonomyWriter class allows you to create and open a taxonomy writer inside a taxonomy directory, and the DirectoryTaxonomyReader class allows you to open a taxonomy reader on a taxonomy writer.
Note
Distributed Lucene supports all the taxonomy-based facets supported by Lucene.
In the following example, facets are configured with the FacetsConfig class and added to a document in the form of facet fields. Facet counts are then computed with the FastTaxonomyFacetCounts class, and the results are retrieved and verified with the help of the FacetResult class.
Directory directory = null;
Directory taxoDir = null;
IndexWriter writer = null;
IndexReader reader = null;
DirectoryTaxonomyWriter taxoWriter = null;
DirectoryTaxonomyReader taxoReader = null;
FacetsCollector facetCollector = null;
Facets facets = null;
try
{
    // Prerequisite: Lucene cache is already connected
    string index = "luceneIndex";
    string taxoIndex = "taxonomyIndex";

    // Open NCache directory
    directory = NCacheDirectory.Open(cache, index);

    // Open taxonomy directory
    taxoDir = NCacheDirectory.Open(cache, taxoIndex);

    // Initialize taxonomy writer
    taxoWriter = new DirectoryTaxonomyWriter(taxoDir, OpenMode.CREATE);

    // Configure index writer and initialize it
    var indexWriterConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48));
    writer = new IndexWriter(directory, indexWriterConfig);

    // Configure facets
    FacetsConfig config = new FacetsConfig(cache);
    config.SetHierarchical("Publish Date", true);

    // Initialize document and add facet fields to it
    Document document = new Document();
    document.Add(new FacetField("Author", "Bob"));
    document.Add(new FacetField("Publish Date", "2010", "10", "15"));

    // Add document with facets config and taxonomy writer
    writer.AddDocument(document, config, taxoWriter);

    // Initialize index reader (kept in a variable so it can be disposed later)
    reader = writer.GetReader(true);

    // Initialize index searcher
    IndexSearcher indexSearcher = new IndexSearcher(reader);

    // Initialize taxonomy reader
    taxoReader = new DirectoryTaxonomyReader(taxoWriter);

    // Initialize facets collector
    facetCollector = new FacetsCollector(false, cache);

    // Search docs that match query and collect facets with facets collector
    indexSearcher.Search(new MatchAllDocsQuery(), facetCollector);

    // Initialize facets
    facets = new FastTaxonomyFacetCounts(taxoReader, config, facetCollector);

    // Retrieve & verify results
    FacetResult facetResult = facets.GetTopChildren(10, "Publish Date");
    float value = facets.GetSpecificValue("Publish Date");
    IList<FacetResult> facetResults = facets.GetAllDims(10);
}
catch (Exception ex)
{
    // Handle exceptions
}
finally
{
    // Dispose all instances
    facets?.Dispose();
    facetCollector?.Dispose();
    reader?.Dispose();
    taxoReader?.Dispose();
    writer?.Dispose();
    taxoWriter?.Dispose();
    taxoDir?.Dispose();
    directory?.Dispose();
}
Warning
Not disposing the Facets, FacetsCollector, NCacheDirectory, IndexReader, IndexWriter, and DirectoryTaxonomyReader instances will lead to extra memory consumption on the server side.
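To make the effect of a hierarchical dimension such as "Publish Date" more concrete, the following standalone sketch shows how a path like 2010/10/15 can be thought of as contributing one count to each level of its hierarchy. It is plain C# with no Lucene or NCache calls; the HierarchySketch class and the CountPrefixes helper are hypothetical names used only for illustration, and the sketch does not describe how the taxonomy index is actually implemented.

```csharp
using System;
using System.Collections.Generic;

public static class HierarchySketch
{
    // Hypothetical helper: counts each path once at every level of its hierarchy.
    public static Dictionary<string, int> CountPrefixes(IEnumerable<string[]> paths)
    {
        var counts = new Dictionary<string, int>();
        foreach (string[] path in paths)
        {
            for (int depth = 1; depth <= path.Length; depth++)
            {
                // Build the prefix for this level, e.g. "2010", "2010/10", "2010/10/15"
                string prefix = string.Join("/", path, 0, depth);
                counts.TryGetValue(prefix, out int current);
                counts[prefix] = current + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        // Three illustrative "Publish Date" paths (year, month, day)
        var paths = new List<string[]>
        {
            new[] { "2010", "10", "15" },
            new[] { "2010", "10", "20" },
            new[] { "2012", "1", "1" },
        };

        foreach (KeyValuePair<string, int> entry in CountPrefixes(paths))
        {
            Console.WriteLine($"{entry.Key}: {entry.Value}");
        }
    }
}
```

With these three paths, the year 2010 and the month 2010/10 each receive a count of 2, while every individual day receives a count of 1. This is the kind of drill-down that marking a dimension as hierarchical is meant to enable.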
Range Based Facets
In the following example, documents containing a numeric doc values field are added to the index via the IndexWriter. All documents are then searched, and the Int64RangeFacetCounts class counts how many of them fall into each of the defined ranges. The results are verified afterwards.
Note
Distributed Lucene supports all the range-based facets supported by Lucene.
Directory directory = null;
IndexWriter writer = null;
IndexReader reader = null;
FacetsCollector facetCollector = null;
IndexSearcher indexSearcher = null;
Facets facets = null;
try
{
    // Prerequisite: Lucene cache is already connected
    string index = "luceneIndex";

    // Open NCache directory
    directory = NCacheDirectory.Open(cache, index);

    // Configure index writer and initialize it
    var indexWriterConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48));
    writer = new IndexWriter(directory, indexWriterConfig);

    // Create a document
    Document document = new Document();

    // Create a numeric field and add it to the document
    NumericDocValuesField field = new NumericDocValuesField("field", 0L);
    document.Add(field);

    // Index 100 documents with field values 0 through 99
    for (long l = 0; l < 100; l++)
    {
        field.SetInt64Value(l);
        writer.AddDocument(document);
    }

    // Also add long.MaxValue
    field.SetInt64Value(long.MaxValue);
    writer.AddDocument(document);

    // Initialize index reader
    reader = writer.GetReader(true);

    // Initialize facets collector
    facetCollector = new FacetsCollector(false, cache);

    // Initialize index searcher
    indexSearcher = new IndexSearcher(reader);

    // Search docs that match query and collect facets with facets collector
    indexSearcher.Search(new MatchAllDocsQuery(), facetCollector);

    // Initialize facets with the ranges to count
    // (arguments: label, min, minInclusive, max, maxInclusive)
    facets = new Int64RangeFacetCounts("field", facetCollector,
        new Int64Range("less than 10", 0L, true, 10L, false),
        new Int64Range("less than or equal to 10", 0L, true, 10L, true),
        new Int64Range("over 90", 90L, false, 100L, false),
        new Int64Range("90 or above", 90L, true, 100L, false),
        new Int64Range("over 1000", 1000L, false, long.MaxValue, true));

    // Retrieve & verify results
    FacetResult facetResult = facets.GetTopChildren(10, "field");
}
catch (Exception ex)
{
    // Handle exceptions
}
finally
{
    // Dispose all instances
    facets?.Dispose();
    indexSearcher?.Dispose();
    facetCollector?.Dispose();
    reader?.Dispose();
    writer?.Dispose();
    directory?.Dispose();
}
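The inclusive and exclusive bounds passed to each Int64Range determine which documents are counted. The following standalone sketch reproduces that arithmetic for the same values (0 through 99 plus long.MaxValue) and the same five ranges, without calling Lucene or NCache. The RangeSketch and LongRange types are hypothetical stand-ins introduced only for illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSketch
{
    // Hypothetical stand-in for a labelled range with inclusive/exclusive bounds.
    public sealed class LongRange
    {
        public string Label { get; }
        private readonly long min, max;
        private readonly bool minInclusive, maxInclusive;

        public LongRange(string label, long min, bool minInclusive, long max, bool maxInclusive)
        {
            Label = label;
            this.min = min;
            this.minInclusive = minInclusive;
            this.max = max;
            this.maxInclusive = maxInclusive;
        }

        // True when the value lies inside the range, honoring both bound flags.
        public bool Accepts(long value) =>
            (minInclusive ? value >= min : value > min) &&
            (maxInclusive ? value <= max : value < max);
    }

    // Counts how many values fall into each range; a value may match several ranges.
    public static Dictionary<string, int> Count(IReadOnlyList<long> values, IReadOnlyList<LongRange> ranges)
    {
        var counts = new Dictionary<string, int>();
        foreach (LongRange range in ranges)
        {
            int matches = 0;
            foreach (long value in values)
            {
                if (range.Accepts(value)) matches++;
            }
            counts[range.Label] = matches;
        }
        return counts;
    }

    // Same values as the indexed documents: 0..99 plus long.MaxValue.
    public static List<long> SampleValues()
    {
        var values = new List<long>();
        for (long l = 0; l < 100; l++) values.Add(l);
        values.Add(long.MaxValue);
        return values;
    }

    // Same five ranges as in the Int64RangeFacetCounts example.
    public static List<LongRange> SampleRanges() => new List<LongRange>
    {
        new LongRange("less than 10", 0L, true, 10L, false),
        new LongRange("less than or equal to 10", 0L, true, 10L, true),
        new LongRange("over 90", 90L, false, 100L, false),
        new LongRange("90 or above", 90L, true, 100L, false),
        new LongRange("over 1000", 1000L, false, long.MaxValue, true),
    };

    public static void Main()
    {
        foreach (KeyValuePair<string, int> entry in Count(SampleValues(), SampleRanges()))
        {
            Console.WriteLine($"{entry.Key}: {entry.Value}");
        }
    }
}
```

For this data the sketch yields 10, 11, 9, 10, and 1 for the five ranges respectively. Ranges may overlap, so a single document can be counted in more than one of them.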
Sorted Set Facets
In the following example, sorted set facet fields are added to a document using the SortedSetDocValuesFacetField class. The index is then searched, facet counts are computed with the SortedSetDocValuesFacetCounts class, and the results are verified.
Directory directory = null;
IndexWriter writer = null;
IndexReader reader = null;
IndexSearcher indexSearcher = null;
FacetsCollector facetCollector = null;
SortedSetDocValuesFacetCounts facets = null;
try
{
    // Prerequisite: Lucene cache is already connected
    string index = "normalIndex";

    // Open NCache directory
    directory = NCacheDirectory.Open(cache, index);

    // Initialize facets config
    FacetsConfig config = new FacetsConfig(cache);
    config.SetMultiValued("a", true);

    // Initialize index writer
    var indexWriterConfig = new IndexWriterConfig(LuceneVersion.LUCENE_48, new StandardAnalyzer(LuceneVersion.LUCENE_48));
    writer = new IndexWriter(directory, indexWriterConfig);

    // Create a document and add sorted set facet fields
    Document document = new Document();
    document.Add(new SortedSetDocValuesFacetField("a", "foo"));
    document.Add(new SortedSetDocValuesFacetField("b", "foo"));

    // Add the document with facets config
    writer.AddDocument(document, config);

    // Commit writer
    writer.Commit();

    // Initialize index reader (kept in a variable so it can be disposed later)
    reader = writer.GetReader(true);

    // Initialize index searcher
    indexSearcher = new IndexSearcher(reader);

    // Per-top-reader state
    SortedSetDocValuesReaderState state = new DefaultSortedSetDocValuesReaderState(indexSearcher.IndexReader);

    // Initialize facets collector
    facetCollector = new FacetsCollector(false, cache);

    // Search docs that match query and collect facets with facets collector
    indexSearcher.Search(new MatchAllDocsQuery(), facetCollector);

    // Initialize facets
    facets = new SortedSetDocValuesFacetCounts(state, facetCollector);

    // Retrieve & verify results
    FacetResult facetResult = facets.GetTopChildren(10, "a");
    IList<FacetResult> facetResults = facets.GetAllDims(10);
}
catch (Exception ex)
{
    // Handle exceptions
}
finally
{
    // Dispose all instances
    facets?.Dispose();
    facetCollector?.Dispose();
    indexSearcher?.Dispose();
    reader?.Dispose();
    writer?.Dispose();
    directory?.Dispose();
}
Additional Resources
NCache also provides a sample application for Facets in Distributed Lucene on GitHub.
See Also
Distributed Lucene Overview
Distributed Lucene Geo-Spatial API
Searching in Distributed Lucene
Distributed Lucene Indexing