Wednesday, August 20, 2008

SolrNet with faceting support

I finally added support for facets in SolrNet. There are basically two kinds of facet queries:

  1. querying by field
  2. arbitrary facet queries

Querying by field

ISolrOperations<TestDocument> solr = ...
ISolrQueryResults<TestDocument> r = solr.Query("brand:samsung", new QueryOptions {
    FacetQueries = new ISolrFacetQuery[] {
        new SolrFacetFieldQuery("category")
    }
});

Yeah, kind of verbose, right? The DSL makes it shorter:

ISolrQueryResults<TestDocument> r = Solr.Query<TestDocument>().By("brand").Is("samsung").WithFacetField("category").Run();

To get the facet results, ISolrQueryResults<T> has a property FacetFields, which is an IDictionary<string, ICollection<KeyValuePair<string, int>>>. The key of this dictionary is the facet field you queried. The value is a collection of pairs where the key is the value found and the value is the count of occurrences. Sounds complex? It's not. Let's see an example:

Let's assume that eBay used SolrNet to do its queries (please bear with me :-) ). Let's say a user enters the category Maps, Atlases & Globes, so you want the items within that category, as well as the item count for each subcategory ("Europe", "India", etc.) that shows up under "Narrow your results". You could express such a query like this:

var results = Solr.Query<EbayItem>().By("category").Is("maps-atlases-globes")
    .WithFacetField("subcategory")
    .Run();

Now to print the subcategory count:

foreach (var facet in results.FacetFields["subcategory"]) {
    Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
}

Which would print something like this:

United States (Pre-1900): 2123
Europe: 916
World Maps: 650
...

and so on. See? Told you it wasn't hard :-)
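If the nested generic type still looks intimidating, here is a minimal sketch of the same shape built by hand from plain .NET collections, with no Solr server involved. The class and method names (FacetShapeSketch, SampleFacetFields) are made up for illustration and are not part of SolrNet; the counts are the ones from the sample output above.

```csharp
using System;
using System.Collections.Generic;

public static class FacetShapeSketch
{
    // Hand-built stand-in for results.FacetFields (hypothetical, not SolrNet code).
    // Outer key: the facet field. Inner pairs: value found -> number of documents.
    public static IDictionary<string, ICollection<KeyValuePair<string, int>>> SampleFacetFields()
    {
        return new Dictionary<string, ICollection<KeyValuePair<string, int>>>
        {
            {
                "subcategory",
                new List<KeyValuePair<string, int>>
                {
                    new KeyValuePair<string, int>("United States (Pre-1900)", 2123),
                    new KeyValuePair<string, int>("Europe", 916),
                    new KeyValuePair<string, int>("World Maps", 650),
                }
            }
        };
    }

    public static void Main()
    {
        var facetFields = SampleFacetFields();

        // Same loop as in the post, run against the stand-in data.
        foreach (var facet in facetFields["subcategory"])
        {
            Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
        }
    }
}
```

So it is really just "one list of (value, count) pairs per facet field", and the dictionary lookup picks the list you asked for.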

Note that by default, Solr orders facet field results by count (descending), which makes sense since most of the time you want the most populated/important terms first. If you want to override that:

ISolrQueryResults<EbayItem> results = Solr.Query<EbayItem>().By("category").Is("maps-atlases-globes")
    .WithFacetField("subcategory").DontSortByCount()
    .Run();
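The option above changes the ordering on the Solr side. If you already have the count-ordered pairs in hand and only want a different order for display, you could also re-sort them on the client. This is just a sketch of that alternative using LINQ; FacetSortSketch and ByName are hypothetical names, not SolrNet API, and the data is the sample from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FacetSortSketch
{
    // Hypothetical helper: returns the facet pairs ordered alphabetically
    // by the facet value instead of by count.
    public static List<KeyValuePair<string, int>> ByName(
        IEnumerable<KeyValuePair<string, int>> facets)
    {
        return facets.OrderBy(f => f.Key, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        // Stand-in for results.FacetFields["subcategory"], ordered by count (descending).
        var subcategory = new List<KeyValuePair<string, int>>
        {
            new KeyValuePair<string, int>("United States (Pre-1900)", 2123),
            new KeyValuePair<string, int>("Europe", 916),
            new KeyValuePair<string, int>("World Maps", 650),
        };

        foreach (var facet in ByName(subcategory))
        {
            Console.WriteLine("{0}: {1}", facet.Key, facet.Value);
        }
    }
}
```

Keep in mind that client-side sorting only reorders the terms Solr actually returned, so if the facet list was truncated on the server you are sorting a subset.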

There are other options for facet field queries; I copied the docs for them from the official Solr documentation.

Arbitrary facet queries

Support for arbitrary queries is not very nice at the moment, but it works:

var priceLessThan500 = "price:[* TO 500]";
var priceMoreThan500 = "price:[500 TO *]";
var results = Solr.Query<TestDocument>().By("category").Is("maps-atlases-globes")
    .WithFacetQuery(priceLessThan500)
    .WithFacetQuery(priceMoreThan500)
    .Run();

Then results.FacetQueries[priceLessThan500] and results.FacetQueries[priceMoreThan500] get you the respective result counts.
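One thing to keep in mind with range strings like these: square brackets are inclusive in Lucene range syntax, so an item priced at exactly 500 is counted in both buckets. Here is a small sketch of building the query strings with a helper and reading counts back by the same string. The Range helper and the FacetQuerySketch class are hypothetical, the dictionary is a hand-built stand-in for results.FacetQueries (assuming it maps each query string to its count, as the indexing above suggests), and the counts are made up.

```csharp
using System;
using System.Collections.Generic;

public static class FacetQuerySketch
{
    // Hypothetical helper: builds an inclusive Lucene-style range query.
    // A null bound becomes "*" (open-ended).
    public static string Range(string field, int? from, int? to)
    {
        string lower = from.HasValue ? from.Value.ToString() : "*";
        string upper = to.HasValue ? to.Value.ToString() : "*";
        return string.Format("{0}:[{1} TO {2}]", field, lower, upper);
    }

    public static void Main()
    {
        string upTo500 = Range("price", null, 500); // price:[* TO 500]
        string from500 = Range("price", 500, null); // price:[500 TO *], also includes 500

        // Stand-in for results.FacetQueries; the counts are invented for the example.
        var facetQueries = new Dictionary<string, int>
        {
            { upTo500, 120 },
            { from500, 45 },
        };

        // The query string itself is the lookup key.
        Console.WriteLine("{0} -> {1}", upTo500, facetQueries[upTo500]);
        Console.WriteLine("{0} -> {1}", from500, facetQueries[from500]);
    }
}
```

If the overlap at 500 matters and prices are whole numbers, starting the second range at 501 would be one way to make the buckets disjoint.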

The code is hosted on Google Code.