Introduction to Sitecore Solr eDisMax

Introduction

Search is hard. Gathering requirements, writing user stories, implementation, and testing are all challenging. It’s difficult to get meaningful, relevant results for keyword-based searches.

  • How do you handle multi-phrase searches?
  • How do you handle stemming? Synonyms? Plural words?
  • How do you implement a spell check? How about spell check for individual words in a multi-phrase search?
  • How do you build a type-ahead/auto complete?
  • How do you highlight the search term? What if the word is pluralized?
  • How do you boost or tune your results?
  • How do you debug a Solr query? How do you explain why result #1 appeared before result #2 to a client? Do you guess?

Solr offers a very powerful and robust tool set out-of-the-box that provides solutions to these questions via the DisMax / eDisMax query parser.

As it turns out, these features have been available for years. Unfortunately, Sitecore hasn’t offered native support for them, which is why most Sitecore developers do not know about them.

The good news is that official support for eDisMax (and for spellcheck and highlighting) is now available in Sitecore 9.

Blog Series Overview

What is Solr DisMax?

  • “The Solr DisMax query parser interface is more like that of Google than the interface of the ‘standard’ Solr request handler.”
  • This similarity to Google makes DisMax the appropriate query parser for many consumer applications.
  • It accepts a simple syntax, and it rarely produces error messages.
  • The DisMax query parser supports an extremely simplified subset of the Lucene QueryParser syntax.
  • Full Documentation: https://lucene.apache.org/solr/guide/6_6/the-dismax-query-parser.html

What is Solr eDisMax?

eDisMax Quick Start

  1. In the Solr admin panel, select the “sitename_web_index” index and click the “query” tab.
  2. Check the “edismax” checkbox.
  3. Enter a search phrase in the q field. Do not enter field names, operators, etc.; just enter the actual search keyword or phrase. (Notice how simple and clean this already is compared to a Lucene query?)
  4. Enter the field names to search, separated by spaces, in the qf (query fields) and pf (phrase fields) fields.
    1. e.g. “title_t description_t”
  5. Optionally enter “100%” in the mm (minimum should match) field.
  6. Submit
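
For reference, the steps above boil down to a handful of request parameters. Here is a minimal C# sketch that assembles the same query string the admin panel would send. The host, port, and the /select path on localhost are assumptions for illustration, and QuickStartRequest is a hypothetical helper, not part of Sitecore or SolrNet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuickStartRequest
{
    // Builds the URL-encoded query string for an eDisMax keyword search.
    public static string BuildQueryString(string keywords)
    {
        var parameters = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("defType", "edismax"),           // step 2: the "edismax" checkbox
            new KeyValuePair<string, string>("q", keywords),                  // step 3: plain keywords only
            new KeyValuePair<string, string>("qf", "title_t description_t"),  // step 4: query fields
            new KeyValuePair<string, string>("pf", "title_t description_t"),  // step 4: phrase fields
            new KeyValuePair<string, string>("mm", "100%"),                   // step 5: minimum should match
        };

        return string.Join("&", parameters.Select(p =>
            Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value)));
    }

    public static void Main()
    {
        // Hypothetical host, port, and index name; substitute your own.
        const string baseUrl = "http://localhost:8983/solr/sitename_web_index/select";
        Console.WriteLine(baseUrl + "?" + BuildQueryString("intellectual property"));
    }
}
```

Pasting the printed URL into a browser is a quick way to compare raw Solr output against what the admin panel shows.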

When to use DisMax vs eDisMax?

Easy answer: Just use eDisMax. eDisMax does everything DisMax does but better. So for the rest of this blog series, I will only talk about eDisMax.

Benefits of Using eDisMax

  • Handles multi-word search queries.
    • User enters “Intellectual Property Litigation”. eDisMax breaks this down into queries for the following, with priority on a higher % of matching words:
      • “Intellectual Property Litigation”
      • “Intellectual Property”
      • “Property Litigation”
      • “Intellectual”
      • “Property”
      • “Litigation”
  • Simpler, more natural, more Google-like query syntax.
  • Easier to debug.
  • Prevents unintended relevancy on query terms.
  • Provides results more in line with user search expectations.
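
To make the multi-word breakdown above concrete, here is a small C# sketch that simply enumerates every contiguous run of words in a query, longest first. It is only an illustration of the list above, not how Solr works internally; in Solr itself the behavior is driven by eDisMax parameters such as pf, pf2, pf3, and mm. SubPhrases is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class SubPhrases
{
    // Lists every contiguous run of words in the query, longest runs first.
    public static List<string> Enumerate(string query)
    {
        string[] words = query.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var phrases = new List<string>();

        for (int length = words.Length; length >= 1; length--)
        {
            for (int start = 0; start + length <= words.Length; start++)
            {
                // Join the slice words[start .. start + length - 1] with spaces.
                phrases.Add(string.Join(" ", words, start, length));
            }
        }

        return phrases;
    }

    public static void Main()
    {
        foreach (string phrase in Enumerate("Intellectual Property Litigation"))
        {
            Console.WriteLine(phrase);
        }
    }
}
```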

Should I use eDisMax instead of Lucene?

Lucene is Solr’s default query parser, and it is also the query parser you are using when you query through Sitecore’s Content Search API (IQueryable).

  • YES – If you are NOT using eDisMax for a keyword-based search, you are doing it wrong.
  • For non-keyword search, eDisMax is still a viable option, but it is a matter of preference.
    • The Sitecore LINQ IQueryable API might be faster to build your feature with than the equivalent eDisMax query (but it will be harder to debug).
  • In my experience, I found it much easier to debug eDisMax queries than Lucene queries.

Sitecore SolrNet Support

  • The Sitecore Content Search API did not support DisMax or eDisMax in Sitecore < 9.
  • Starting with 9.0, Sitecore lets you call SolrNet APIs directly, which enables you to use eDisMax out of the box.
  • Sitecore < 9: I developed a .NET Web API app that enables eDisMax support in any Sitecore site, for cases where upgrading to 9 is not an option.
    • Does not use Sitecore binaries – no licensing concerns.
    • Coming soon to my GitHub page!

OK, so how do I use eDisMax parser in Sitecore 9?

  • Use Query<T>() instead of GetQueryable<T>().
  • Pass eDisMax parameters into the ExtraParams property. (This is a dictionary of key/value pairs.)
  • Query<T>() also provides easy access to spellchecking, highlighting, faceting, and query debugging info in the same Solr request as the search (some of these require configuration; more on this in future posts).
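
using (IProviderSearchContext ctx = BuildSearchIndex(query).CreateSearchContext())
{
    // q is the raw keyword phrase; the eDisMax settings travel in ExtraParams.
    SolrQueryResults<T> results = ctx.Query<T>(q, new SolrNet.Commands.Parameters.QueryOptions
    {
        Rows = query.Rows,
        // Start takes a zero-based row offset; if pageNumber is a page index,
        // multiply it by query.Rows first.
        StartOrCursor = new StartOrCursor.Start(pageNumber),
        ExtraParams = extraParams,
        FilterQueries = filters
    });
    return results;
}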
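
The snippet above uses an extraParams variable without showing how it is built. Here is a minimal sketch of one way to assemble it as a plain dictionary of Solr parameter names and values. EdismaxParams and its Build method are hypothetical names for illustration, and the field names and boost are assumptions; only the parameter keys (defType, qf, pf, mm, debugQuery) are standard Solr parameters.

```csharp
using System;
using System.Collections.Generic;

public static class EdismaxParams
{
    // Builds the key/value pairs to hand to QueryOptions.ExtraParams.
    public static Dictionary<string, string> Build(
        string queryFields, string phraseFields, string minimumMatch, bool debug)
    {
        var extraParams = new Dictionary<string, string>
        {
            { "defType", "edismax" },  // switch the request to the eDisMax parser
            { "qf", queryFields },     // fields searched for individual terms
            { "pf", phraseFields },    // fields searched for the whole phrase
        };

        // mm is optional; only send it when a value was supplied.
        if (!string.IsNullOrEmpty(minimumMatch))
        {
            extraParams["mm"] = minimumMatch;
        }

        // debugQuery asks Solr to include scoring details in the response.
        if (debug)
        {
            extraParams["debugQuery"] = "true";
        }

        return extraParams;
    }

    public static void Main()
    {
        // Assumed field names; "^2" boosts matches in title_t.
        var extraParams = Build("title_t^2 description_t", "title_t description_t", "100%", true);
        foreach (var pair in extraParams)
        {
            Console.WriteLine(pair.Key + "=" + pair.Value);
        }
    }
}
```

Keeping the parameters in one helper like this makes it easy to turn debugQuery on only in development.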

Demo

I created a bare-bones demo site that demonstrates most of the features I will be covering in the blog series.
https://github.com/WeekendWarrior/Sitecore-Solr-Edismax-Search-Demo

Requirements

  1. Clean local installation of Sitecore 9.1
  2. TDS

I will be expanding and updating the instructions and demo in the coming weeks. If you have any questions for now please contact me, and stay tuned for future posts in my blog series.
