Common Rules Rewriter

The Common Rules Rewriter uses configurable rules to manipulate the matching and ranking of search results depending on the input query. In e-commerce search it is a powerful tool for merchandisers to fine-tune search results, especially for high-traffic queries.

The rule definition format is the same for Solr and Elasticsearch/OpenSearch with two exceptions:

  • Filters and boostings can optionally be expressed in the syntax of the search engine instead of in the generic Querqy syntax.

  • A few features are not (yet) available in Elasticsearch/OpenSearch. This is noted where the feature is described.

Configuring rules

The rules for the ‘Common Rules Rewriter’ are passed as the value of the rules element when you create a configuration with the SimpleCommonRulesRewriterFactory in Elasticsearch/OpenSearch.

PUT  /_querqy/rewriter/common_rules

{
    "class": "querqy.elasticsearch.rewriter.SimpleCommonRulesRewriterFactory",
    "config": {
        "rules" : "notebook =>\nSYNONYM: laptop"
    }
}


OpenSearch users: Simply replace package name elasticsearch with opensearch in rewriter configurations.
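Because the rules are passed as a single JSON string, line breaks inside the rule text have to be escaped. The following Python sketch builds the request body shown above and lets a JSON encoder do the escaping. The helper name build_rewriter_config and its opensearch flag are made up for this illustration; only the class names and the rules element come from the text above.

```python
import json


def build_rewriter_config(rules: str, opensearch: bool = False) -> dict:
    """Build the body for PUT /_querqy/rewriter/<name> (hypothetical helper)."""
    # OpenSearch uses the same class name with a different package segment.
    package = "opensearch" if opensearch else "elasticsearch"
    return {
        "class": f"querqy.{package}.rewriter.SimpleCommonRulesRewriterFactory",
        "config": {"rules": rules},
    }


# Rules can be written as ordinary multi-line text ...
rules_text = "notebook =>\n  SYNONYM: laptop\n"

# ... and json.dumps escapes the line breaks as \n inside the JSON string.
print(json.dumps(build_rewriter_config(rules_text), indent=4))
```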

Structure of a rule

We will introduce some terminology and explain the rule structure using the following example:

 1  # if the input contains 'personal computer', add two synonyms, 'pc' and
 2  # 'desktop computer', and rank down by factor 50 documents that
 3  # match 'software':
 4
 5  personal computer =>
 6    SYNONYM: pc
 7    SYNONYM: desktop computer
 8    DOWN(50): software
 9    @_id: "ID1"
10    @enabled: true
11    @{
12      priority: 100,
13      tenant: ["t1",   "t2", "t3"],
14    }@

Each rule must have an input definition (line #5). A rule will only be applied if the query matches the input expression. An input definition starts at a new line and it ends at => followed by a line break.

An input definition is followed by one or more instructions (#6 to #8), which define how the query should be rewritten. Note that the amount of whitespace in front of an instruction has no significance.

An instruction must have a predicate (SYNONYM, UP/DOWN, FILTER, DELETE, DECORATE). The predicate can be followed by a colon and some arguments after the colon - the right-hand side.

The right-hand side expression contains tokens or subqueries that are to be added to or removed from the query. For example, line #7 defines that the synonym ‘desktop computer’ should be added to the query. Some predicates accept additional arguments in parentheses, like the boost factor 50 in DOWN(50) (#8).

Lines #9 to #14 define optional rule properties. They can be used for sorting and filtering rules. For example: if multiple rules match the query, only apply the rule with the highest priority value. Properties must be defined at the end of a rule. They start with an @ sign, followed either by a single-line property (lines #9 and #10) or by a multi-line JSON document (#11 to #14) that must be terminated with another @.
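To make the three parts of a rule concrete, here is a minimal Python sketch that splits a single rule into its input, its instructions and its raw property lines. It is not Querqy's parser; the function name split_rule and the regular expression are assumptions made for this illustration, and properties are kept as unparsed text.

```python
import re

# <PREDICATE>[(<argument>)][: <right-hand side>]
INSTRUCTION = re.compile(
    r"^(?P<predicate>[A-Z]+)(?:\((?P<param>[^)]*)\))?(?::\s*(?P<rhs>.*))?$"
)


def split_rule(text: str) -> dict:
    """Split one rule into input, instructions and raw property lines."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines or not lines[0].endswith("=>"):
        raise ValueError("a rule must start with '<input> =>'")
    rule = {"input": lines[0][:-2].strip(), "instructions": [], "properties": []}
    for ln in lines[1:]:
        # Properties come last, so everything after the first '@' belongs to them.
        if ln.startswith("@") or rule["properties"]:
            rule["properties"].append(ln)
            continue
        match = INSTRUCTION.match(ln)
        if match is None:
            raise ValueError(f"not an instruction: {ln!r}")
        rule["instructions"].append(
            (match["predicate"], match["param"], match["rhs"])
        )
    return rule


example = """
# rank down software for 'personal computer'
personal computer =>
  SYNONYM: pc
  SYNONYM: desktop computer
  DOWN(50): software
  @_id: "ID1"
  @enabled: true
"""
rule = split_rule(example)
print(rule["input"], rule["instructions"])
```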

Input matching

personal computer =>
  SYNONYM: pc
  SYNONYM: desktop computer
  DOWN(50): software

Querqy applies the above rule if it can find the matching criteria ‘personal computer’ anywhere in the query, provided that there is no other term between ‘personal’ and ‘computer’. It would thus also match the input ‘cheap personal computer’. If you want to match the input exactly, or at the beginning or end of the input, you have to mark the input boundaries using double quotation marks:

# only match the exact query 'personal computer'.
"personal computer" =>
    ....

# only match queries starting with 'personal computer'
"personal computer =>
    ....

# only match queries ending with 'personal computer'
personal computer" =>
    ....

Each input token is matched exactly, except that input matching is case-insensitive by default. (You can make it case-sensitive by setting ignoreCase to false in the configuration.)
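The matching behaviour described so far can be summarised in a few lines of Python. This is only a sketch of the semantics (contiguous terms, optional boundary markers, case-insensitive comparison), not Querqy's implementation; the function names are invented for the example and tokens are assumed to be split on whitespace.

```python
def parse_input(pattern: str):
    """Return (terms, anchored_at_start, anchored_at_end) for a rule input."""
    at_start = pattern.startswith('"')
    at_end = pattern.endswith('"')
    return pattern.strip('"').lower().split(), at_start, at_end


def input_matches(pattern: str, query: str) -> bool:
    """True if the input terms occur in the query without terms in between."""
    terms, at_start, at_end = parse_input(pattern)
    tokens = query.lower().split()
    n = len(terms)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != terms:
            continue
        if at_start and i != 0:
            continue  # a leading quote requires a match at the query start
        if at_end and i + n != len(tokens):
            continue  # a trailing quote requires a match at the query end
        return True
    return False


print(input_matches("personal computer", "cheap personal computer"))
print(input_matches('"personal computer"', "cheap personal computer"))
```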

There is no stemming or fuzzy matching applied to the input. If you want to make ‘pc’ a synonym for both ‘personal computer’ and ‘personal computers’, you will have to declare two rules:

personal computer =>
  SYNONYM: pc

personal computers =>
  SYNONYM: pc

You can use a wildcard at the very end of the input declaration:

sofa* =>
  SYNONYM: sofa $1

The above rule matches if the input contains a token that starts with ‘sofa-’ and adds a synonym ‘sofa + wildcard matching string’ to the query. For example, a user query ‘sofabed’ would yield the synonym ‘sofa bed’.

The wildcard matches 1 (!) or more characters. It is not intended as a replacement for stemming but to provide some support for decompounding in languages like German, where compounding is very productive. For example, compounds of the structure ‘material + product type’ and ‘intended audience + product type’ are very common in German. Wildcards in Querqy can help to decompound them and make it possible to search for the components across multiple fields:

# match queries like 'kinderschuhe' (= kids' shoes) and
# 'kinderjacke' (= kids' jacket) and search for
# 'kinder schuhe'/'kinder jacke' etc. in all search fields
kinder* =>
  SYNONYM: kinder $1

Wildcard matching can be used for all rule types. There are some restrictions in the current wildcard implementation, which might be removed in the future:

  • Synonyms and boostings (UP/DOWN) are the only rule types that can pick up the ‘$1’ placeholder.

  • The wildcard can only occur at the very end of the input matching.

  • It cannot be combined with the right-hand input boundary marker (…”).
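The wildcard behaviour for a single-term input can be sketched as follows. The function apply_wildcard_synonym is hypothetical and only models what is described above: the prefix must be followed by at least one character, and the matched remainder replaces ‘$1’ in the synonym.

```python
def apply_wildcard_synonym(prefix: str, template: str, query: str) -> list:
    """Return the synonyms produced by a rule '<prefix>* => SYNONYM: <template>'."""
    synonyms = []
    for token in query.lower().split():
        rest = token[len(prefix):]
        # The wildcard needs 1 or more characters, so 'sofa' alone does not match.
        if token.startswith(prefix) and rest:
            synonyms.append(template.replace("$1", rest))
    return synonyms


print(apply_wildcard_synonym("sofa", "sofa $1", "sofabed"))
print(apply_wildcard_synonym("kinder", "kinder $1", "kinderschuhe"))
```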

Querqy 5

So far, input matching requires that all specified input terms match the query. Starting with version 5, Querqy adds the option to express more complex boolean semantics for input matching.

You will have to enable boolean input matching in the rewriter configuration by setting allowBooleanInput to true:

{
  "class": "querqy.solr.rewriter.commonrules.CommonRulesRewriterFactory",
  "config": {
      "rules" : "notebook AND NOT sleeve =>\nDOWN(100): accessories",
      "allowBooleanInput": true
  }
}

You can then use the operators AND, OR, NOT and ( ) to express boolean matching. In the following example, both rules require smartphone to occur in the query. The first rule is only triggered if the query does not contain case, while the second rule requires case or cover to occur in the query.

smartphone AND NOT case =>
  FILTER: * category:smartphones

smartphone AND (case OR cover) =>
  FILTER: * category:"smartphone cases"

If you use wildcard matching, the wildcard can occur at the end of any boolean sub-group (like in smart* AND (mobile app*)). However, a boolean input expression that contains a wildcard cannot be combined with SYNONYM or DELETE instructions.
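The intended semantics of the two smartphone rules can be restated as plain Python predicates over the set of query tokens. This is an illustration of the boolean logic only; the function names are invented and tokenisation is assumed to be a simple whitespace split.

```python
def tokens_of(query: str) -> set:
    """Lower-cased whitespace tokens of the query."""
    return set(query.lower().split())


def matches_smartphone_without_case(query: str) -> bool:
    """smartphone AND NOT case"""
    tokens = tokens_of(query)
    return "smartphone" in tokens and "case" not in tokens


def matches_smartphone_accessory(query: str) -> bool:
    """smartphone AND (case OR cover)"""
    tokens = tokens_of(query)
    return "smartphone" in tokens and ("case" in tokens or "cover" in tokens)


for q in ("smartphone", "smartphone cover", "leather case"):
    print(q, matches_smartphone_without_case(q), matches_smartphone_accessory(q))
```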


SYNONYM rules

Querqy gives you a powerful toolset for using synonyms at query time.

As opposed to the solutions that exist in Elasticsearch/OpenSearch and Solr, it does not use a Lucene TokenFilter for synonyms but relies purely on query rewriting. This makes it possible to match multi-term input and to add multi-term synonyms reliably. Querqy can cope with applying multiple synonym rules at the same time, even if they have overlapping multi-token inputs. In addition, it avoids scoring issues that arise from different document frequencies of the original input and the synonym terms. Last but not least, Querqy allows synonym rules to be configured in a field-independent manner, which makes the maintenance of synonyms a lot more intuitive than in Elasticsearch/OpenSearch or Solr.

You have already seen rules for synonyms:

personal computer =>
  SYNONYM: pc

sofa* =>
  SYNONYM: sofa $1

Synonyms work in only one direction in Querqy. It always tries to match the input that is specified in the rule and adds a synonym if a given user query matches this input. If you need bi-directional synonyms or synonym groups, you have to declare a rule for each direction. For example, if the query ‘personal computer’ should also search for ‘pc’ while query ‘pc’ should also search for ‘personal computer’, you would write these two rules:

personal computer =>
  SYNONYM: pc

pc =>
  SYNONYM: personal computer
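Since every direction needs its own rule, synonym groups are a natural candidate for generation. The following Python sketch expands a group of equivalent terms into one rule per term; synonym_group_rules is a hypothetical helper, not part of Querqy.

```python
def synonym_group_rules(group: list) -> str:
    """Generate one rule per term, adding all other group members as synonyms."""
    blocks = []
    for term in group:
        lines = [f"{term} =>"]
        lines += [f"  SYNONYM: {other}" for other in group if other != term]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


print(synonym_group_rules(["personal computer", "pc"]))
```

For the two-term group above, the output is the pair of rules shown in the example; a group of three terms would yield three rules with two SYNONYM instructions each.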

Weighted synonyms

Synonyms can be configured to have a term weight. A term weight has to be greater than (or equal to) 0. Defining the term weight is optional. By default synonyms have a term weight of 1.0.

cutlery =>
  SYNONYM(0.5): fork
  SYNONYM(0.5): knife

notebook =>
  SYNONYM: laptop
  SYNONYM(0.5): macbook
  SYNONYM(0.5): chromebook

At query time, the term weight is multiplied by the field boost of the queried field. This helps to formulate queries for synonyms that mimic subtopic synonyms. These are linguistically unequal synonyms: terms that should appear in a search result but with a lower score than linguistically equal synonyms. They increase recall while keeping the exact matches at the top of the search result.

Given the examples above, if you search for cutlery and define qf=title^3 as query field boost, the following dismax query is issued:

boolean_query (mm=1) (
        dismax('title:cutlery^3'),
        dismax('title:fork^1.5'),
        dismax('title:knife^1.5')
)

In e-commerce search this can be used to handle umbrella term searches (like cutlery or gardening tools).
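The boosts in the cutlery query follow directly from multiplying the field boost by the term weight. A small Python sketch of that arithmetic (the function name effective_boosts is made up for the example; the original query term is given the default weight 1.0):

```python
def effective_boosts(field_boost: float, term_weights: dict) -> dict:
    """Multiply the field boost by each term's weight."""
    return {term: field_boost * weight for term, weight in term_weights.items()}


# qf=title^3, rule: cutlery => SYNONYM(0.5): fork / SYNONYM(0.5): knife
boosts = effective_boosts(3.0, {"cutlery": 1.0, "fork": 0.5, "knife": 0.5})
print(boosts)
```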

Expert: Structure of expanded queries

Querqy preserves the ‘minimum should match’ semantics for boolean queries as defined in parameter minimum_should_match (Elasticsearch/OpenSearch) / mm (Solr). In order to provide this semantics, given mm=1, the rule

personal computer =>
  SYNONYM: pc

produces the query

boolean_query (mm=1) (
        dismax('personal','pc'),
        dismax('computer','pc')
)

and NOT

boolean_query(mm=??) (
        boolean_query(mm=1) (
                dismax('personal'),
                dismax('computer')
        ),
        dismax('pc')
)
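One way to read the first structure: the synonym is placed into the clause of every input term, so a document that contains only ‘pc’ satisfies as many clauses as a document that contains both ‘personal’ and ‘computer’. The Python sketch below is a simplified model of that clause counting, restricted to a single-term synonym. It is an interpretation for illustration, not Querqy's or Lucene's implementation, and both function names are invented.

```python
def expand_single_term_synonym(input_terms: list, synonym: str) -> list:
    """One dismax group per input term, each group also containing the synonym."""
    return [[term, synonym] for term in input_terms]


def satisfies(groups: list, doc_terms: set, mm: int) -> bool:
    """True if at least mm groups have a term that occurs in the document."""
    matched = sum(1 for group in groups if any(t in doc_terms for t in group))
    return matched >= mm


groups = expand_single_term_synonym(["personal", "computer"], "pc")
print(groups)
# A document containing only 'pc' counts for both clauses.
print(satisfies(groups, {"pc"}, mm=2))
```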

UP/DOWN rules

UP and DOWN rules add a positive or negative boost query to the user query, which helps to bring documents that match the boost query further up or down in the result list.

The following rules add UP and DOWN queries to the input query ‘iphone’. The UP instruction promotes documents also containing ‘apple’ further to the top of the result list, while the DOWN query puts documents containing ‘case’ further down the search results:

iphone =>
  UP(10): apple
  DOWN(20): case

UP and DOWN both take boost factors as parameters. The default boost factor is 1.0. The interpretation of the boost factor is left to the search engine. However, UP(10):x and DOWN(10):x should normally cancel each other out.

By default, the right-hand side of UP and DOWN instructions will be parsed using a simple parser that splits on whitespace and marks tokens prefixed by - as ‘must not match’ and tokens starting with + as ‘must match’. (See ‘querqyParser’ in the configuration to set a different parser.)
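The behaviour of the simple parser can be sketched in Python as follows. This only mirrors the description above (split on whitespace, interpret the - and + prefixes); it is not the WhiteSpaceQuerqyParser itself, and the label ‘should’ for unprefixed tokens is an assumption of this example.

```python
def parse_boost_terms(rhs: str) -> list:
    """Split a right-hand side on whitespace and classify each token."""
    clauses = []
    for token in rhs.split():
        if token.startswith("-") and len(token) > 1:
            clauses.append(("must_not", token[1:]))
        elif token.startswith("+") and len(token) > 1:
            clauses.append(("must", token[1:]))
        else:
            clauses.append(("should", token))  # assumed label for plain tokens
    return clauses


print(parse_boost_terms("apple -case +iphone"))
```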

A special case are right-hand side definitions that start with *. The string following the * will be treated as a query in the syntax of the search engine.

In the following example we favour a certain price range as an interpretation of ‘cheap’ and penalise documents from category ‘accessories’:

cheap notebook =>
  UP(10): * {"range": {"price": {"gte": 350, "lte": 450}}}
  DOWN(20): * {"term": {"category": "accessories"}}

FILTER rules

Filter rules work similarly to UP and DOWN rules, but instead of moving search results up or down the result list they restrict search results to those that match the filter query. The following rule looks similar to the ‘iphone’ example above but it restricts the search results to documents that contain ‘apple’ and not ‘case’:

iphone =>
  FILTER: apple
  FILTER: -case

The filter is applied to all query fields, as defined in generated.query_fields or query_fields (Elasticsearch/OpenSearch) or in gqf or qf (Solr). For a required keyword (‘apple’), the filter matches if the keyword occurs in at least one query field. The negative filter (‘-case’) only matches documents in which the keyword occurs in none of the query fields.
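The difference between the positive and the negative filter can be illustrated with a toy document model in Python. The function passes_filter and the whitespace tokenisation of field values are assumptions of this sketch; a real search engine applies field analysis instead.

```python
def passes_filter(doc: dict, query_fields: list, term: str) -> bool:
    """Apply a keyword filter such as 'apple' or '-case' across all query fields."""
    negative = term.startswith("-")
    keyword = term.lstrip("-").lower()
    occurs = any(
        keyword in str(doc.get(field, "")).lower().split()
        for field in query_fields
    )
    # Required keyword: at least one field matches.
    # Negative keyword: no field may match.
    return not occurs if negative else occurs


fields = ["title", "brand"]
phone = {"title": "iPhone 13", "brand": "Apple"}
case = {"title": "iPhone case", "brand": "Apple"}
print(passes_filter(phone, fields, "apple"), passes_filter(phone, fields, "-case"))
print(passes_filter(case, fields, "apple"), passes_filter(case, fields, "-case"))
```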

The right-hand side of filter instructions accepts raw queries. To completely exclude results from category ‘accessories’ for query ‘notebook’:

notebook =>
  FILTER: * {"bool": { "must_not": [ {"term": {"category":"accessories"}}]}}

DELETE rules

Delete rules allow you to remove keywords from a query. This is comparable to stopwords. In Querqy keywords are removed before starting the field analysis chain. Delete rules are thus field-independent.

It often makes sense to apply delete rules in a separate rewriter in the rewrite chain before applying all other rules. This helps to remove stopwords that could otherwise prevent further Querqy rules from matching.

The following rule declares that whenever Querqy sees the input ‘cheap iphone’ it should remove keyword ‘cheap’ from the query and only search for ‘iphone’:

cheap iphone =>
  DELETE: cheap

While in this example the keyword ‘cheap’ will only be deleted if it is followed by ‘iphone’, you can also delete keywords regardless of the context:

cheap =>
  DELETE: cheap

or simply:

cheap =>

If the right-hand side of the delete instruction contains more than one term, each term will be removed from the query individually (= they are not considered a phrase and further terms can occur between them):

cheap iphone unlocked =>
  DELETE: cheap unlocked

The above rule would turn the input query ‘cheap iphone unlocked’ into search query ‘iphone’.

The following restrictions apply to delete rules:

  • Terms to be deleted must be part of the input declaration.

  • Querqy will not delete the only term in a query.
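The delete behaviour and its two restrictions can be modelled in a few lines of Python. This is a sketch of the rules as described above, not Querqy's code: apply_delete is a hypothetical function, matching is simplified to lower-cased whitespace tokens, and only the first occurrence of the input is rewritten.

```python
def apply_delete(query: str, input_terms: str, delete_terms: str) -> str:
    """Apply '<input_terms> => DELETE: <delete_terms>' to a query."""
    tokens = query.lower().split()
    pattern = input_terms.lower().split()
    to_delete = delete_terms.lower().split()
    # Restriction 1: terms to be deleted must be part of the input declaration.
    if any(term not in pattern for term in to_delete):
        raise ValueError("terms to delete must be part of the input")
    n = len(pattern)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == pattern:
            kept = [t for t in pattern if t not in to_delete]
            result = tokens[:i] + kept + tokens[i + n:]
            # Restriction 2: never delete the only remaining term of a query.
            return " ".join(result) if result else query
    return query  # input not found, rule does not apply


print(apply_delete("cheap iphone unlocked", "cheap iphone unlocked", "cheap unlocked"))
print(apply_delete("cheap", "cheap", "cheap"))
```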


DECORATE rules

This feature is only available for Solr.

Properties: ordering, filtering and tracking of rules

Imagine you have defined a rule for input ‘notebook’ that pushes documents containing ‘bag’ to the end of the search results. This will make a lot of sense: users searching for ‘notebook’ are probably less interested in notebook accessories, they are probably looking for a proper laptop computer!

As a next step you might want to create a rule for the query ‘notebook backpack’, maybe promoting all backpacks that fit 15” notebooks to the top of the search result list. But your first rule gets in the way: backpacks that fit 15” notebooks get pushed to the end of the search results if they contain the term ‘bag’.

You can of course put ‘notebook’ into double quotes when you define the down boost of documents containing ‘bag’ so that the rule would only match the exact query ‘notebook’ but not ‘notebook backpack’. But this would reduce the query coverage of this rule a lot. The rule would still be useful for queries like ‘notebook 16gb’ or ‘acer notebook’.

Using rule properties will give you an alternative solution to the problem as they allow for very flexible context-dependent rule ordering and selection. For example, you can also use rule properties to select rules per search platform tenant or per search user cohort.

Defining properties

There are two ways to define properties, both using the @ character. The first syntax declares one property per line:

notebook =>
  SYNONYM: laptop
  DOWN(100): case
  @_id: "ID1"
  @_log: "notebook,modified 2019-04-03"
  @group: "electronics"
  @enabled: true
  @priority: 100
  @tenant: ["t1", "t2", "t3"]
  @culture: {"lang": "en", "country": ["gb", "us"]}

The format of this syntax is

@<property name>: <property value> (on a single line)

where the property value can be any JSON value (string, number, boolean, array or object), as in the example above.
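Because the value part of a single-line property is plain JSON, it can be read with any JSON parser. A minimal Python sketch, assuming the `@<property name>: <property value>` format above (parse_property_line is a hypothetical helper, not part of Querqy):

```python
import json


def parse_property_line(line: str):
    """Parse '@<name>: <JSON value>' into a (name, value) pair."""
    line = line.strip()
    if not line.startswith("@") or ":" not in line:
        raise ValueError(f"not a single-line property: {line!r}")
    # Split at the first colon only, since JSON values may contain colons.
    name, _, raw_value = line[1:].partition(":")
    return name.strip(), json.loads(raw_value)


print(parse_property_line('@_id: "ID1"'))
print(parse_property_line('@culture: {"lang": "en", "country": ["gb", "us"]}'))
```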

The second format represents the properties as a JSON object. It can span multiple lines. The beginning is marked by @{ and the end by }@:

notebook =>
  SYNONYM: laptop
  DOWN(100): case
  @{
    _id: "ID1",
    _log: "notebook,modified 2019-04-03",
    group: "electronics",
    enabled: true,
    priority: 100,
    tenant: ["t1",   "t2", "t3"],
    culture: {
      "lang": "en",
      "country": ["gb", "us"]
    }
  }@

Property names can also be put in quotes. Both single quotes and double quotes are allowed.

Property names starting with _ (like _id and _log) have a special meaning in Querqy (see for example ‘Info logging’ in Logging and debugging rewriters).

Both formats can be mixed. However, the multi-line JSON object format can be used only once per rule:

notebook =>
  SYNONYM: laptop
  DOWN(100): case
  @_id: "ID1"
  @_log: "notebook,modified 2019-04-03"
  @enabled: true
  @{
    group: "electronics",
    priority: 100,
    tenant: ["t1",   "t2", "t3"],
    culture: {
      "lang": "en",
      "country": ["gb", "us"]
    }
  }@

Using properties for rule ordering and selection

We will use the following example to explain the use of properties for rule ordering and selection:

notebook =>
  DOWN(100): bag
  @enabled: false
  @{
    _id: "ID1",
    priority: 5,
    group: "electronics",
    tenant: ["t1", "t3"]
  }@

notebook backpack =>
  UP(100): 15"
  @enabled: true
  @{
    _id: "ID2",
    priority: 10,
    group: "accessories in electronics",
    tenant: ["t2", "t3"],
    culture: {
      "lang": "en",
      "country": ["gb", "us"]
    }
  }@

The two rules illustrate the problem that we described above: the first rule (‘ID1’) defines a down boost for all documents containing ‘bag’ if the query contains ‘notebook’. This makes sense, as users are probably less interested in notebook bags when they search for a notebook. The exception is a search for ‘notebook backpack’: in this case we would not want to apply rule ID1 but only rule ID2. Properties will help us solve this problem by ordering and selecting rules depending on the context.

sort and limit

We will tell Querqy to only apply the first rule after sorting them by the ‘priority’ property in descending order.

POST /myindex/_search

{
  "query": {
      "querqy": {
          "matching_query": {
              "query": "notebook"
          },
          "query_fields": [ "title^3.0", "brand^2.1", "shortSummary"],
          "rewriters": [
              {
                  "name": "common_rules",
                  "params": {
                      "criteria": {
                          "sort": "priority desc",
                          "limit": 1
                      }
                  }
              }
          ]
      }
  }
}

sort specifies the property to sort by and the sort order, which can take the values ‘asc’ and ‘desc’.

We set limit to 1 so that only the first rule after ordering will be applied.

Looking at the example rules again: for a query such as ‘notebook backpack’, which matches both rules, the second rule wins when we sort the rules by descending priority and limit the number of rules to be applied to 1, because the second rule has a priority of 10 while the first rule only has a priority of 5. But what happens if we have a second matching rule with priority 10 that we also want to apply? We could of course set limit=2, but what if someone adds a third rule with the same priority? We cannot keep changing the limit parameter, especially as we would have to know per query how many rules there are for the top priority value.

The problem can be solved by adding another parameter:

POST /myindex/_search

{
  "query": {
      "querqy": {
          "matching_query": {
              "query": "notebook"
          },
          "query_fields": [ "title^3.0", "brand^2.1", "shortSummary"],
          "rewriters": [
              {
                  "name": "common_rules",
                  "params": {
                      "criteria": {
                          "sort": "priority desc",
                          "limit": 1,
                          "limitByLevel": true
                      }
                  }
              }
          ]
      }
  }
}

limitByLevel will change the interpretation of the ‘limit’ parameter: if set to true, rules that have the same ‘priority’ (or any other sort criterion) will only count as 1 towards the limit. For example, limit=2 would select the first 5 elements in the list [10, 10, 8, 8, 8, 5, 4, 4].
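The interplay of sort, limit and limitByLevel can be reproduced with a short Python sketch. The function select_rules is an illustration of the described semantics, not Querqy's implementation, and it assumes that every rule carries the sort property.

```python
def select_rules(rules, sort_key, descending=True, limit=None, limit_by_level=False):
    """Sort rules by a property and apply limit, optionally counting by level."""
    ordered = sorted(rules, key=lambda r: r[sort_key], reverse=descending)
    if limit is None:
        return ordered
    if not limit_by_level:
        return ordered[:limit]
    selected, levels = [], []
    for rule in ordered:
        level = rule[sort_key]
        if level not in levels:
            if len(levels) == limit:
                break  # a new level would exceed the limit
            levels.append(level)
        selected.append(rule)
    return selected


rules = [{"priority": p} for p in [10, 10, 8, 8, 8, 5, 4, 4]]
picked = select_rules(rules, "priority", limit=2, limit_by_level=True)
print([r["priority"] for r in picked])
```

With limit=2 and limitByLevel enabled, the two highest levels (10 and 8) are kept, which gives the first five elements of the list, as stated above.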


Rules can also be filtered by properties using JsonPath expressions, where the general parameter syntax is:

{
  "query": {
      "querqy": {
          "rewriters": [
              {
                  "name": "common_rules",
                  "params": {
                      "criteria": {
                          "filter": "<JsonPath expression>"
                      }
                  }
              }
          ]
      }
  }
}

The properties that were defined for a given Querqy rule are treated as a JSON document, and a rule filter matches the rule if the JsonPath expression matches this JSON document. What follows is a list of examples that relate to the above rule definitions:

  • $[?(@.enabled == true)] matches ID2 but not ID1

  • $[?(@.group == 'electronics')] matches ID1 but not ID2

  • $[?(@.priority > 5)] matches ID2 but not ID1

  • $[?('t1' in @.tenant)] matches ID1 but not ID2

  • $[?(@.priority > 1 && @.culture)].culture[?(@.lang=='en')] matches ID2 but not ID1. The expression [?(@.priority > 1 && @.culture)] first tests for priority > 1 (both rules match) and then for the existence of ‘culture’, which only matches ID2. Then the ‘lang’ property of ‘culture’ is tested for matching ‘en’.

If more than one ‘filter’ request parameter is passed to a Common Rules Rewriter, a rule must match all filters to be applied.
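To check the examples by hand, the property documents of the two rules can be written down as Python dictionaries and the first four filters restated as plain predicates. This sketch does not evaluate JsonPath; each lambda is a hand-written equivalent of the expression used as its key, and all names are invented for the example.

```python
RULES = {
    "ID1": {"enabled": False, "priority": 5, "group": "electronics",
            "tenant": ["t1", "t3"]},
    "ID2": {"enabled": True, "priority": 10, "group": "accessories in electronics",
            "tenant": ["t2", "t3"],
            "culture": {"lang": "en", "country": ["gb", "us"]}},
}

# JsonPath expression -> hand-written Python equivalent
FILTERS = {
    "$[?(@.enabled == true)]": lambda p: p.get("enabled") is True,
    "$[?(@.group == 'electronics')]": lambda p: p.get("group") == "electronics",
    "$[?(@.priority > 5)]": lambda p: p.get("priority", 0) > 5,
    "$[?('t1' in @.tenant)]": lambda p: "t1" in p.get("tenant", []),
}


def matching_ids(predicate) -> list:
    """Ids of all rules whose properties satisfy the predicate."""
    return [rule_id for rule_id, props in RULES.items() if predicate(props)]


def matches_all(props: dict, predicates) -> bool:
    """Several filters are combined with AND."""
    return all(predicate(props) for predicate in predicates)


for expression, predicate in FILTERS.items():
    print(expression, matching_ids(predicate))
```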

Using properties for info logging

Querqy rewriters can emit info log messages that can be directed to various log message sinks. If configured, the Common Rules Rewriter will emit the value of the _log property as a log message. If this property is not defined, it will use the value of _id property and fall back to an auto-generated id if the ‘_id’ property also wasn’t specified.
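The fallback order for the log message can be written as a small Python function. This is a sketch of the order described above; the format of the auto-generated id is not specified here, so the uuid-based value is only a placeholder.

```python
import uuid


def log_message(properties: dict) -> str:
    """Pick the info log message for a rule: _log, then _id, then a generated id."""
    if "_log" in properties:
        return str(properties["_log"])
    if "_id" in properties:
        return str(properties["_id"])
    # Placeholder: the real auto-generated id format is not shown in this document.
    return f"auto-{uuid.uuid4()}"


print(log_message({"_id": "ID1", "_log": "notebook,modified 2019-04-03"}))
print(log_message({"_id": "ID1"}))
```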


See Configuring and applying a rewriter for instructions on how to add a rewriter to the rewrite chain.



PUT  /_querqy/rewriter/common_rules

{
  "class": "querqy.elasticsearch.rewriter.SimpleCommonRulesRewriterFactory",
  "config": {
      "rules" : "notebook =>\nSYNONYM: laptop",
      "ignoreCase": true,
      "querqyParser": "querqy.rewrite.commonrules.WhiteSpaceQuerqyParserFactory"
  }
}

rules
    The rule definitions.

    Default: (empty = no rules)

ignoreCase
    Ignore case in input matching for rules?

    Default: true

querqyParser
    The querqy.rewrite.commonrules.QuerqyParserFactory to use for parsing strings from the right-hand side of rules into query objects.

    Default: querqy.rewrite.commonrules.WhiteSpaceQuerqyParserFactory


{
  "query": {
      "querqy": {
          "rewriters": [
              {
                  "name": "common_rules",
                  "params": {
                      "criteria": {
                          "filter": "<JsonPath expression>",
                          "sort": "<sort property asc|desc>",
                          "limit": 1,
                          "limitByLevel": true
                      }
                  }
              }
          ]
      }
  }
}

filter
    Only apply rules that match the filter. A JsonPath expression that is evaluated against JSON documents that are created from the properties of the rules.

    Default: (not set)

sort
    Sort rules by a rule property in ascending or descending order (‘<property> <asc|desc>’).

    Default: (not set - rules are sorted in the order of definition)

limit
    The number of rules to apply after sorting.

    Default: (not set - apply all rules)

limitByLevel
    If true, rules having the same sort property value are counted only once towards the limit.

    Default: false
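When requests are assembled in application code, it can be convenient to include only those criteria that were actually set, so that the defaults listed above apply to the rest. The Python sketch below does this for the criteria object and embeds it in the request structure shown earlier. The helper build_criteria is hypothetical; the index fields and the rewriter name are taken from the examples in this document.

```python
import json


def build_criteria(rule_filter=None, sort=None, limit=None, limit_by_level=None) -> dict:
    """Return a criteria object that contains only the parameters that were set."""
    candidates = {
        "filter": rule_filter,
        "sort": sort,
        "limit": limit,
        "limitByLevel": limit_by_level,
    }
    return {name: value for name, value in candidates.items() if value is not None}


criteria = build_criteria(sort="priority desc", limit=1, limit_by_level=True)
request = {
    "query": {
        "querqy": {
            "matching_query": {"query": "notebook"},
            "query_fields": ["title^3.0", "brand^2.1", "shortSummary"],
            "rewriters": [
                {"name": "common_rules", "params": {"criteria": criteria}}
            ],
        }
    }
}
print(json.dumps(request, indent=2))
```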