Querqy 4 for Solr
This page documents Querqy 4 for Solr. Querqy 4 has been superseded by the
current version of Querqy for Solr, which manages rewriter configurations
via a REST API rather than solrconfig.xml. If you are running Querqy 4
and considering an upgrade, see the
migration guide.
Installation
Add the Querqy query parser and query component to your solrconfig.xml:
<!--
Add the Querqy query parser.
-->
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin"/>
<!--
Override the default QueryComponent.
-->
<searchComponent name="query" class="querqy.solr.QuerqyQueryComponent"/>
Making Queries
The Querqy query parser is enabled using the defType parameter:
/solr/mycollection/select?q=notebook&defType=querqy&qf=title^3.0 brand^2.1 shortSummary
The rewrite chain configured in solrconfig.xml is automatically applied to
every request.
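The qf value in the example contains spaces and ^ characters, so a client that sends the request from code usually needs to URL-encode it. The following is a minimal sketch in Python, assuming a Solr instance at the hypothetical address http://localhost:8983; the helper name build_select_url is illustrative and not part of Querqy or Solr:

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, query_fields):
    """Build a /select URL that routes the query through the Querqy parser."""
    params = {
        "q": query,
        "defType": "querqy",  # enables the Querqy query parser
        "qf": " ".join(query_fields),
    }
    # urlencode escapes spaces and '^' so the URL is safe to send as-is.
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983",  # hypothetical host and port
    "mycollection",
    "notebook",
    ["title^3.0", "brand^2.1", "shortSummary"],
)
print(url)
```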
Configuring Rewriters
In Querqy 4, rewriters are configured as child elements of the query parser in
solrconfig.xml. Together they form the rewrite chain, applied in the order
in which they are defined:
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">

  <lst name="rewriteChain">

    <!--
      Common Rules Rewriter
    -->
    <lst name="rewriter">

      <str name="id">commonRules</str>

      <str name="class">querqy.solr.SimpleCommonRulesRewriterFactory</str>

      <!--
        The file that contains rules for synonyms, boosting etc.
      -->
      <str name="rules">rules.txt</str>

    </lst>

    <!--

      You can add more rewriters here

      <lst name="rewriter">
        <str name="id">rewriter2</str>
        <str name="class">...</str>
        ....
      </lst>

      <lst name="rewriter">
        <str name="class">...</str>
        ....
      </lst>

    -->

  </lst>

</queryParser>
The lst element named rewriteChain serves as a container for the
rewriters.
Each rewriter is defined in a lst element named rewriter.
All rewriters must have a class property that specifies a factory for
creating the rewriter.
The id property is optional. In some cases the id is used to route
request parameters to a specific rewriter.
The ‘id’ and ‘class’ properties are the only properties that are available for all rewriters. Rewriters can have additional properties that will only have a meaning for the specific rewriter implementation.
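To check a rewrite chain before deploying it, the structure described above can be read back with a few lines of Python. This is only a sketch: the function name list_rewriters is hypothetical, the embedded snippet combines two factory classes that appear on this page, and the check covers nothing beyond the required class property.

```python
import xml.etree.ElementTree as ET

SOLRCONFIG_SNIPPET = """
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">
  <lst name="rewriteChain">
    <lst name="rewriter">
      <str name="id">commonRules</str>
      <str name="class">querqy.solr.SimpleCommonRulesRewriterFactory</str>
      <str name="rules">rules.txt</str>
    </lst>
    <lst name="rewriter">
      <str name="class">querqy.solr.contrib.ShingleRewriterFactory</str>
    </lst>
  </lst>
</queryParser>
"""


def list_rewriters(xml_text):
    """Return (id, class) pairs in the order the rewriters are defined."""
    root = ET.fromstring(xml_text)
    chain = root.find("lst[@name='rewriteChain']")
    if chain is None:
        return []
    rewriters = []
    for rewriter in chain.findall("lst[@name='rewriter']"):
        props = {s.get("name"): (s.text or "").strip()
                 for s in rewriter.findall("str")}
        if "class" not in props:
            raise ValueError("every rewriter needs a class property")
        # The id is optional, so it may be None.
        rewriters.append((props.get("id"), props["class"]))
    return rewriters


for rewriter_id, factory in list_rewriters(SOLRCONFIG_SNIPPET):
    print(rewriter_id, factory)
```

Because the chain is applied in definition order, the order of the returned pairs is also the order in which the rewriters run.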
In the example, the rules property specifies the resource that contains rule
definitions for the ‘Common Rules Rewriter’. Resources are files that are either
kept in ZooKeeper as part of the configset (SolrCloud) or in the ‘conf’ folder
of a Solr core in standalone or master-slave Solr. They can be gzipped, which
will be auto-detected by Querqy, regardless of the file name. If you keep your
files in ZooKeeper, remember the maximum file size in ZooKeeper (default: 1 MB).
Rewriter Configurations
Common Rules Rewriter
The rules for the Common Rules Rewriter are maintained in the resource configured
as the rules property of the SimpleCommonRulesRewriterFactory:
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">
  <lst name="rewriteChain">
    <lst name="rewriter">
      <str name="class">querqy.solr.SimpleCommonRulesRewriterFactory</str>
      <str name="rules">rules.txt</str>
    </lst>
  </lst>
</queryParser>
The rules file must be in UTF-8 character encoding. The maximum file size is 1 MB if Solr runs as SolrCloud and if you did not change the maximum file size in ZooKeeper. The file can be gzipped — Querqy will automatically detect and decompress it.
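The encoding and size constraints can be checked before a rules file is uploaded. The sketch below is an assumption-laden illustration: it treats 1 MB as 1024 * 1024 bytes (the exact ZooKeeper limit in your installation may differ), the function name prepare_rules is hypothetical, and the sample content is a placeholder rather than real rule syntax.

```python
import gzip

# Assumption: 1 MB taken as 1024 * 1024 bytes; adjust to your ZooKeeper setup.
ZK_MAX_BYTES = 1024 * 1024


def prepare_rules(raw: bytes, limit: int = ZK_MAX_BYTES) -> bytes:
    """Validate UTF-8 and gzip the content only if it would exceed the limit."""
    raw.decode("utf-8")  # raises UnicodeDecodeError if the file is not UTF-8
    if len(raw) < limit:
        return raw
    packed = gzip.compress(raw)
    if len(packed) >= limit:
        raise ValueError("rules file is too large even when gzipped")
    return packed


sample = "placeholder rule content\n".encode("utf-8")
print(len(prepare_rules(sample)), "bytes ready for upload")
```

Since Querqy detects gzipped content regardless of the file name, the returned bytes can be stored under the same name as the uncompressed file.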
Configuration reference:
<lst name="rewriter">
  <str name="class">querqy.solr.SimpleCommonRulesRewriterFactory</str>
  <str name="rules">rules.txt</str>
  <bool name="ignoreCase">true</bool>
  <bool name="buildTermCache">true</bool>
  <str name="boostMethod">MULTIPLICATIVE</str>
  <str name="querqyParser">querqy.rewrite.commonrules.WhiteSpaceQuerqyParserFactory</str>
</lst>
- rules
The rule definitions file. The file is kept in the configset of the collection in ZooKeeper (SolrCloud) or in the ‘conf’ folder of the Solr core in standalone or master-slave Solr. Can be gzipped (max 1 MB in ZooKeeper).
Required.
- ignoreCase
Ignore case in input matching for rules?
Default:
true
- buildTermCache
Whether to build a term cache from matching terms. This is an optimisation that might not be feasible for very large rule lists.
Default:
true
- boostMethod
How to combine UP/DOWN boosts with the score of the main user query. Available methods are ADDITIVE and MULTIPLICATIVE.
Default:
ADDITIVE
- querqyParser
The querqy.rewrite.commonrules.QuerqyParserFactory to use for parsing strings from the right-hand side of rules into query objects.
Default:
querqy.rewrite.commonrules.WhiteSpaceQuerqyParserFactory
Rule selection by property
To use rule selection (filtering rules by property), the rewriter must have an
id configured in solrconfig.xml. This id is then referenced in request
parameters:
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">
  <lst name="rewriteChain">
    <lst name="rewriter">
      <!--
        Note the rewriter ID:
      -->
      <str name="id">common1</str>
      <str name="class">querqy.solr.SimpleCommonRulesRewriterFactory</str>
      <str name="rules">rules.txt</str>
      <!-- ... -->
    </lst>
  </lst>
</queryParser>
Rule selection request parameters:
querqy.common1.criteria.sort=priority desc
querqy.common1.criteria.limit=1
The parameters have the prefix querqy.<rewriterID>.criteria where the
rewriter ID matches the id configured in solrconfig.xml.
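The parameter naming scheme can be captured in a small helper so that the rewriter ID is written in only one place. This is a sketch rather than a Querqy client API: the function name rule_selection_params is hypothetical, and it covers only the sort and limit criteria shown above.

```python
from urllib.parse import urlencode


def rule_selection_params(rewriter_id, sort=None, limit=None):
    """Build request parameters with the querqy.<rewriterID>.criteria prefix."""
    prefix = f"querqy.{rewriter_id}.criteria"
    params = {}
    if sort is not None:
        params[f"{prefix}.sort"] = sort
    if limit is not None:
        params[f"{prefix}.limit"] = str(limit)
    return params


params = {"q": "notebook", "defType": "querqy"}
# "common1" must match the id configured for the rewriter in solrconfig.xml.
params.update(rule_selection_params("common1", sort="priority desc", limit=1))
query_string = urlencode(params)
print(query_string)
```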
Replace Rewriter
<lst name="rewriter">
  <str name="class">querqy.solr.contrib.ReplaceRewriterFactory</str>
  <str name="rules">replace-rules.txt</str>
  <str name="ignoreCase">true</str>
  <str name="inputDelimiter">;</str>
  <str name="querqyParser">querqy.rewrite.commonrules.WhiteSpaceQuerqyParserFactory</str>
</lst>
The rules property references a file in ZooKeeper (SolrCloud) or in the
conf directory (standalone) that contains the replace rules. The property
ignoreCase controls whether query terms are matched case-insensitively
(default: true). The property inputDelimiter sets the delimiter that
separates multiple input definitions mapped to the same output
(default: tab).
Word Break Rewriter
<lst name="rewriter">
  <str name="class">querqy.solr.contrib.WordBreakCompoundRewriterFactory</str>
  <str name="dictionaryField">f1</str>
  <bool name="lowerCaseInput">true</bool>
  <int name="decompound.maxExpansions">5</int>
  <bool name="decompound.verifyCollation">true</bool>
  <str name="morphology">GERMAN</str>
  <arr name="reverseCompoundTriggerWords">
    <str>for</str>
  </arr>
  <arr name="protectedWords">
    <str>slipper</str>
    <str>wissenschaft</str>
  </arr>
</lst>
Number-Unit Rewriter
<lst name="rewriter">
  <str name="class">querqy.solr.contrib.NumberUnitRewriterFactory</str>
  <str name="config">number-unit-config.json</str>
</lst>
The config property references a JSON configuration file in ZooKeeper
(SolrCloud) or in the conf directory (standalone).
Shingle Rewriter
<lst name="rewriter">
  <str name="class">querqy.solr.contrib.ShingleRewriterFactory</str>
  <bool name="acceptGeneratedTerms">false</bool>
</lst>
- acceptGeneratedTerms
If true, also create shingle tokens from terms that were created by other rewriters earlier in the rewrite chain.
Default:
false
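As a rough illustration of what a shingle token is, the sketch below concatenates each pair of adjacent query terms. It assumes two-term shingles joined without a separator and shows the general idea only; it is not Querqy's implementation, and the function name shingles is hypothetical.

```python
def shingles(terms):
    """Concatenate each pair of adjacent terms into a single token."""
    # zip pairs every term with its right-hand neighbour.
    return [left + right for left, right in zip(terms, terms[1:])]


print(shingles(["iphone", "8", "case"]))  # ['iphone8', '8case']
```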
Advanced Configuration
Term Query Cache
The term query cache avoids building Lucene queries for sub-queries that never match in specific fields.
Version-independent cache configuration (solrconfig.xml):
<query>
  <!-- Place a custom cache in the <query> section: -->
  <cache name="querqyTermQueryCache"
         class="solr.LFUCache"
         size="1024"
         initialSize="1024"
         autowarmCount="0"
         regenerator="solr.NoOpRegenerator"
  />
  <listener event="firstSearcher" class="querqy.solr.TermQueryCachePreloader">
    <str name="fields">f1 f2</str>
    <str name="qParserPlugin">querqy</str>
    <str name="cacheName">querqyTermQueryCache</str>
    <bool name="testForHits">true</bool>
  </listener>
  <listener event="newSearcher" class="querqy.solr.TermQueryCachePreloader">
    <str name="fields">f1 f2</str>
    <str name="qParserPlugin">querqy</str>
    <str name="cacheName">querqyTermQueryCache</str>
    <bool name="testForHits">true</bool>
  </listener>
</query>
Tell the Querqy query parser to use the custom cache:
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">
  <str name="termQueryCache.name">querqyTermQueryCache</str>
  <bool name="termQueryCache.update">false</bool>
  <lst name="rewriteChain">
    <!-- ... -->
  </lst>
</queryParser>
The Query String Parser
The query string parser defines how the query string passed in the request
parameter q is parsed. It can be set using the parser property of the query
parser configuration:
<queryParser name="querqy" class="querqy.solr.DefaultQuerqyDismaxQParserPlugin">
  <str name="parser">querqy.parser.WhiteSpaceQuerqyParser</str>
  <!-- ... -->
</queryParser>
The default WhiteSpaceQuerqyParser is sufficient for most use cases.
Migrating to the Current Version
For detailed information about changes and a migration guide from Querqy 4 to the current version, see Migrating to Querqy 5 for Solr.