WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Tuning Best Fields Queriesedit
What would happen if the user had searched instead for “quick pets”? Both
documents contain the word quick
, but only document 2 contains the word
pets
. Neither document contains both words in the same field.
A simple dis_max
query like the following would choose the single best
matching field, and ignore the other:
{ "query": { "dis_max": { "queries": [ { "match": { "title": "Quick pets" }}, { "match": { "body": "Quick pets" }} ] } } }
{ "hits": [ { "_id": "1", "_score": 0.12713557, "_source": { "title": "Quick brown rabbits", "body": "Brown rabbits are commonly seen." } }, { "_id": "2", "_score": 0.12713557, "_source": { "title": "Keeping pets healthy", "body": "My quick brown fox eats rabbits on a regular basis." } } ] }
We would probably expect documents that match on both the title
field and
the body
field to rank higher than documents that match on just one field,
but this isn’t the case. Remember: the dis_max
query simply uses the
_score
from the single best-matching clause.
tie_breakeredit
It is possible, however, to also take the _score
from the other matching
clauses into account, by specifying the tie_breaker
parameter:
{ "query": { "dis_max": { "queries": [ { "match": { "title": "Quick pets" }}, { "match": { "body": "Quick pets" }} ], "tie_breaker": 0.3 } } }
This gives us the following results:
{ "hits": [ { "_id": "2", "_score": 0.14757764, "_source": { "title": "Keeping pets healthy", "body": "My quick brown fox eats rabbits on a regular basis." } }, { "_id": "1", "_score": 0.124275915, "_source": { "title": "Quick brown rabbits", "body": "Brown rabbits are commonly seen." } } ] }
The tie_breaker
parameter makes the dis_max
query behave more like a
halfway house between dis_max
and bool
. It changes the score calculation
as follows:
-
Take the
_score
of the best-matching clause. -
Multiply the score of each of the other matching clauses by the
tie_breaker
. - Add them all together and normalize.
With the tie_breaker
, all matching clauses count, but the best-matching
clause counts most.
The tie_breaker
can be a floating-point value between 0
and 1
, where 0
uses just the best-matching clause and 1
counts all matching clauses
equally. The exact value can be tuned based on your data and queries, but a
reasonable value should be close to zero, (for example, 0.1 - 0.4
), in order not to
overwhelm the best-matching nature of dis_max
.