Tuesday, July 17, 2018

Elasticsearch sort: by one from each group and repeat

Leave a Comment

I need to get items with max value from each name and repeat until end.

I'll explain it on simple example. I have such items:

Name| Value ----------- AAA | 12 AAA | 35 AAA | 5 BBB | 1 BBB | 10 BBB | 5 

Expected result after sort:

Name| Value ----------- AAA | 35 BBB | 10 AAA | 12 BBB | 5 AAA | 5 BBB | 1 

I know how to do it in Postgres (window functions: rank() over()), but is it possible in Elastic?

3 Answers

Answers 1

You have to do something like Group by max

Here is Example

GET /yourindex/_search { "size": 0   "aggs": {     "yourGroup": {       "terms": {         "field": "Name",         "size": 10       },       "aggs": {         "theMax": {           "max": {             "field": "Value"           }         }       }     }   } } 

Reference:- this

Answers 2

Aggregating here my comments.

Showing >1 million records is a bad idea no matter how those documents are sorted, when it comes to Elasticsearch. My questions were asked to see how appropriate is to create a second ES index with the results of probably 1 query + post processing and holding something like "first 1000 records" (meaning a human reasonable list of documents) and to update that list periodically (every 10 seconds or so). You could use Watcher to build this index and keep it updated. 1 million records, as I said, is both impractical (who would look at 1mil docs) and not performant from ES point of view.

Basically, keep a separate index which should include only first 1000 documents that are sorted according to your requirements. And this index is updated regularly, not your main one with 1mil documents. Regarding pagination and 1 mil. documents... how many pages do you believe your users will go through?! 10, 15, 20? Not even google.com is giving you everything. Only few tens of pages, even though there can be hundreds of millions of matches. Keep in mind that Elasticsearch is a search engine, not a database. The aim is to give you the best matching docs, not all of them.

The query from Watcher will run over all the documents in your main index. It will aggregate the documents according to your requirements (I think a terms aggregation on Name, ordered by Value), you can add a post-processing step to create the order you need and then index that into a second index. Next time the watch will trigger, it will delete the old index, perform the same query again and index the new results in the (now empty) index.

Answers 3

Elasticsearch supports sorting by array or multi-valued fields. The mode option controls what array value is picked for sorting the document it belongs to. The mode option can have the following values:

min -- Pick the lowest value.

max -- Pick the highest value.

sum -- Use the sum of all values as sort value. Only applicable for number based array fields.

avg -- Use the average of all values as sort value. Only applicable for number based array fields.

median -- Use the median of all values as sort value. Only applicable for number based array fields.

Sort Order: The order option can have the following values:

asc -- Sort in ascending order

desc -- Sort in descending order

Below is a example in which field price has multiple prices per document. In this case the result hits will be sorted by price descending based on the average price per document.

PUT /my_index/_doc/1?refresh {    "product": "chocolate",    "price": [20, 4] }  POST /_search {    "query" : {       "term" : { "product" : "chocolate" }    },    "sort" : [       {"price" : {"order" : "desc", "mode" : "avg"}}    ] } 

Nested sorting example In the below example offer is a field of type nested. The nested path needs to be specified; otherwise, Elasticsearch doesn’t know on what nested level sort values need to be captured.

POST /_search {    "query" : {       "term" : { "product" : "chocolate" }    },    "sort" : [        {           "offer.price" : {              "mode" :  "avg",              "order" : "asc",              "nested": {                 "path": "offer",                 "filter": {                    "term" : { "offer.color" : "blue" }                 }              }           }        }     ] } 

Please refer this link Elastic search sort for detailed explanation and much more examples.

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment