Sunday, November 26, 2017

How to calculate the overlap / elapsed time range in elasticsearch?


I have some records in Elasticsearch. Each one is an attendance record for an online meeting, and people join and leave at different times.

    {"name": "p1", "join": "2017-11-17T00:01:00.293Z", "leave": "2017-11-17T00:06:00.293Z"}
    {"name": "p2", "join": "2017-11-17T00:02:00.293Z", "leave": "2017-11-17T00:04:00.293Z"}
    {"name": "p3", "join": "2017-11-17T00:03:00.293Z", "leave": "2017-11-17T00:05:00.293Z"}

Time range could be something like this:

    p1: [============================================]
    p2:         [=================]
    p3:                  [==================]

The question is how to calculate the overlapping time range (the common/meeting/shared time), which should be 3 min.

A follow-up question: is it possible to find out from when to when there were 1, 2, or 3 people present? For the records above that would be 2 min with 2 persons and 1 min with 3 persons.

2 Answers

Answers 1

I don't think it's possible to do this with Elasticsearch alone, simply because the search would have to go through all the matching documents and compute the result from them.

I would do it in following steps.

1. Before indexing a new document, search for existing documents that overlap it.

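    GET /meetings/_search
    {
      "query": {
        "bool": {
          "must": [
            { "range": { "join":  { "lte": "2007-10-01T00:00:00" } } },
            { "range": { "leave": { "gte": "2007-10-01T00:00:00" } } }
          ]
        }
      }
    }

A record overlaps when it joined no later than the upper bound and left no earlier than the lower bound. With a single timestamp, as here, the query matches records in progress at that instant; for a new record, put its leave time in the join clause and its join time in the leave clause.
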
2. Do all the overlap calculations on the back end, across the documents that overlap.
3. Save the overlap metadata you need on the documents as a nested object.
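
The back-end calculation in step 2 could look something like the sketch below. It is only an illustration, not part of the original answer: it assumes the overlapping records have already been fetched as plain dicts with the join/leave fields from the question, and the function names (parse_ts, concurrency_timeline, time_by_headcount) are made up for this example. It sweeps over the join and leave events in time order and records how many people are present in each segment.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_ts(ts):
    # The sample timestamps end in "Z"; swap it for an explicit UTC offset
    # so that datetime.fromisoformat accepts it on older Python versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def concurrency_timeline(records):
    """Return (start, end, count) segments during which attendance is constant."""
    events = []
    for r in records:
        events.append((parse_ts(r["join"]), 1))    # someone joins
        events.append((parse_ts(r["leave"]), -1))  # someone leaves
    # Sort by time; at equal times, leaves (-1) sort before joins (+1).
    events.sort()
    segments = []
    count = 0
    prev = None
    for t, delta in events:
        # Close the segment that ends at t, skipping gaps where nobody is present.
        if prev is not None and t > prev and count > 0:
            segments.append((prev, t, count))
        count += delta
        prev = t
    return segments


def time_by_headcount(segments):
    """Sum up the total time spent at each attendance level."""
    totals = defaultdict(timedelta)
    for start, end, count in segments:
        totals[count] += end - start
    return dict(totals)


# The three sample records from the question.
records = [
    {"name": "p1", "join": "2017-11-17T00:01:00.293Z", "leave": "2017-11-17T00:06:00.293Z"},
    {"name": "p2", "join": "2017-11-17T00:02:00.293Z", "leave": "2017-11-17T00:04:00.293Z"},
    {"name": "p3", "join": "2017-11-17T00:03:00.293Z", "leave": "2017-11-17T00:05:00.293Z"},
]

segments = concurrency_timeline(records)
totals = time_by_headcount(segments)
# Time during which at least two people were present.
shared = sum((d for n, d in totals.items() if n >= 2), timedelta())
```

For the sample records this gives 2 min with 2 persons and 1 min with 3 persons, so 3 min of shared time in total, which matches the numbers in the question. The segments list carries the exact boundaries, so it could be what gets saved as the overlap metadata in step 3.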

Answers 2

You can do the first part easily using max(join) and min(leave):

    GET your_index/your_type/_search
    {
      "size": 0,
      "aggs": {
        "startTime": { "max": { "field": "join" } },
        "endTime":   { "min": { "field": "leave" } }
      }
    }

You can then compute endTime - startTime, either when you process the Elasticsearch response or with a bucket script aggregation. The result may be negative, in which case there is no overlap.
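
Here is a rough sketch of the client-side variant, i.e. computing endTime - startTime while processing the response. It assumes the response has already been parsed into a dict and that the max/min aggregations on the date fields report their result as epoch milliseconds under "value"; the sample response below is written by hand for the three records in the question, not captured from a real cluster, and overlap_from_aggs is a made-up helper name.

```python
from datetime import timedelta


def overlap_from_aggs(response):
    """Return the time shared by ALL matched records, or None if nothing matched."""
    aggs = response["aggregations"]
    start_ms = aggs["startTime"]["value"]  # max(join), epoch milliseconds
    end_ms = aggs["endTime"]["value"]      # min(leave), epoch milliseconds
    if start_ms is None or end_ms is None:
        return None  # assumption: the aggregations report null when no documents match
    delta_ms = end_ms - start_ms
    if delta_ms <= 0:
        # Negative (or zero) difference: the records have no common overlap.
        return timedelta(0)
    return timedelta(milliseconds=delta_ms)


# Hand-written stand-in for the aggregation part of a search response.
sample_response = {
    "aggregations": {
        "startTime": {"value": 1510876980293.0},  # 2017-11-17T00:03:00.293Z
        "endTime": {"value": 1510877040293.0},    # 2017-11-17T00:04:00.293Z
    }
}

common = overlap_from_aggs(sample_response)
```

Note that max(join) and min(leave) describe the window in which every matched record is present. For the sample records that window is 1 min (00:03 to 00:04), whereas the 3 min in the question counts the time during which at least two people are present.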

For the second part, it depends on what you want. If you want the exact boundaries, which may be hard to read, you can use a Scripted Metric Aggregation.

If you want the number of people per slot (per hour, for instance), a Date Histogram Aggregation may be easier.
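
To make the "number per slot" idea concrete, here is a small client-side sketch. It is not the Date Histogram Aggregation itself, just an illustration of the kind of output to expect: it assumes the records are available as dicts, uses one-minute slots instead of hours so the sample data shows something, and aligns the slots to the earliest join time, which is an arbitrary choice made for this example. The name headcount_per_slot is made up.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    # Swap the trailing "Z" for an explicit UTC offset so fromisoformat accepts it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def headcount_per_slot(records, slot=timedelta(minutes=1)):
    """Return (slot_start, count) pairs; a record counts if it intersects the slot."""
    intervals = [(parse_ts(r["join"]), parse_ts(r["leave"])) for r in records]
    if not intervals:
        return []
    first = min(join for join, _ in intervals)
    last = max(leave for _, leave in intervals)
    slots = []
    start = first
    while start < last:
        end = start + slot
        # A record is present in [start, end) if it joined before the slot ends
        # and left after the slot starts.
        present = sum(1 for join, leave in intervals if join < end and leave > start)
        slots.append((start, present))
        start = end
    return slots


records = [
    {"name": "p1", "join": "2017-11-17T00:01:00.293Z", "leave": "2017-11-17T00:06:00.293Z"},
    {"name": "p2", "join": "2017-11-17T00:02:00.293Z", "leave": "2017-11-17T00:04:00.293Z"},
    {"name": "p3", "join": "2017-11-17T00:03:00.293Z", "leave": "2017-11-17T00:05:00.293Z"},
]

per_slot = headcount_per_slot(records)
counts = [n for _, n in per_slot]
```

With coarser slots the boundaries get blurred: a person who is present for only part of a slot still counts for the whole slot, which is the trade-off against the exact-boundaries approach.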
