I'm trying to query an array in ElasticSearch
data: "names":[{"name":"allen"},{"name":"bill"},{"name":"dave"},{"name":"poter"}] goal: "select names from table where array_contains(names.name, "bill")" but spark won't do predicate pushdown if SQL statement use array_contains function.
hint: names.name = ["allen","bill","dave","poter"]
I've tried
select * from table where array_contains(names.name,"bill") -- and select explode(names.name) as name from table as t1;select * from t1 where name = "bill" -- and select * from table where cast(names.name as string) like '%bill%' All failed to do pushdown, any other ways to do it?
1 Answers
Answers 1
The failure to pushdown is expected. For predicate to be delegated you need a Data Source support, and ElasticSearch connector doesn't list array_contains among pushed operations, which as today include:
=,=>,<,>=,<=is_null/is_not_nullinString[Starts|Ends]With,StringContainsNULLsafe equality.- Application of Boolean operators
AND/OR/NOT.
Also any additional transformations (including CAST) disable predicate pushdown.
0 comments:
Post a Comment