Thursday, June 29, 2017

Improving performance of hive jdbc


Does anyone know how to improve the performance of a Hive JDBC connection?

Detailed problem:

When I query Hive from the Hive CLI, I get a response within 7 seconds, but over a Hive JDBC connection the response takes 14 seconds. Is there any way (for example, configuration changes) to improve query performance over the JDBC connection?

Thanks in advance.

2 Answers

Answers 1

Please try the options below.

  1. If your query has joins, try setting hive.auto.convert.join to true.
  2. Try tuning the Java heap size and garbage collection settings (reference: Link).
  3. Change the execution engine to Tez using set hive.execution.engine=tez; To check the currently set engine, run set hive.execution.engine; with no value.
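One way to apply settings such as those in options 1 and 3 to a JDBC session is to pass them in the connection URL. The following is a minimal sketch, assuming a HiveServer2 endpoint whose JDBC URL accepts Hive configuration as semicolon-separated key=value pairs after a "?". The class name HiveJdbcUrl and the host, port, and database values are hypothetical placeholders, and the server may be configured to reject some settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Sketch only: builds a HiveServer2 JDBC URL that carries Hive settings
// in its "?key=value;key=value" configuration section.
class HiveJdbcUrl {

    // Returns jdbc:hive2://host:port/database, followed by the settings if any.
    static String build(String host, int port, String database, Map<String, String> hiveConf) {
        String base = "jdbc:hive2://" + host + ":" + port + "/" + database;
        if (hiveConf.isEmpty()) {
            return base;
        }
        // Joins the pairs with ";" and prefixes the whole list with "?".
        StringJoiner conf = new StringJoiner(";", "?", "");
        for (Map.Entry<String, String> entry : hiveConf.entrySet()) {
            conf.add(entry.getKey() + "=" + entry.getValue());
        }
        return base + conf;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the settings in insertion order.
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("hive.auto.convert.join", "true");
        conf.put("hive.execution.engine", "tez");
        // Host, port, and database below are placeholders, not values from the question.
        System.out.println(build("localhost", 10000, "default", conf));
    }
}
```

The resulting string can be passed to DriverManager.getConnection together with the usual credentials, so every query on that connection runs with the same settings.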

Other Hive performance configuration tips can be found in the Link

Please let me know the results.

Answers 2

Performance can be improved by changing these configuration parameters:

-- enable cost based optimizer

set hive.cbo.enable=true;

set hive.compute.query.using.stats=true;

set hive.stats.fetch.column.stats=true;

set hive.stats.fetch.partition.stats=true;

-- collect column statistics (replace <table_name> with your table)

analyze table <table_name> compute statistics for columns;

--enable vectorization of queries.

set hive.vectorized.execution.enabled = true;

set hive.vectorized.execution.reduce.enabled = true;
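Over JDBC, the same settings can be issued as ordinary statements on the connection before the query runs. The sketch below assumes the Hive JDBC driver is on the classpath and that the server allows these properties to be changed per session. The class name HiveSessionSettings and the method names are hypothetical. Statements sent through JDBC are written without the trailing semicolon used in the CLI.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: applies the session settings listed above over JDBC.
class HiveSessionSettings {

    // The six settings from the answer, in the order given there.
    static Map<String, String> tuningSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hive.cbo.enable", "true");
        settings.put("hive.compute.query.using.stats", "true");
        settings.put("hive.stats.fetch.column.stats", "true");
        settings.put("hive.stats.fetch.partition.stats", "true");
        settings.put("hive.vectorized.execution.enabled", "true");
        settings.put("hive.vectorized.execution.reduce.enabled", "true");
        return settings;
    }

    // Turns each entry into a "set key=value" statement (no trailing semicolon).
    static List<String> toSetStatements(Map<String, String> settings) {
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            statements.add("set " + entry.getKey() + "=" + entry.getValue());
        }
        return statements;
    }

    // Opens a connection, applies the settings, then runs the query and
    // prints the first column of each row. Not called from main below.
    static void runWithSettings(String jdbcUrl, String user, String password, String query)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
             Statement stmt = conn.createStatement()) {
            for (String setStatement : toSetStatements(tuningSettings())) {
                stmt.execute(setStatement);
            }
            try (ResultSet rs = stmt.executeQuery(query)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Prints the statements only, so this runs without a Hive server.
        toSetStatements(tuningSettings()).forEach(System.out::println);
    }
}
```

The settings only last for the session, so they need to be applied on the same connection that runs the query. The analyze table statement is a one-off step per table and does not need to be repeated for every query.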
