How to configue session timeout in Hive

简介: This article explains how to configure the following settings in Hive:hive.server2.
This article explains how to configure the following settings in Hive:


hive.server2.session.check.interval
hive.server2.idle.operation.timeout

hive.server2.idle.session.timeout


1). hive.server2.idle.session.timeout
Session will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero or a negative value.
For example, the value of “86400000” indicate that the session will be timed out after 1 day of inactivity.


2). hive.server2.session.check.interval
The check interval for session/operation timeout, in milliseconds, which can be disabled by setting to zero or a negative value.
For example, the value of “3600000” indicate that the session will be checked every 1 hour.


3) hive.server2.idle.operation.timeout
Operation will be closed when not accessed for this duration of time, in milliseconds; disable by setting to zero. For a positive value, checked for operations in terminal state only (FINISHED, CANCELED, CLOSED, ERROR). For a negative value, checked for all of the operations regardless of state.
For example, the value of “7200000” indicate that the query/operation will be timed out after 2 hours if it is still running.


So if you combine the three settings from above examples, we can summarize the following use cases:


1) If you started a HS2 session, beeline for example, and without doing anything afterwards, HS2 will trigger 24 session checking before it determines that 24 hours has passed since last activity, then session will be closed


2) If you worked on beeline for 2 hours and then leave the beeline open without doing anything afterwards, HS2 will trigger total of 26 session checking (2 while you worked and another 24 while in idle), and the session will be closed 26 hours after initially opened.


3) If you worked on beeline for 2 hours, and you started running a query that will run for 1 hour and then returns result, the idle timer actually starts from the time when data returns, so if you don’t do anything afterwards, HS2 will kill the session after another 24 hours, so in total, the session lasted 27 hours (2+1+24)


4) If, for instance, your query takes longer than the 2 hours defined by hive.server2.idle.operation.timeout, then the query will be cancelled, hence you will lose the result all together


5) If hive.server2.session.check.interval = 0 and hive.server2.idle.session.timeout > 0, then it will have the same effect that hive.server2.idle.session.timeout = 0, because no idle timer will be triggered as check interval is disabled


6) If hive.server2.session.check.interval > hive.server2.idle.session.timeout > 0, then the actual time out will be the same as hive.server2.session.check.interval.


For example if hive.server2.session.check.interval = 20 minutes, but hive.server2.idle.session.timeout = 10 minutes, then timeout will happen when checking happens, which is 20 minutes.


The basic rule of thumb is as follows:
hive.server2.session.check.interval < hive.server2.idle.operation.timeout < hive.server2.idle.session.timeout


And the recommended values are:


hive.server2.session.check.interval = 1 hour
hive.server2.idle.operation.timeout = 1 day
hive.server2.idle.session.timeout = 3 days


This will work for most of clusters, but you can change the value depending on how your cluster behaves and how long each query run for.
目录
相关文章
|
SQL 索引
简单了解RBO、CBO和HBO
简单了解RBO、CBO和HBO
|
分布式计算 Kubernetes Java
spark on k8s 镜像构建
spark on k8s 镜像构建
877 0
|
开发工具 git
解决:‘git‘ 不是内部或外部命令,也不是可运行的程序
解决:‘git‘ 不是内部或外部命令,也不是可运行的程序
解决:‘git‘ 不是内部或外部命令,也不是可运行的程序
|
消息中间件 SQL Kafka
离线数仓(四)【数仓数据同步策略】(1)
离线数仓(四)【数仓数据同步策略】
|
安全 Java 开发者
深入解析ReentrantLock重入锁:Java多线程中的利器
深入解析ReentrantLock重入锁:Java多线程中的利器
2646 4
|
Java Maven
修改配置maven镜像仓库位置,将maven镜像更换成阿里镜像
修改配置maven镜像仓库位置,将maven镜像更换成阿里镜像
5358 0
|
Linux
linux cat查看文件使用grep实现多条件多场景过滤
linux cat查看文件使用grep实现多条件多场景过滤
765 0
springboot 各种文件下载方式(最全)
springboot 各种文件下载方式(最全)
4912 0
|
机器学习/深度学习 分布式计算 Hadoop
记一次HDFS报EOFException异常的问题
现象 大晚上的收到线上DataNode挂掉异常的报警,值班同学随即做了重启处理,重启完成后,进程虽然在运行,但是NameNode的WebUI上显示大量的block丢失。 There are 12622047 missing blocks. Number of Under-Replicated Blocks 14436901 重新启动的DataNode节点block数量为0,明显不正常 HDFS在对丢失的block做恢复,missing blocks的数量在减少,但是丢失的的太多了,恢复速度很慢,这种情况肯定不能指望集群自动恢复的。
1449 0