How to configure session timeout in Hive

This article explains how to configure the following settings in Hive:



1) hive.server2.idle.session.timeout
The session is closed when it has not been accessed for this duration, in milliseconds; disable by setting to zero or a negative value.
For example, a value of “86400000” means that the session times out after 1 day of inactivity.

2) hive.server2.session.check.interval
The check interval for session/operation timeout, in milliseconds; disable by setting to zero or a negative value.
For example, a value of “3600000” means that sessions are checked every hour.

3) hive.server2.idle.operation.timeout
The operation is closed when it has not been accessed for this duration, in milliseconds; disable by setting to zero. For a positive value, only operations in a terminal state (FINISHED, CANCELED, CLOSED, ERROR) are checked. For a negative value, all operations are checked regardless of state.
For example, a value of “7200000” means that an operation in a terminal state is closed after 2 hours without being accessed; a value of “-7200000” applies the same 2-hour limit to operations that are still running as well.
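
The sign rules for hive.server2.idle.operation.timeout are easy to misread, so here is a minimal Python sketch of them. It is a model of the behaviour described above, not HiveServer2 code; the function name operation_timed_out and its parameters are made up for illustration.

```python
# Terminal states as listed in the setting's description.
TERMINAL_STATES = {"FINISHED", "CANCELED", "CLOSED", "ERROR"}


def operation_timed_out(timeout_ms: int, state: str, idle_ms: int) -> bool:
    """Return True if an operation idle for idle_ms would be closed.

    Hypothetical helper that models the documented rules only.
    """
    if timeout_ms == 0:
        # Zero disables the operation timeout.
        return False
    if timeout_ms > 0:
        # Positive value: only operations in a terminal state are checked.
        return state in TERMINAL_STATES and idle_ms >= timeout_ms
    # Negative value: every operation is checked, regardless of state.
    return idle_ms >= -timeout_ms


TWO_HOURS_MS = 2 * 60 * 60 * 1000    # 7200000
THREE_HOURS_MS = 3 * 60 * 60 * 1000  # idle time used in the examples below

print(operation_timed_out(TWO_HOURS_MS, "RUNNING", THREE_HOURS_MS))
print(operation_timed_out(TWO_HOURS_MS, "FINISHED", THREE_HOURS_MS))
print(operation_timed_out(-TWO_HOURS_MS, "RUNNING", THREE_HOURS_MS))
```

With the positive value, only the FINISHED operation is reported as timed out; the RUNNING one is reported as timed out only when the value is negative.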

Combining the three settings from the examples above, we can summarize the following use cases:

1) If you start an HS2 session, from beeline for example, and do nothing afterwards, HS2 runs 24 session checks before it determines that 24 hours have passed since the last activity, and then closes the session.

2) If you work in beeline for 2 hours and then leave it open without doing anything, HS2 runs a total of 26 session checks (2 while you worked and another 24 while idle), and the session is closed 26 hours after it was opened.

3) If you work in beeline for 2 hours and then start a query that runs for 1 hour before returning its result, the idle timer starts when the data is returned. If you do nothing afterwards, HS2 closes the session after another 24 hours, so the session lasts 27 hours in total (2+1+24).

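4) If your query runs longer than the 2 hours defined by hive.server2.idle.operation.timeout, what happens depends on the sign of the value. With the positive value from the example above, only operations in a terminal state are checked, so the running query is not affected. With a negative value (“-7200000”), running operations are checked too and the query can be cancelled, in which case you lose the result altogether.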

5) If hive.server2.session.check.interval = 0 and hive.server2.idle.session.timeout > 0, the effect is the same as hive.server2.idle.session.timeout = 0: with the check disabled, nothing ever evaluates the idle timer.

6) If hive.server2.session.check.interval > hive.server2.idle.session.timeout > 0, the effective timeout is determined by hive.server2.session.check.interval, because a session can only be closed when a check runs.

For example, if hive.server2.session.check.interval = 20 minutes but hive.server2.idle.session.timeout = 10 minutes, the timeout takes effect only when a check runs, which is every 20 minutes.
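
The arithmetic in these use cases can be reproduced with a small Python simulation. This is a simplified sketch, not HiveServer2 code: it assumes the first check runs one interval after the session is opened and that checks repeat at exact multiples of the interval. The function name session_close_time is hypothetical.

```python
from typing import Optional, Tuple


def session_close_time(last_activity_min: int,
                       check_interval_min: int,
                       idle_timeout_min: int) -> Optional[Tuple[int, int]]:
    """Return (close_time_min, checks_run), or None if the session never times out.

    All times are minutes since the session was opened. Assumes checks fire
    at 1x, 2x, 3x ... the check interval (a simplification).
    """
    if check_interval_min <= 0 or idle_timeout_min <= 0:
        # Either setting disabled: no idle timeout is ever applied.
        return None
    now = 0
    checks = 0
    while True:
        now += check_interval_min
        checks += 1
        # The session is closed at the first check that finds it idle long enough.
        if now - last_activity_min >= idle_timeout_min:
            return now, checks


HOUR = 60
DAY = 24 * HOUR

print(session_close_time(0, HOUR, DAY))         # use case 1
print(session_close_time(2 * HOUR, HOUR, DAY))  # use case 2
print(session_close_time(3 * HOUR, HOUR, DAY))  # use case 3
print(session_close_time(0, 0, DAY))            # use case 5
print(session_close_time(0, 20, 10))            # use case 6
```

Under these assumptions the first three calls close the session after 24, 26 and 27 hours, the fourth never closes it, and the last closes it at the 20-minute check even though the idle timeout is 10 minutes.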

The basic rule of thumb is as follows:
hive.server2.session.check.interval < hive.server2.idle.operation.timeout < hive.server2.idle.session.timeout

And the recommended values are:

hive.server2.session.check.interval = 1 hour
hive.server2.idle.operation.timeout = 1 day
hive.server2.idle.session.timeout = 3 days

These values work for most clusters, but you can change them depending on how your cluster behaves and how long your queries run.
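
As a convenience, the recommended values can be converted to milliseconds and checked against the rule of thumb with a short Python sketch. It assumes the settings are placed in hive-site.xml using the usual property/name/value layout; the helper names follows_rule_of_thumb and to_hive_site_properties are made up for illustration.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS

# Recommended values from this article, converted to milliseconds.
recommended = {
    "hive.server2.session.check.interval": 1 * HOUR_MS,
    "hive.server2.idle.operation.timeout": 1 * DAY_MS,
    "hive.server2.idle.session.timeout": 3 * DAY_MS,
}


def follows_rule_of_thumb(settings: dict) -> bool:
    """check interval < operation timeout < session timeout."""
    return (settings["hive.server2.session.check.interval"]
            < settings["hive.server2.idle.operation.timeout"]
            < settings["hive.server2.idle.session.timeout"])


def to_hive_site_properties(settings: dict) -> str:
    """Render the settings as <property> entries for hive-site.xml."""
    lines = []
    for name, value in settings.items():
        lines.append("<property>")
        lines.append(f"  <name>{name}</name>")
        lines.append(f"  <value>{value}</value>")
        lines.append("</property>")
    return "\n".join(lines)


if not follows_rule_of_thumb(recommended):
    raise ValueError("settings do not follow the rule of thumb")
print(to_hive_site_properties(recommended))
```

The printed entries would go inside the configuration element of hive-site.xml, and HiveServer2 typically needs a restart to pick them up.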

