Problems encountered when upgrading to Impala 1.2.3

Introduction:

Notes on some problems encountered while upgrading to Impala 1.2.3.

1. Default JVM parameters of catalogd

Some time after catalogd starts, it fails with "OutOfMemoryError: GC overhead limit exceeded".
Java heap info from /var/run/impala/hs_err_pidxxxx.log:
Heap
PSYoungGen      total 904768K, used 402833K [0x00000007ad2b0000, 0x0000000800000000, 0x0000000800000000)
 eden space 452416K, 89% used [0x00000007ad2b0000,0x00000007c5c14448,0x00000007c8c80000)
 from space 452352K, 0% used [0x00000007c8c80000,0x00000007c8c80000,0x00000007e4640000)
 to   space 452352K, 0% used [0x00000007e4640000,0x00000007e4640000,0x0000000800000000)
PSOldGen        total 2714304K, used 2714303K [0x0000000707800000, 0x00000007ad2b0000, 0x00000007ad2b0000)
 object space 2714304K, 99% used [0x0000000707800000,0x00000007ad2affe8,0x00000007ad2b0000)
PSPermGen       total 38848K, used 38569K [0x0000000702600000, 0x0000000704bf0000, 0x0000000707800000)
 object space 38848K, 99% used [0x0000000702600000,0x0000000704baa608,0x0000000704bf0000)

the jstat info:
 jstat -gcutil 8589 1000 1000
 S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT  
 0.00   0.00 100.00 100.00  99.09    115    9.170    92  540.636  549.806
 0.00   0.00 100.00 100.00  99.09    115    9.170    92  540.636  549.806
 0.00   0.00  96.98 100.00  99.02    115    9.170    92  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
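
In the jstat samples above, the full GC count (FGC) rises from 92 to 93 while old-generation occupancy (O) stays at 100%, which is the pattern behind "GC overhead limit exceeded". Below is a minimal sketch for spotting that pattern automatically. The function name futile_full_gcs and the 99% default threshold are illustrative assumptions, and the column positions assume the -gcutil layout shown above.

```shell
#!/bin/sh
# Count full GCs that ended with the old generation still at or above a limit
# (default 99%), reading `jstat -gcutil` output from stdin.
# Assumed column order (as in the output above): S0 S1 E O P YGC YGCT FGC FGCT GCT
futile_full_gcs() {
  awk -v limit="${1:-99}" '
    $1 ~ /^[0-9.]+$/ {                 # skip the header line
      old = $4 + 0; fgc = $8 + 0
      if (seen && fgc > prev && old >= limit + 0) futile += fgc - prev
      prev = fgc; seen = 1
    }
    END { print futile + 0 }'
}

# Four of the samples recorded above: FGC goes from 92 to 93 with O still at 100.
futile_full_gcs <<'EOF'
 S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 0.00   0.00 100.00 100.00  99.09    115    9.170    92  540.636  549.806
 0.00   0.00 100.00 100.00  99.09    115    9.170    92  540.636  549.806
 0.00   0.00  96.98 100.00  99.02    115    9.170    92  546.807  555.977
 0.00   0.00 100.00 100.00  99.02    115    9.170    93  546.807  555.977
EOF
# prints 1
```

Against a live process the same function can be fed directly, for example: jstat -gcutil <pid> 1000 60 | futile_full_gcs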

You can pass JVM arguments (including changes to the heap size) to catalogd using the "JAVA_TOOL_OPTIONS" environment variable. If you are using CM, you can set this environment variable using the "Catalog Server Environment Safety Valve".
We have also made a number of improvements to the catalog memory footprint in the upcoming Impala v1.2.4 release (which should be out next week if all goes well). Hope this helps.

Setting the following variable resolved the problem:
export JAVA_TOOL_OPTIONS="-Xmx8000m -Xms8000m -Xmn1024m -XX:PermSize=256m -XX:MaxPermSize=256m -XX:SurvivorRatio=8 -XX:+UseCompressedOops -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70 -XX:SoftRefLRUPolicyMSPerMB=0 -Xnoclassgc -Xloggc:/apps/logs/jvm/catalog-$(date +%Y%m%d-%H%M%S).log -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=8060 -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=xxxxxxx"
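
After restarting catalogd it can be worth confirming that the process really inherited the variable, since an export in the wrong shell or init script is easy to miss. Below is a minimal sketch, assuming a Linux host where /proc/<pid>/environ is readable. The function name java_tool_options_of is hypothetical, and the pgrep call in the comment is only one possible way to find the pid.

```shell
#!/bin/sh
# Print the value of JAVA_TOOL_OPTIONS from a NUL-separated environ file,
# e.g. /proc/<pid>/environ of the running catalogd (Linux only).
# Prints nothing if the variable is not set for that process.
java_tool_options_of() {
  tr '\0' '\n' < "$1" | sed -n 's/^JAVA_TOOL_OPTIONS=//p'
}

# Hypothetical usage against a live catalogd:
#   java_tool_options_of "/proc/$(pgrep -o catalogd)/environ"
```

The JVM also prints a "Picked up JAVA_TOOL_OPTIONS: ..." line at startup, so the process's stderr log is another place to check.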


2. Catalog table metadata loading problem

ERROR: AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
impalad server log:
I0214 17:02:01.526229 36166 Frontend.java:443] analyze query use viplog
I0214 17:02:01.576381 36166 jni-util.cc:154] com.cloudera.impala.common.AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.
       at com.cloudera.impala.analysis.Analyzer.getCatalog(Analyzer.java:650)
       at com.cloudera.impala.analysis.Analyzer.getDb(Analyzer.java:1326)
       at com.cloudera.impala.analysis.UseStmt.analyze(UseStmt.java:44)
       at com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:318)
       at com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:444)
       at com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:114)

I0214 17:02:01.604887 36166 status.cc:44] AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.


bug id:


This is a known issue fixed in 1.2.4. After this patch, catalogd loads the metadata lazily instead of loading everything at startup.
Impala 1.2.4 will introduce lazy loading, so you will not see this problem. Your error indicates that the catalog has not loaded all the metadata yet. The best way to determine this is from the impalad's metrics URL. Specifically, you want to look for the value "catalog.ready" to be set to 1 on the impalad debug webpage. You can find the metrics in <impala_host>:25000/metrics.
(impala-server.ready:1,catalog.ready:1)
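
The readiness check described above can be scripted instead of watching the debug page by hand. Below is a minimal sketch. The function names catalog_ready and wait_for_catalog are hypothetical, the 5-second interval is arbitrary, and the pattern assumes the metrics page prints the name followed by its value on one line, as in the "catalog.ready:1" form quoted above; adjust it if your page is laid out differently.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the metrics text on stdin reports catalog.ready as 1.
# Assumes name and value appear on the same line with no digits in between.
catalog_ready() {
  grep -Eq 'catalog\.ready[^0-9]*1([^0-9]|$)'
}

# Block until the given impalad reports a ready catalog.
# $1 is the impalad host; 25000 is the debug web port mentioned above.
wait_for_catalog() {
  host="$1"
  until curl -s "http://${host}:25000/metrics" | catalog_ready; do
    sleep 5
  done
}
```

A caller would run wait_for_catalog <impala_host> before sending the first query after a restart.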


3. UDF problem

We currently don't support String as the input and return types. You'll instead have to use Text or BytesWritable.

I've filed IMPALA-791 to track fixing this.  https://issues.cloudera.org/browse/IMPALA-791

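This can be worked around by using the org.apache.hadoop.io.Text class in place of String.

Text class API:

4. Catalog memory problem

I am not sure whether this is a memory leak, but after a while the old generation fills up and causes an OOM.
With a 7 GB old generation:
[root@GD6G12S190-logserver impala]# jstat -gcutil 10454 1000 1000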
 S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT  
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568
 0.00 100.00 100.00 100.00  16.24    572   63.441    22  196.127  259.568


This article is reposted from 菜菜光's 51CTO blog. Original link: http://blog.51cto.com/caiguangguang/1360215 (to republish, please contact the original author).