Monitoring Spark Applications with the YARN API

2019-07-26


Part 1. Overview

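The monitor watches two things: the state of each application as reported by the YARN ResourceManager (RUNNING, KILLED, FAILED, FINISHED), and, through the ApplicationMaster, whether an application's job has been running longer than the batch interval. Based on these two signals it restarts Spark on YARN applications when they fail or time out.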
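The restart policy described above can be sketched as a small decision function. This is only an illustration: `should_restart`, `RESTART_STATES`, and the returned reason strings are hypothetical names, and which states count as a failure is an assumption (for a long-running streaming application you may also want to add FINISHED).

```python
from typing import Optional

# States that trigger a restart in this sketch (an assumption, not the
# monitor's actual configuration).
RESTART_STATES = {"KILLED", "FAILED"}


def should_restart(app_state: str,
                   job_duration_s: Optional[float],
                   batch_interval_s: float) -> Optional[str]:
    """Return a reason string if the app should be restarted, else None.

    Hypothetical helper: app_state comes from the ResourceManager,
    job_duration_s from the ApplicationMaster (None if unknown).
    """
    if app_state in RESTART_STATES:
        return "state:" + app_state
    if app_state == "RUNNING" and job_duration_s is not None:
        # A job that outlives the batch interval is treated as a timeout.
        if job_duration_s > batch_interval_s:
            return "timeout"
    return None
```

For example, `should_restart("RUNNING", 25.0, 10)` reports a timeout, while a RUNNING application whose job finishes within the batch interval is left alone.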

Part 2. The main YARN REST APIs

1. Query cluster-wide metrics

GET http://<rm http address:port>/ws/v1/cluster/metrics
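As a rough sketch, the metrics endpoint can be called from Python with the standard library alone. The function names and the `rm-host:8088` address are placeholders chosen for illustration; the response is assumed to be the JSON object the ResourceManager wraps in a top-level `clusterMetrics` key.

```python
import json
from urllib.request import urlopen


def metrics_url(rm_address: str) -> str:
    """Build the cluster metrics URL for a ResourceManager host:port."""
    return "http://%s/ws/v1/cluster/metrics" % rm_address


def parse_cluster_metrics(body: str) -> dict:
    """Unwrap the 'clusterMetrics' object from the JSON response body."""
    return json.loads(body)["clusterMetrics"]


def fetch_cluster_metrics(rm_address: str, timeout: float = 5.0) -> dict:
    """Fetch and parse the metrics (performs a real HTTP request)."""
    with urlopen(metrics_url(rm_address), timeout=timeout) as resp:
        return parse_cluster_metrics(resp.read().decode("utf-8"))
```

Calling `fetch_cluster_metrics("rm-host:8088")` would then return a dictionary with counters such as `appsRunning` and `activeNodes`.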

2. Query scheduler details

GET http://<rm http address:port>/ws/v1/cluster/scheduler

3. Monitor an application's state

curl 'http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state'
GET http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state
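The state endpoint returns a small JSON document of the form `{"state": "RUNNING"}`. A minimal sketch of how a monitor might interpret it is shown below; the helper names are hypothetical, and the set of terminal states is limited to the ones this article cares about.

```python
import json

# States after which the application no longer runs.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def state_url(rm_address: str, app_id: str) -> str:
    """Build the state URL for one application (hypothetical helper)."""
    return "http://%s/ws/v1/cluster/apps/%s/state" % (rm_address, app_id)


def parse_app_state(body: str) -> str:
    """Extract the state string from the JSON response body."""
    return json.loads(body)["state"]


def is_terminal(state: str) -> bool:
    """True if the application has stopped, whatever the reason."""
    return state in TERMINAL_STATES
```

The monitor can poll `state_url(...)` on each crontab run and hand the parsed state to its restart logic.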

4. View a specific application

GET http://<rm http address:port>/ws/v1/cluster/apps/<application_id>

5. View detailed information for a specific application

curl "http://<rm http address:port>/proxy/<application_id>/ws/v2/mapreduce/info"

6. Kill an application

yarn application -kill application_id
curl -v -X PUT -H "Content-Type: application/json" -d '{"state": "KILLED"}' 'http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state'
PUT http://<rm http address:port>/ws/v1/cluster/apps/<application_id>/state
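The same kill request can be prepared in Python. The sketch below only builds the request object; `build_kill_request` is a hypothetical helper, and authentication (which a secured cluster would require) is deliberately left out.

```python
import json
from urllib.request import Request


def build_kill_request(rm_address: str, app_id: str) -> Request:
    """Build a PUT request that asks the ResourceManager to kill app_id."""
    url = "http://%s/ws/v1/cluster/apps/%s/state" % (rm_address, app_id)
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return Request(url, data=body, method="PUT",
                   headers={"Content-Type": "application/json"})
```

Passing the result to `urllib.request.urlopen` sends the request; keeping construction separate from sending makes the helper easy to check without a live cluster.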

Part 3. YarnMonitor

Ⅰ. Setup

1. Install yarn-api-client

You must install yarn-api-client before using this YARN monitor:

python setup.py build
python setup.py install

2. Uninstall yarn-api-client

To uninstall the yarn-api-client module, run:

pip list
pip uninstall yarn-api-client

3. Update yarn-api-client

To update the Python module, uninstall it and then install the updated version:

update yarn-api-client
cp yarn-api-client/base.py base.py.bak

4. Install Python modules offline

To install other Python modules without network access, do the following:

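pip freeze > yarn.txt
mkdir yarnpackage
# assumed missing step: on a machine with network access, fetch the packages into yarnpackage/
pip download -d yarnpackage/ -r yarn.txt
pip install --no-index --find-links=yarnpackage/ -r yarn.txt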

Ⅱ. Command

Make the startup script executable:

chmod 774 start_prmsbd.sh

Ⅲ. Crontab

List the existing crontab entries, then edit them to add the tasks below:

crontab -l
crontab -e

1. Start the YARN monitor

*/1 * * * * ../yarnmonitor/yarn-monitor/command/start_yarn_monitor.sh >> ../yarnmonitor/yarn-monitor/logs/yarn-corntab.log 2>&1
* * * * * sleep 60; ../yarnmonitor/yarn-monitor/command/start_yarn_monitor.sh

2. Clear the YARN monitor log

0 4 * * * ../yarn-monitor/command/clear_log_opm.sh >> ../yarn-monitor/logs/yarn-corntab.log 2>&1

0 4 * * * ../yarnmonitor/yarn-monitor/command/clear_log_opm.sh >> ../yarnmonitor/yarn-monitor/logs/yarn-corntab.log 2>&1

Ⅳ. For testing only

1. Start the monitor from the command line

../python ./yarn-monitor/YarnMonitor.py

2. ApplicationMaster API

curl --compressed -H "Accept: application/json" -X GET "http://***:8088/ws/v1/cluster/apps"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1549963435527_0001/ws/v1/mapreduce/info"

curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v1/mapreduce/info"
curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v2/mapreduce/info"

curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1535085750394_0017/ws/v1/mapreduce/jobs/4536"

curl --compressed -H "Accept: application/json" -X GET "http://***:8088/proxy/application_1548125170651_0090/api/v1/applications"
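The last command above goes through the YARN proxy to Spark's own monitoring REST API. As a hedged sketch of the timeout check from the overview, the code below assumes the job list was fetched from that API's `/api/v1/applications/[app-id]/jobs` endpoint and that timestamps use Spark's `2019-07-26T08:00:00.000GMT` style; `overdue_jobs` and `parse_spark_time` are hypothetical helpers, not part of YarnMonitor.

```python
import json
from datetime import datetime, timezone

# Timestamp layout assumed for Spark's REST API responses.
SPARK_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fGMT"


def parse_spark_time(value: str) -> datetime:
    """Parse a Spark REST timestamp into an aware UTC datetime."""
    parsed = datetime.strptime(value, SPARK_TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def overdue_jobs(jobs_json: str, now: datetime, batch_interval_s: float) -> list:
    """Return the ids of RUNNING jobs older than the batch interval."""
    overdue = []
    for job in json.loads(jobs_json):
        if job.get("status") != "RUNNING" or "submissionTime" not in job:
            continue
        elapsed = (now - parse_spark_time(job["submissionTime"])).total_seconds()
        if elapsed > batch_interval_s:
            overdue.append(job["jobId"])
    return overdue
```

A non-empty result is the signal the monitor would treat as a timeout and answer with a restart.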

Part 4. Known issue

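When querying jobs through the ApplicationMaster, the returned data may fail to convert to JSON and raise an exception. In that case, change how yarn-api-client handles the response of the corresponding API. The following can serve as a reference: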

# Requires "from lxml import etree" at the top of the module.
if 'ws/v1/mapreduce/info' in path:
    if response.status == OK:
        # This code treats the response as HTML, not JSON, and reads the table rows.
        html_content = response.read()
        response.close()
        element_html = etree.HTML(html_content)
        tr_list = element_html.xpath('//tbody/tr')
        content_list = []
        for tr in tr_list:
            item = {}
            # Column 1 holds the job id, column 4 its duration.
            item['id'] = tr.xpath('./td[1]/text()')[0].replace('\n', '').strip()
            item['duration'] = tr.xpath('./td[4]/text()')[0]
            # Uncomment to log each row:
            # logging.info(item)
            content_list.append(item)
        # Return only after every row is collected (the original returned inside the loop).
        return self.response_class(content_list)
    else:
        msg = 'Response finished with status: %s' % response.status
        raise APIError(msg)
