Hadoop 2.7 in Practice v1.0: YARN HA

Current environment: hadoop + zookeeper (NameNode and ResourceManager HA)

    resourcemanager       serviceId   init status
    sht-sgmhadoopnn-01    rm1         active
    sht-sgmhadoopnn-02    rm2         standby

References:
http://blog.csdn.net/u011414200/article/details/50336735

http://blog.csdn.net/u011414200/article/details/50276257

I. Check whether each ResourceManager is active or standby

1. Open the web pages

http://172.16.101.55:8088/cluster/cluster

http://172.16.101.56:8088/cluster/cluster
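
The same information can be read without a browser. The sketch below assumes that the ResourceManager REST endpoint /ws/v1/cluster/info is reachable on the same host and port as the web pages above, that its JSON response contains a haState field (as the Hadoop 2.7 REST documentation describes), and that curl is installed. The helper names parse_ha_state and rm_ha_state are hypothetical.

```shell
#!/bin/sh
# Extract the haState value from the JSON returned by /ws/v1/cluster/info.
# parse_ha_state is a hypothetical helper name; it reads the JSON on stdin.
parse_ha_state() {
    sed -n 's/.*"haState"[[:space:]]*:[[:space:]]*"\([A-Z]*\)".*/\1/p'
}

# Query one ResourceManager web address (host:port) and print its HA state.
# Assumption: the REST API is served on the same address as the web UI.
rm_ha_state() {
    curl -s "http://$1/ws/v1/cluster/info" | parse_ha_state
}

# Example:
#   rm_ha_state 172.16.101.55:8088
#   rm_ha_state 172.16.101.56:8088
```

Splitting the parsing from the HTTP call keeps the parsing step usable on a saved response as well.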

 

2. Check the ResourceManager logs

    [root@sht-sgmhadoopnn-01 logs]# more yarn-root-resourcemanager-sht-sgmhadoopnn-01.telenav.cn.log
    …………………..
    2016-03-03 18:10:01,289 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to active state

    [root@sht-sgmhadoopnn-02 logs]# more yarn-root-resourcemanager-sht-sgmhadoopnn-02.telenav.cn.log
    …………………..
    2016-03-03 18:10:34,250 INFO org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsPublisher: YARN system metrics publishing service is not enabled
    2016-03-03 18:10:34,250 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
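
Rather than paging through the whole log with more, the relevant lines can be filtered out. This is a minimal sketch that assumes the transition messages look like the ones quoted above; last_ha_transition is a hypothetical helper name.

```shell
#!/bin/sh
# Print the most recent HA state transition recorded in a ResourceManager log.
# last_ha_transition is a hypothetical helper name; $1 is the log file path.
# The pattern is based only on the log lines quoted in this article.
last_ha_transition() {
    grep -E 'ResourceManager: Transition(ed|ing) to (active|standby) state' "$1" | tail -n 1
}

# Example:
#   last_ha_transition yarn-root-resourcemanager-sht-sgmhadoopnn-01.telenav.cn.log
```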

3. Run yarn rmadmin -getServiceState

The serviceId argument is one of the IDs listed in yarn.resourcemanager.ha.rm-ids.

    ### $HADOOP_HOME/etc/hadoop/yarn-site.xml
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    [root@sht-sgmhadoopnn-02 logs]# yarn rmadmin -getServiceState rm1
    active
    [root@sht-sgmhadoopnn-02 logs]# yarn rmadmin -getServiceState rm2
    standby
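
To query every ResourceManager in one go, the command above can be wrapped in a small loop. This is only a sketch: rm_states is a hypothetical helper name, and it assumes that rmadmin exits with a non-zero status when the target ResourceManager cannot be reached.

```shell
#!/bin/sh
# Print "<serviceId>=<state>" for each ResourceManager ID given as an argument.
# rm_states is a hypothetical helper name; the IDs must match the values
# configured in yarn.resourcemanager.ha.rm-ids.
rm_states() {
    for id in "$@"; do
        # Assumption: rmadmin fails (non-zero exit) if the RM is unreachable.
        state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null) || state=unreachable
        echo "$id=$state"
    done
}

# Example:
#   rm_states rm1 rm2
```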
II. Commands

    [root@sht-sgmhadoopnn-01 bin]# yarn --help
    Usage: yarn [--config confdir] [COMMAND | CLASSNAME]
      CLASSNAME                             run the class named CLASSNAME
     or
      where COMMAND is one of:
      resourcemanager -format-state-store   deletes the RMStateStore
      resourcemanager                       run the ResourceManager
      nodemanager                           run a nodemanager on each slave
      timelineserver                        run the timeline server
      rmadmin                               admin tools
      sharedcachemanager                    run the SharedCacheManager daemon
      scmadmin                              SharedCacheManager admin tools
      version                               print the version
      jar <jar>                             run a jar file
      application                           prints application(s)
                                            report/kill application
      applicationattempt                    prints applicationattempt(s)
                                            report
      container                             prints container(s) report
      node                                  prints node report(s)
      queue                                 prints queue information
      logs                                  dump container logs
      classpath                             prints the class path needed to
                                            get the Hadoop jar and the
                                            required libraries
      cluster                               prints cluster information
      daemonlog                             get/set the log level for each
                                            daemon

    ###########################################################################
    [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin --help
    -help: Unknown command
    Usage: yarn rmadmin
       -refreshQueues
       -refreshNodes
       -refreshSuperUserGroupsConfiguration
       -refreshUserToGroupsMappings
       -refreshAdminAcls
       -refreshServiceAcl
       -getGroups [username]
       -addToClusterNodeLabels [label1,label2,label3] (label splitted by ",")
       -removeFromClusterNodeLabels [label1,label2,label3] (label splitted by ",")
       -replaceLabelsOnNode [node1[:port]=label1,label2 node2[:port]=label1,label2]
       -directlyAccessNodeLabelStore
       -transitionToActive [--forceactive] <serviceId>
       -transitionToStandby <serviceId>
       -failover [--forcefence] [--forceactive] <serviceId> <serviceId>
       -getServiceState <serviceId>
       -checkHealth <serviceId>
       -help [cmd]

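transitionToActive and transitionToStandby switch a ResourceManager into the active or the standby state.

failover initiates a failover: it switches from a failed ResourceManager to the other one (not supported when automatic failover is enabled).

getServiceState returns the current state of a ResourceManager.

checkHealth checks the health of a ResourceManager. It returns 0 if the ResourceManager is healthy and a non-zero value otherwise.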
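
Because checkHealth reports its result through the return value, it is easy to use from a script. The following sketch relies only on that exit status; rm_health is a hypothetical helper name.

```shell
#!/bin/sh
# Report ResourceManager health based on the exit status of -checkHealth.
# rm_health is a hypothetical helper name; $1 is the serviceId (e.g. rm1).
rm_health() {
    if yarn rmadmin -checkHealth "$1" >/dev/null 2>&1; then
        echo "$1 healthy"
    else
        # A non-zero status covers both an unhealthy and an unreachable RM.
        echo "$1 unhealthy or unreachable"
        return 1
    fi
}

# Example:
#   rm_health rm1 && rm_health rm2
```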

III. Experiments

1. Test manual YARN failover (fails)

    [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -failover --forceactive rm1 rm2
    forcefence and forceactive flags not supported with auto-failover enabled.

# yarn-site.xml sets yarn.resourcemanager.ha.automatic-failover.enabled to true, so the manual failover is refused.
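
Before attempting a manual failover it can help to confirm what the configuration file actually says. The sketch below reads the property from yarn-site.xml with awk; it assumes the usual layout in which the <value> element follows the <name> element, and auto_failover_setting is a hypothetical helper name. If it prints nothing, the property is not set in that file.

```shell
#!/bin/sh
# Print the value of yarn.resourcemanager.ha.automatic-failover.enabled
# from a yarn-site.xml file. auto_failover_setting is a hypothetical helper;
# $1 is the path to the file. This is plain text matching, not an XML parser.
auto_failover_setting() {
    awk '
        /<name>yarn\.resourcemanager\.ha\.automatic-failover\.enabled<\/name>/ { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # Strip the 7-character <value> and the 8-character </value>.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Example:
#   auto_failover_setting "$HADOOP_HOME/etc/hadoop/yarn-site.xml"
```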

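2. Test automatic YARN failover (succeeds)

 a. On the machine running the active ResourceManager, find the ResourceManager process ID with jps and kill the process with kill -9. Then check whether the other ResourceManager changes from standby to active.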

    [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -getServiceState rm1
    active
    [root@sht-sgmhadoopnn-01 ~]# yarn rmadmin -getServiceState rm2
    standby

    [root@sht-sgmhadoopnn-01 bin]# jps
    2583 Jps
    10162 DFSZKFailoverController
    28432 ResourceManager
    21679 NameNode

    [root@sht-sgmhadoopnn-01 ~]# kill -9 28432

    [root@sht-sgmhadoopnn-02 bin]# jps
    19147 ResourceManager
    17837 NameNode
    17970 DFSZKFailoverController
    27330 Jps

    [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin -getServiceState rm1
    16/03/03 19:23:39 INFO ipc.Client: Retrying connect to server: sht-sgmhadoopnn-01/172.16.101.55:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=1, sleepTime=1000 MILLISECONDS)
    Operation failed: Call From sht-sgmhadoopnn-01.telenav.cn/172.16.101.55 to sht-sgmhadoopnn-01:8033 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

    [root@sht-sgmhadoopnn-01 bin]# yarn rmadmin -getServiceState rm2
    active
    [root@sht-sgmhadoopnn-01 bin]#

#### The ResourceManager process on sht-sgmhadoopnn-01 has since been started again and is now in standby state.
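
When repeating this experiment, the observation step can be scripted instead of re-running getServiceState by hand. This is a minimal polling sketch; wait_for_active is a hypothetical helper name, and the default timeout and interval are arbitrary values chosen for illustration.

```shell
#!/bin/sh
# Poll a ResourceManager until it reports "active" or the timeout expires.
# wait_for_active is a hypothetical helper: $1 = serviceId,
# $2 = timeout in seconds (default 60), $3 = poll interval (default 2).
wait_for_active() {
    id=$1
    remaining=${2:-60}
    interval=${3:-2}
    while [ "$remaining" -gt 0 ]; do
        state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$id is active"
            return 0
        fi
        sleep "$interval"
        remaining=$((remaining - interval))
    done
    echo "$id did not become active in time" >&2
    return 1
}

# Example (run after killing the active ResourceManager):
#   wait_for_active rm2 60 2
```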

  c. Switch back

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToStandby rm2
    Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@11f69937
    Refusing to manually manage HA state, since it may cause
    a split-brain scenario or other incorrect state.
    If you are very sure you know what you are doing, please
    specify the --forcemanual flag.

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToStandby --forcemanual rm2
    You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably.

    It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state.

    You may abort safely by answering 'n' or hitting ^C now.

    Are you sure you want to continue? (Y or N) Y
    16/03/03 19:29:36 WARN ha.HAAdmin: Proceeding with manual HA state management even though
    automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@4e33967b

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm1
    standby
    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm2
    standby

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToActive rm1
    Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@54c4f317
    Refusing to manually manage HA state, since it may cause
    a split-brain scenario or other incorrect state.
    If you are very sure you know what you are doing, please
    specify the --forcemanual flag.

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -transitionToActive --forcemanual rm1
    You have specified the --forcemanual flag. This flag is dangerous, as it can induce a split-brain scenario that WILL CORRUPT your HDFS namespace, possibly irrecoverably.

    It is recommended not to use this flag, but instead to shut down the cluster and disable automatic failover if you prefer to manually manage your HA state.

    You may abort safely by answering 'n' or hitting ^C now.

    Are you sure you want to continue? (Y or N) Y
    16/03/03 19:32:46 WARN ha.HAAdmin: Proceeding with manual HA state management even though
    automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@54c4f317

    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm1
    16/03/03 19:32:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    active
    [root@sht-sgmhadoopnn-01 sbin]# yarn rmadmin -getServiceState rm2
    16/03/03 19:33:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    standby
    [root@sht-sgmhadoopnn-01 sbin]#

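# This differs from the HDFS HA switchover experiment, where -transitionToStandby automatically swapped the roles (standby -> active, active -> standby).

With YARN HA the roles were not swapped automatically: after -transitionToStandby both ResourceManagers were in standby, and -transitionToActive had to be run manually as well.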
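
Since a forced manual transition can leave both ResourceManagers in standby, as happened above, a quick sanity check after any manual switch is worthwhile. The sketch below counts how many of the given IDs report active; exactly_one_active is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if exactly one of the given ResourceManager IDs reports "active".
# exactly_one_active is a hypothetical helper name; pass the serviceIds.
exactly_one_active() {
    active=0
    for id in "$@"; do
        state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null)
        [ "$state" = "active" ] && active=$((active + 1))
    done
    echo "active ResourceManagers: $active"
    # The function's exit status is the result of this comparison.
    [ "$active" -eq 1 ]
}

# Example:
#   exactly_one_active rm1 rm2 || echo "HA state needs attention"
```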
