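How to debug MapReduce programs in Eclipse or MyEclipse is a common stumbling block for people new to MR.
From Hadoop 1.2.1 onward, the downloaded source no longer contains the hadoop-eclipse-plugin jar or its source code.
Hadoop itself is now built and managed with Maven, so a Maven project is the practical way to develop and debug. See this reference: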
http://blog.cloudera.com/blog/2012/08/developing-cdh-applications-with-maven-and-eclipse/
A sample POM for setting up a basic Maven project for CDH application development
https://gist.github.com/jnatkins/3517129
Note: CDH is Cloudera's packaged distribution of Hadoop. In my experience it is stable and is updated frequently.
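To write and debug MR programs in Eclipse, you need the following:
1: Install Maven. Version 3.0.4 or later is recommended.
2: Use a reasonably recent Eclipse release, such as Kepler Service Release 1.
3: Install the m2eclipse plugin in Eclipse to manage Maven projects.
4: Create a Maven project and replace its pom.xml with the one from https://gist.github.com/jnatkins/3517129, or with the pom.xml below (the one I use; it is enough to develop MR programs, work with HDFS, and so on).
5: If you run into permission errors (for example, when debugging on Windows), renaming the Administrator account to hdfs resolves them, because the Hadoop client presents the local account name to the cluster.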
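As an alternative to renaming the Windows account in step 5, the client-side user name can sometimes be overridden from code. The sketch below is not part of the original setup: it assumes the cluster uses simple (non-Kerberos) authentication and that the Hadoop client in use honors the `HADOOP_USER_NAME` system property, which is worth verifying against your CDH version. The class name `HadoopUserSetup` is hypothetical.

```java
// Sketch only: assumes simple authentication and a Hadoop client that reads
// the HADOOP_USER_NAME system property. HadoopUserSetup is a hypothetical name.
final class HadoopUserSetup {

    // Name of the property the Hadoop client is assumed to consult.
    static final String USER_PROPERTY = "HADOOP_USER_NAME";

    // Sets the user name for this JVM and returns the value now in effect.
    // Call this before the first Hadoop API call (for example FileSystem.get),
    // since the client is expected to resolve its login user only once.
    static String useHadoopUser(String user) {
        System.setProperty(USER_PROPERTY, user);
        return System.getProperty(USER_PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("Hadoop client user: " + useHadoopUser("hdfs"));
    }
}
```

In a driver class, `HadoopUserSetup.useHadoopUser("hdfs")` would go on the first line of `main`. If the property has no effect in your environment, renaming the account as described in step 5 remains the fallback.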
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.cdh</groupId>
<artifactId>cdh-test</artifactId>
<version>1.0.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>cdh-test</name>
<url>http://maven.apache.org</url>
<properties>
<hadoop.version>2.0.0-mr1-cdh4.4.0</hadoop.version>
<hbase.version>0.94.6-cdh4.4.0</hbase.version>
<project.build.sourceEncoding>utf-8</project.build.sourceEncoding>
<maven.compiler.encoding>utf-8</maven.compiler.encoding>
</properties>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<encoding>utf-8</encoding>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-eclipse-plugin</artifactId>
<version>2.9</version>
<configuration>
<buildOutputDirectory>eclipse-classes</buildOutputDirectory>
<downloadSources>true</downloadSources>
<downloadJavadocs>false</downloadJavadocs>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.8.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>2.0.0-cdh4.4.0</version>
<exclusions>
<exclusion>
<artifactId>jersey-test-framework-grizzly2</artifactId>
<groupId>com.sun.jersey.jersey-test-framework</groupId>
</exclusion>
<exclusion>
<artifactId>netty</artifactId>
<groupId>org.jboss.netty</groupId>
</exclusion>
</exclusions>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase</artifactId>
<version>${hbase.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.hadoop.gplcompression</groupId>
<artifactId>hadoop-lzo-cdh4</artifactId>
<version>0.4.15-gplextras</version>
</dependency>
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.1</version>
</dependency>
<dependency>
<groupId>org.hsqldb</groupId>
<artifactId>hsqldb</artifactId>
<version>2.2.9</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
</project>
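Once the project has been imported with this pom.xml, a quick way to confirm that the Hadoop dependencies actually resolved is to check whether a few Hadoop classes are visible on the classpath. The following is a minimal sketch rather than part of the original setup: the class name `ClasspathCheck` is hypothetical, and the three class names it probes are standard Hadoop classes that the dependencies above are expected to provide. It is written to compile at the Java 1.6 source level configured in the POM.

```java
// Sketch only: ClasspathCheck is a hypothetical helper for sanity-checking
// that Maven put the Hadoop jars on the Eclipse project classpath.
final class ClasspathCheck {

    // Returns true if the named class can be located; the class is not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but one of its own dependencies is missing or broken.
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes expected to come from the Hadoop client dependencies.
        String[] expected = {
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.fs.FileSystem",
            "org.apache.hadoop.mapreduce.Job"
        };
        for (String name : expected) {
            String status = isOnClasspath(name) ? "found" : "MISSING";
            System.out.println(name + " -> " + status);
        }
    }
}
```

Run it from Eclipse as a Java application. If any class is reported as missing, the Cloudera repository was probably unreachable or the dependency versions did not resolve, and running a Maven update on the project is the first thing to try.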