hadoop jython ( windows )-阿里云开发者社区

hadoop jython ( windows )

2017-12-07 1033

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

简介：

参考： hadoop window 搭建后,由于对 py 的语法喜欢，一直想把hadoop,改成jython 的
这次在自己电脑上终于完成,下面介绍过程:
测试环境：
依然的 windows + cygwin
hadoop 0.18 # C:/cygwin/home/lky/tools/java/hadoop-0.18.3
jython 2.2.1 # C:/jython2.2.1
参考: PythonWordCount
启动 hadoop 并到 hdoop_home 下

# 在云环境中创建 input 目录
$>bin/hadoop dfs -mkdir input
# 在包 hadoop 的 NOTICE.txt 拷贝到 input 目录下
$>bin/hadoop dfs -copyFromLocal c:/cygwin/home/lky/tools/java/hadoop-0.18.3/NOTICE.txt hdfs:///user/lky/input

$>cd src/examples/python

# 创建个脚本 ( jy->jar->hd run ) 一步完成!
# 当然在 linux 写个脚本比这好看呵呵！
$>vim run.bat

" C:\Program Files\Java\jdk1.6.0_11\bin\java.exe " -classpath " C:\jython2.2.1\jython.jar;%CLASSPATH% " org.python.util.jython C:\jython2 .2.1 \Tools\jythonc\jythonc.py -p org.apache.hadoop.examples -d -j wc.jar -c % 1

sh C:\cygwin\home\lky\tools\java\hadoop- 0.18.3 \bin\hadoop jar wc.jar % 2 % 3 % 4 % 5 % 6 % 7 % 8 % 9

# 修改 jythonc 打包环境。 +hadoop jar
$>vim C:\jython2.2.1\Tools\jythonc\jythonc.py

# Copyright (c) Corporation for National Research Initiatives
# Driver script for jythonc2. See module main.py for details
import sys,os,glob

for fn in glob.glob('c:/cygwin/home/lky/tools/java/hadoop-0.18.3/*.jar') :sys.path.append(fn)
for fn in glob.glob('c:/jython2.2.1/*.jar') :sys.path.append(fn)
for fn in glob.glob('c:/cygwin/home/lky/tools/java/hadoop-0.18.3/lib/*.jar' ) :sys.path.append(fn)

import main
main.main()

import os
os._exit(0)

# 运行
C:/cygwin/home/lky/tools/java/hadoop-0.18.3/src/examples/python>
run.bat WordCount.py hdfs:///user/lky/input file:///c:/cygwin/home/lky/tools/java/hadoop-0.18.3/tmp2

结果输出：
cat c:/cygwin/home/lky/tools/java/hadoop-0.18.3/tmp2/part-00000
(http://www.apache.org/).       1
Apache 1
Foundation      1
Software        1
The     1
This    1
by      1
developed       1
includes        1
product 1
software        1
下面重头来了：（简洁的 jy hdoop 代码）

#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

from org.apache.hadoop.fs import Path
from org.apache.hadoop.io import *
from org.apache.hadoop.mapred import *

import sys
import getopt

class WordCountMap(Mapper, MapReduceBase):
    one = IntWritable( 1 )
     def map(self, key, value, output, reporter):
         for w in value.toString().split():
            output.collect(Text(w), self.one)

class Summer(Reducer, MapReduceBase):
     def reduce(self, key, values, output, reporter):
        sum = 0
         while values.hasNext():
            sum += values.next().get()
        output.collect(key, IntWritable(sum))

def printUsage(code):
     print " wordcount [-m <maps>] [-r <reduces>] <input> <output> "
    sys.exit(code)

def main(args):
    conf = JobConf(WordCountMap);
    conf.setJobName( " wordcount " );

    conf.setOutputKeyClass(Text);
    conf.setOutputValueClass(IntWritable);

    conf.setMapperClass(WordCountMap);
    conf.setCombinerClass(Summer);
    conf.setReducerClass(Summer);
     try :
        flags, other_args = getopt.getopt(args[ 1 :], " m:r: " )
     except getopt.GetoptError:
        printUsage( 1 )
     if len(other_args) != 2 :
        printUsage( 1 )

     for f,v in flags:
         if f == " -m " :
            conf.setNumMapTasks(int(v))
         elif f == " -r " :
            conf.setNumReduceTasks(int(v))
    conf.setInputPath(Path(other_args[0]))
    conf.setOutputPath(Path(other_args[ 1 ]))
    JobClient.runJob(conf);

if __name__ == " __main__ " :
    main(sys.argv)

本文转自博客园刘凯毅的博客，原文链接：hadoop jython ( windows )，如需转载请自行联系原博主。

hadoop jython ( windows )

热门文章

最新文章

相关课程

相关电子书

相关实验场景

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

hadoop jython ( windows )

热门文章

最新文章

相关课程

相关电子书

相关实验场景