Step 1: Copy the hive directory from the examples directory into the oozie-apps directory
cp -r hive ../../oozie-apps/
Step 2: Rename hive to hive-select (run this inside the oozie-apps directory)
mv hive hive-select
Step 3: Delete the README and workflow.xml.security files inside hive-select
rm -rf README workflow.xml.security
Step 4: Create the lib directory
mkdir lib
Step 5: Based on the official example template, write job.properties with the following content:
nameNode=hdfs://bigdata-pro-m01:9000
jobTracker=bigdata-pro-m01:8032
oozieAppRoot=user/caizhengjie/oozie-apps
oozieDataRoot=user/caizhengjie/oozie-datas
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/${oozieAppRoot}/hive-select/workflow.xml
outputDir=hive-select/output
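To see how the `${...}` references in job.properties combine, the values can be expanded by hand. The sketch below is only an illustration in plain shell (Oozie performs this substitution itself when the job is submitted); the variable names `app_path` and `output_path` are made up for this example and are not Oozie properties.

```shell
#!/bin/sh
# Values copied from job.properties.
nameNode=hdfs://bigdata-pro-m01:9000
oozieAppRoot=user/caizhengjie/oozie-apps
oozieDataRoot=user/caizhengjie/oozie-datas
outputDir=hive-select/output

# Illustrative names (not Oozie properties): the expanded application
# path and the expanded output path used by the workflow.
app_path="${nameNode}/${oozieAppRoot}/hive-select/workflow.xml"
output_path="${nameNode}/${oozieDataRoot}/${outputDir}"

echo "$app_path"
echo "$output_path"
```

Because oozieAppRoot and oozieDataRoot are written without a leading slash, the `/` after `${nameNode}` supplies it. The expanded output path corresponds to /user/caizhengjie/oozie-datas/hive-select/output on HDFS.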
Step 6: Create the data output directory on HDFS
bin/hdfs dfs -mkdir -p /user/caizhengjie/oozie-datas/hive-select/output
Step 7: Based on the official example template, write workflow.xml with the following content:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive-wf">
    <start to="hive-node"/>
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.6">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/${oozieDataRoot}/${outputDir}"/>
            </prepare>
            <job-xml>${nameNode}/${oozieAppRoot}/hive-select/hive-site.xml</job-xml>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>script.q</script>
            <param>OUTPUT=${nameNode}/${oozieDataRoot}/${outputDir}</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
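Every `${name}` parameter used in workflow.xml has to be defined in job.properties, otherwise the job fails at submission. As a quick local check before uploading, a small helper along the following lines could list any parameter that is missing. `missing_params` is a hypothetical function written for this walkthrough, not part of Oozie, and it relies on `grep -o`, which GNU and BSD grep both provide.

```shell
#!/bin/sh
# missing_params WORKFLOW PROPERTIES
# Hypothetical helper: prints each ${name} used in WORKFLOW that has no
# "name=" line in PROPERTIES. EL function calls such as
# ${wf:errorMessage(...)} contain ':' and '(' and are therefore not matched.
missing_params() {
    grep -o '\${[A-Za-z_][A-Za-z0-9_.]*}' "$1" | tr -d '${}' | sort -u |
    while read -r name; do
        grep -q "^${name}=" "$2" || echo "$name"
    done
}

# Usage, from inside the hive-select directory:
# missing_params workflow.xml job.properties
```

With the two files from this walkthrough the helper should print nothing, since jobTracker, nameNode, oozieAppRoot, oozieDataRoot, outputDir and queueName are all defined.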
Step 8: Copy the MySQL connector JAR into the lib directory (run this from inside hive-select)
cp /opt/modules/hive/lib/mysql-connector-java-5.1.48-bin.jar lib/
Step 9: Copy hive-site.xml into the hive-select directory
cp /opt/modules/hive/conf/hive-site.xml .
Step 10: Write the script.q SQL file with the following content:
insert overwrite directory '${OUTPUT}' select * from db_hive.order
Step 11: Upload the whole hive-select directory to HDFS
bin/hdfs dfs -put /opt/modules/oozie/oozie-apps/hive-select/ /user/caizhengjie/oozie-apps
Step 12: Start the Hive metastore (the command stays in the foreground, so run it in a separate terminal)
bin/hive --service metastore
Step 13: Submit the job to test it:
bin/oozie job -oozie http://bigdata-pro-m01:11000/oozie -config oozie-apps/hive-select/job.properties -run
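On success, the `-run` command prints a line of the form `job: <id>`, and that ID is what the `-info` subcommand expects when you check on the job later. A minimal sketch of capturing it follows; `extract_job_id` is a hypothetical helper and the sample ID is a made-up placeholder, not output from this cluster.

```shell
#!/bin/sh
# extract_job_id: hypothetical helper that reads the output of
# "oozie job ... -run" on stdin and prints the workflow job ID.
extract_job_id() {
    sed -n 's/^job:[[:space:]]*//p'
}

# Placeholder output line; the ID below is made up for illustration.
sample_output='job: 0000000-000000000000000-oozie-exam-W'
job_id=$(printf '%s\n' "$sample_output" | extract_job_id)
echo "$job_id"

# Against the real server (commented out because it needs a running Oozie):
# bin/oozie job -oozie http://bigdata-pro-m01:11000/oozie -info "$job_id"
```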
Check the result:
View the data produced by the Hive query
bin/hdfs dfs -text /user/caizhengjie/oozie-datas/hive-select/output/000000_0
0001cenry2018-10-09product-1350guangzhou
0002beny2018-09-09product-2180beijing
0003ben2018-09-09product-3580beijing
0004cherry2018-10-09product-4450shenzheng
0005jone2018-10-09product-530nanjing
0006lili2018-09-09product-650hangzhou
0007chen2018-10-09product-790wuhan
0008wiwi2018-09-09product-8150chengdu
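The fields in each record look fused because `insert overwrite directory`, when no row format is specified, separates columns with Hive's default delimiter Ctrl-A (`\001`), which a terminal does not display. Translating that byte to a tab makes the columns visible. The sketch below demonstrates this on a single record; the field boundaries in it are an assumption inferred from the printed output, not taken from the table definition.

```shell
#!/bin/sh
# Ctrl-A, Hive's default field delimiter for files written this way.
sep=$(printf '\001')

# One record rebuilt with explicit separators; the field boundaries
# are an assumption inferred from the fused terminal output.
record="0001${sep}cenry${sep}2018-10-09${sep}product-1${sep}350${sep}guangzhou"

# Replace every Ctrl-A with a tab so the columns become readable.
visible=$(printf '%s\n' "$record" | tr '\001' '\t')
echo "$visible"

# The same filter applied to the real output file (needs a running HDFS):
# bin/hdfs dfs -text /user/caizhengjie/oozie-datas/hive-select/output/000000_0 | tr '\001' '\t'
```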