Workflow unit testing
1. Upload the workflow definitions and configuration
[hadoop@hdp-node-01 wf-oozie]$ hadoop fs -put hive2-etl /user/hadoop/oozie/myapps/
[hadoop@hdp-node-01 wf-oozie]$ hadoop fs -put hive2-dw /user/hadoop/oozie/myapps/
[hadoop@hdp-node-01 wf-oozie]$ ll
total 12
drwxrwxr-x. 2 hadoop hadoop 4096 Nov 23 16:32 hive2-dw
drwxrwxr-x. 2 hadoop hadoop 4096 Nov 23 16:32 hive2-etl
drwxrwxr-x. 3 hadoop hadoop 4096 Nov 23 11:24 weblog
[hadoop@hdp-node-01 wf-oozie]$ export OOZIE_URL=http://localhost:11000/oozie
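2. Submit and start each workflow unit
Start the weblog workflow, passing its input and output paths:
oozie job -D inpath=/weblog/input -D outpath=/weblog/outpre -config weblog/job.properties -run
Start the Hive workflow for ETL:
oozie job -config hive2-etl/job.properties -run
Start the Hive workflow for the pvs statistics:
oozie job -config hive2-dw/job.properties -run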
3. Workflow coordinator configuration (excerpt)
Multiple workflow jobs are organized and scheduled with a coordinator:
[hadoop@hdp-node-01 hive2-etl]$ ll
total 28
-rw-rw-r--. 1 hadoop hadoop  265 Nov 13 16:39 config-default.xml
-rw-rw-r--. 1 hadoop hadoop  512 Nov 26 16:43 coordinator.xml
-rw-rw-r--. 1 hadoop hadoop  382 Nov 26 16:49 job.properties
drwxrwxr-x. 2 hadoop hadoop 4096 Nov 27 11:26 lib
-rw-rw-r--. 1 hadoop hadoop 1910 Nov 23 17:49 script.q
-rw-rw-r--. 1 hadoop hadoop  687 Nov 23 16:32 workflow.xml
• config-default.xml
<configuration>
    <property>
        <name>jobTracker</name>
        <value>hdp-node-01:8032</value>
    </property>
    <property>
        <name>nameNode</name>
        <value>hdfs://hdp-node-01:9000</value>
    </property>
    <property>
        <name>queueName</name>
        <value>default</value>
    </property>
</configuration>
• job.properties
user.name=hadoop
oozie.use.system.libpath=true
oozie.libpath=hdfs://hdp-node-01:9000/user/hadoop/share/lib
oozie.wf.application.path=hdfs://hdp-node-01:9000/user/hadoop/oozie/myapps/hive2-etl/
• workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="hive2-wf">
    <start to="hive2-node"/>

    <action name="hive2-node">
        <hive2 xmlns="uri:oozie:hive2-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <jdbc-url>jdbc:hive2://hdp-node-01:10000</jdbc-url>
            <script>script.q</script>
            <param>input=/weblog/outpre2</param>
        </hive2>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive2 (Beeline) action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
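The workflow.xml refers to ${jobTracker}, ${nameNode} and ${queueName}, whose values come from config-default.xml unless the job properties or a -D option override them. The following is only a rough sketch of that precedence, not Oozie's actual resolver: the class ParamResolver and its method are hypothetical, and unlike Oozie (which rejects unresolved parameters) it simply leaves unknown variables and EL function calls untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Oozie.
class ParamResolver {
    // Matches simple ${name} references; EL calls such as ${wf:errorMessage(...)} do not match.
    private static final Pattern VAR = Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_.]*)\\}");

    static String resolve(String text, Map<String, String> defaults, Map<String, String> jobProps) {
        // Start from config-default.xml values, then let job properties override them.
        Map<String, String> merged = new HashMap<>(defaults);
        merged.putAll(jobProps);

        Matcher m = VAR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = merged.get(m.group(1));
            // Simplification: an unknown variable is left as-is instead of raising an error.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("nameNode", "hdfs://hdp-node-01:9000");
        defaults.put("queueName", "default");
        Map<String, String> jobProps = new HashMap<>();
        System.out.println(resolve("<name-node>${nameNode}</name-node>", defaults, jobProps));
        System.out.println(resolve("<value>${queueName}</value>", defaults, jobProps));
    }
}
```

With the values from config-default.xml above and no overrides, the two lines resolve to the name node URI and the default queue.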
• coordinator.xml
<coordinator-app name="cron-coord" frequency="${coord:minutes(5)}"
                 start="${start}" end="${end}" timezone="Asia/Shanghai"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>jobTracker</name>
                    <value>${jobTracker}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
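To get a feel for what frequency="${coord:minutes(5)}" means, the sketch below lists the nominal times at which a coordinator with a 5-minute frequency would trigger the workflow between ${start} and ${end}. It is an illustration under assumptions: the class name and the timestamps are made up, the timestamps use Oozie's UTC form (yyyy-MM-dd'T'HH:mm'Z'), and the end time is treated as exclusive. Note also that the job.properties shown earlier does not define start, end or workflowAppUri, so they would have to be supplied when the coordinator is submitted.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Oozie.
class CoordinatorSchedule {
    // Oozie-style UTC timestamp, e.g. 2016-11-27T00:00Z
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm'Z'");

    // Returns every nominal time t with start <= t < end, stepping by the frequency.
    static List<String> nominalTimes(String start, String end, int frequencyMinutes) {
        if (frequencyMinutes <= 0) {
            throw new IllegalArgumentException("frequency must be positive");
        }
        LocalDateTime t = LocalDateTime.parse(start, FMT);
        LocalDateTime stop = LocalDateTime.parse(end, FMT);
        List<String> times = new ArrayList<>();
        while (t.isBefore(stop)) {
            times.add(t.format(FMT));
            t = t.plusMinutes(frequencyMinutes);
        }
        return times;
    }

    public static void main(String[] args) {
        // Example values only; the real start and end come from the job configuration.
        for (String time : nominalTimes("2016-11-27T00:00Z", "2016-11-27T00:20Z", 5)) {
            System.out.println(time);
        }
    }
}
```

For a 20-minute window this yields four runs, at minutes 00, 05, 10 and 15.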
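Module development: data presentation
Enterprise data analysis systems can choose from many front-end presentation tools. They fall into two broad approaches:
• Separately deployed, dedicated systems: represented by foreign products such as BusinessObjects (BO, Crystal Report), Hyperion (Brio) and Cognos. Their servers are deployed on their own and exchange information with the application over some protocol.
• Web application presentation: a standalone or embedded Java web system reads the report statistics and presents them as web pages, for example Runqian Report (润乾报表), which is 100% pure Java.
This log analysis project presents its results with a web application developed in-house.
◆ Technology stack of the web presentation application:
jQuery + ECharts + Spring MVC + Spring + MyBatis + MySQL
◆ Presentation flow:
1. Use the Spring MVC + Spring + MyBatis back end to read the data to be presented from MySQL
2. Return the data to the page in JSON format
3. On the page, let ECharts parse the JSON and render the charts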
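Step 2 of the flow can be pictured with a small sketch that turns the values read from the database into the JSON an ECharts line chart consumes. This is not the project's code (the project serializes an EchartsData object, as shown later): the class PvChartJson is hypothetical, the JSON is assembled by hand to keep the example dependency-free, and it assumes the axis labels contain no characters that need JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class PvChartJson {
    // Builds a minimal ECharts option: category x-axis, value y-axis, one line series.
    static String toOptionJson(List<String> days, List<Integer> pvs) {
        if (days.size() != pvs.size()) {
            throw new IllegalArgumentException("days and pvs must have the same length");
        }
        // Assumption: labels need no escaping (plain digits such as "1118").
        String x = days.stream().map(d -> "\"" + d + "\"").collect(Collectors.joining(","));
        String y = pvs.stream().map(String::valueOf).collect(Collectors.joining(","));
        return "{\"xAxis\":{\"type\":\"category\",\"data\":[" + x + "]},"
             + "\"yAxis\":{\"type\":\"value\"},"
             + "\"series\":[{\"name\":\"pv\",\"type\":\"line\",\"data\":[" + y + "]}]}";
    }

    public static void main(String[] args) {
        // Sample values only; real data would come from the DAO.
        System.out.println(toOptionJson(Arrays.asList("1118", "1119"), Arrays.asList(10, 20)));
    }
}
```

On the page, the parsed object would then be handed to the chart instance's setOption call.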
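Web application project structure
The project is managed with Maven, which pulls in the Spring MVC + Spring + MyBatis framework dependencies together with the jQuery and ECharts JavaScript libraries.
Web application implementation
The code follows a typical MVC architecture:
Layer | Technology
Page | HTML + jQuery + ECharts
Controller | Spring MVC
Service | Service (Spring beans)
DAO | MyBatis
Database | MySQL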
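The layering in the table can be sketched in plain Java without any framework. The names PvDao, PvService and PvController are invented for this illustration, and the DAO is a stub; in the project these roles are filled by a MyBatis mapper, a Spring service and a Spring MVC controller respectively. Passing dependencies through constructors is a choice made here so that each layer can be exercised with a stub.

```java
import java.util.Arrays;
import java.util.List;

// DAO layer: in the project this role is played by a MyBatis mapper backed by MySQL.
interface PvDao {
    List<Integer> findDailyPvs();
}

// Service layer: holds the logic and depends only on the DAO interface.
class PvService {
    private final PvDao dao;

    PvService(PvDao dao) {
        this.dao = dao;
    }

    int totalPvs() {
        int sum = 0;
        for (int pv : dao.findDailyPvs()) {
            sum += pv;
        }
        return sum;
    }
}

// Controller layer: in the project this would be a Spring MVC handler returning JSON.
class PvController {
    private final PvService service;

    PvController(PvService service) {
        this.service = service;
    }

    String total() {
        return "{\"totalPvs\":" + service.totalPvs() + "}";
    }
}

class LayeringDemo {
    public static void main(String[] args) {
        // Stub DAO with sample numbers, standing in for a database query.
        PvDao stub = () -> Arrays.asList(120, 80, 100);
        System.out.println(new PvController(new PvService(stub)).total());
    }
}
```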
See the project source for the full code.
Code example: ChartServiceImpl
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.Date;
import java.util.HashMap;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.google.gson.Gson;

// Imports of the project's own classes (IChartService, IEchartsDao, EchartsData, ...) are omitted.

@Service("chartService")
public class ChartServiceImpl implements IChartService {

    @Autowired
    IEchartsDao iEchartsDao;

    // Assembles the complete ECharts option object for the pvs chart.
    public EchartsData getChartsData() {
        List<Integer> xAxiesList = iEchartsDao.getXAxiesList("");
        List<Integer> pointsDataList = iEchartsDao.getPointsDataList("");

        EchartsData data = new EchartsData();
        ToolBox toolBox = EchartsOptionUtil.getToolBox();
        Serie serie = EchartsOptionUtil.getSerie(pointsDataList);
        ArrayList<Serie> series = new ArrayList<Serie>();
        series.add(serie);

        List<XAxi> xAxis = EchartsOptionUtil.getXAxis(xAxiesList);
        List<YAxi> yAxis = EchartsOptionUtil.getYAxis();

        HashMap<String, String> title = new HashMap<String, String>();
        title.put("text", "pvs");
        title.put("subtext", "超级pvs");
        HashMap<String, String> tooltip = new HashMap<String, String>();
        tooltip.put("trigger", "axis");

        HashMap<String, String[]> legend = new HashMap<String, String[]>();
        legend.put("data", new String[]{"pv统计"});

        data.setTitle(title);
        data.setTooltip(tooltip);
        data.setLegend(legend);
        data.setToolbox(toolBox);
        data.setCalculable(true);
        data.setxAxis(xAxis);
        data.setyAxis(yAxis);
        data.setSeries(series);
        return data;
    }

    // Returns the overview figures for the given day (format MMdd) and for the day before it.
    public List<HashMap<String, Integer>> getGaiKuangList(String date) throws ParseException {
        HashMap<String, Integer> gaiKuangToday = iEchartsDao.getGaiKuang(date);

        // The pattern carries no year, so the date is parsed in 1970 and Feb 29 is never produced.
        SimpleDateFormat sf = new SimpleDateFormat("MMdd");
        Date parse = sf.parse(date);
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(parse);
        calendar.add(Calendar.DAY_OF_MONTH, -1);
        Date before = calendar.getTime();
        String beforeString = sf.format(before);
        System.out.println(beforeString);

        HashMap<String, Integer> gaiKuangBefore = iEchartsDao.getGaiKuang(beforeString);

        ArrayList<HashMap<String, Integer>> gaiKuangList = new ArrayList<HashMap<String, Integer>>();
        gaiKuangList.add(gaiKuangToday);
        gaiKuangList.add(gaiKuangBefore);
        return gaiKuangList;
    }

    // Quick manual check that prints the chart data as JSON.
    // Note: created with plain "new", this instance gets no iEchartsDao injected by Spring,
    // so getChartsData() only works here if the DAO is provided some other way.
    public static void main(String[] args) {
        ChartServiceImpl chartServiceImpl = new ChartServiceImpl();
        EchartsData chartsData = chartServiceImpl.getChartsData();
        Gson gson = new Gson();
        String json = gson.toJson(chartsData);
        System.out.println(json);
    }
}
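getGaiKuangList derives the previous day from an MMdd string. Because that pattern has no year, SimpleDateFormat parses the date in 1970, which is not a leap year, so the day before "0301" always comes out as "0228". One possible variant, sketched below with java.time, takes the year explicitly. The class PreviousDay and the extra year parameter are assumptions made for this sketch and are not part of the project.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, for illustration only.
class PreviousDay {
    private static final DateTimeFormatter MMDD = DateTimeFormatter.ofPattern("MMdd");

    // Returns the day before the given MMdd date, in the given year, again as MMdd.
    static String previousDay(int year, String mmdd) {
        if (mmdd == null || mmdd.length() != 4) {
            throw new IllegalArgumentException("expected a date in MMdd form");
        }
        int month = Integer.parseInt(mmdd.substring(0, 2));
        int day = Integer.parseInt(mmdd.substring(2, 4));
        // LocalDate knows the real calendar of that year, including leap days.
        return LocalDate.of(year, month, day).minusDays(1).format(MMDD);
    }

    public static void main(String[] args) {
        System.out.println(previousDay(2016, "0301")); // leap year
        System.out.println(previousDay(2015, "0301")); // common year
    }
}
```

For 1 March this returns "0229" in 2016 and "0228" in 2015.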
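Presentation results of the web application
Site overview
Traffic analysis
Source analysis
Visitor analysis
OVER, this concludes the whole hands-on data project!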