大数据日志分析产品——SaaS Cloud, e.g. Papertrail, Loggly, Sumo Logic;Open Source Frameworks, e.g. ELK stack, Graylog;Enterprise Products, e.g. TIBCO LogLogic, I

简介:

Learn how you can maximize big data in the cloud with Apache Hadoop. Download this eBook now. Brought to you in partnership with Hortonworks.

In February 2016, I presented a brand new talk at OOP in Munich: “Comparison of Frameworks and Tools for Big Data Log Analytics and IT Operations Analytics”. The focus of the talk is to discuss different open source frameworks, SaaS cloud offerings and enterprise products for analyzing big masses of distributed log events. This topic is getting much more traction these days with the emerging architecture concept of Microservices.

Key Take-Aways

  • Log Analytics enables IT Operations Analytics for Machine Data
  • Correlation of Events is the Key for Added Business Value
  • Log Management is complementary to other Big Data Components

Log Management with Papertrail, ELK Stack, TIBCO LogLogic, Splunk, etc.

Log Management has been a mature concept for many years; used for troubleshooting, root cause analysis, and solving security issues of devices such as web servers, firewalls, routers, databases, etc. In the meantime, it is also used for analyzing applications and distributed deployments using SOA or Microservices architectures.

The slide deck compares different solutions for log management:

Image title

IT Operations Analytics (ITOA) with TIBCO Unity

IT Operations Analytics is a new, very young market growing strongly (100% year-by-year, according to Gartner). In contrary to Log Management, it does not just focus on analyzing historical data, but also enables to make complex correlations of distributed data to allow predictive analytics in (near) real time. TIBCO Unity is a product heading into this direction. You can integrate log data, but also real time events (e.g. via TIBCO Hawk) to enable monitoring, analysis and complex correlation of distributed Microserices.

What about Apache Hadoop versus Log Management and ITOA?

Why not use just Apache Hadoop? You can also store and analyze all data on its cluster! Why not just use Log Collectors (such as Apache Flume) and send data directly to Hadoop without Log Analytics “in the middle”?

Here are some reasons… Log Management and ITOA tools.

  • Are an integrated solution for data analysis (tooling, consulting, support).
  • Are built exactly for these use cases.
  • Involve data indexing, data processing (querying) and data visualization by means of dashboards and other tools out-of-the-box.
  • Offer easy-of-use tooling and allow fast time-to-market / low TCO.

The following graphic shows the different concepts and when they are usually used:

Image title

Having said that, a better Hadoop integration is possible! It might make sense to leverage both together: the great tooling for Log Management, plus the Hadoop storage with very high scalability for really BIG data. For example, TIBCO Unity uses Apache Kafka under the hood to support processing and scaling millions of messages. Thus, integration with Hadoop storage might be possible in a future release…

Slides

Finally, here is my slide deck:

xxx
 
转自:https://dzone.com/articles/frameworks-and-products-big-data-log-analytics-log















本文转自张昺华-sky博客园博客,原文链接:http://www.cnblogs.com/bonelee/p/6418854.html ,如需转载请自行联系原作者



相关实践学习
基于MaxCompute的热门话题分析
Apsara Clouder大数据专项技能认证配套课程:基于MaxCompute的热门话题分析
相关文章
|
5月前
|
人工智能 分布式计算 DataWorks
大数据AI产品月刊-2025年7月
大数据& AI 产品技术月刊【2025年7月】,涵盖7月技术速递、产品和功能发布、市场和客户应用实践等内容,帮助您快速了解阿里云大数据& AI 方面最新动态。
|
4月前
|
消息中间件 Java Kafka
搭建ELK日志收集,保姆级教程
本文介绍了分布式日志采集的背景及ELK与Kafka的整合应用。传统多服务器环境下,日志查询效率低下,因此需要集中化日志管理。ELK(Elasticsearch、Logstash、Kibana)应运而生,但单独使用ELK在性能上存在瓶颈,故结合Kafka实现高效的日志采集与处理。文章还详细讲解了基于Docker Compose构建ELK+Kafka环境的方法、验证步骤,以及如何在Spring Boot项目中整合ELK+Kafka,并通过Logback配置实现日志的采集与展示。
955 64
搭建ELK日志收集,保姆级教程
|
6月前
|
数据采集 存储 大数据
大数据之路:阿里巴巴大数据实践——日志采集与数据同步
本资料全面介绍大数据处理技术架构,涵盖数据采集、同步、计算与服务全流程。内容包括Web/App端日志采集方案、数据同步工具DataX与TimeTunnel、离线与实时数仓架构、OneData方法论及元数据管理等核心内容,适用于构建企业级数据平台体系。
|
3月前
|
数据采集 缓存 大数据
【赵渝强老师】大数据日志采集引擎Flume
Apache Flume 是一个分布式、可靠的数据采集系统,支持从多种数据源收集日志信息,并传输至指定目的地。其核心架构由Source、Channel、Sink三组件构成,通过Event封装数据,保障高效与可靠传输。
269 1
|
8月前
|
人工智能 分布式计算 大数据
大数据& AI 产品月刊【2025年4月】
大数据& AI 产品技术月刊【2025年4月】,涵盖4月技术速递、产品和功能发布、市场和客户应用实践等内容,帮助您快速了解阿里云大数据& AI 方面最新动态。
|
9月前
|
数据采集 机器学习/深度学习 人工智能
面向 MoE 和推理模型时代:阿里云大数据 AI 产品升级发布
2025 AI 势能大会上,阿里云大数据 AI 平台持续创新,贴合 MoE 架构、Reasoning Model 、 Agentic RAG、MCP 等新趋势,带来计算范式变革。多款大数据及 AI 产品重磅升级,助力企业客户高效地构建 AI 模型并落地 AI 应用。
|
4月前
|
人工智能 分布式计算 DataWorks
阿里云大数据AI产品月刊-2025年8月
阿里云大数据& AI 产品技术月刊【2025年 8 月】,涵盖 8 月技术速递、产品和功能发布、市场和客户应用实践等内容,帮助您快速了解阿里云大数据& AI 方面最新动态。
379 2
|
7月前
|
人工智能 分布式计算 DataWorks
大数据& AI 产品月刊【2025年5月】
大数据& AI 产品技术月刊【2025年5月】,涵盖5月技术速递、产品和功能发布、市场和客户应用实践等内容,帮助您快速了解阿里云大数据& AI 方面最新动态。