Python Regular Expressions in Practice: Parsing Java Logs


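Requirements

To support production monitoring and alerting, Java logs need to be parsed and the relevant information extracted to serve as the body of the alert notification.

Extraction approach

How exactly should the extraction work, and what should be extracted? After analyzing a large number of production logs in different shapes, I identified the four forms below and worked out the following extraction logic. The forms are tried in order, and the first one that matches is used.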

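Form 1

The boxed portion of the screenshot for this form marks the main content to extract: the file in which the exception occurred, the line number, the custom exception-related description, the exception type, and the exception description. The custom description and the exception description are combined and used together as the detailed description of the exception.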
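The fields listed for Form 1 can be illustrated with a minimal sketch. It assumes a simplified pattern with named groups; the group names (`level`, `file`, `line`, `custom`, `type`, `desc`) and the pattern itself are illustrative and are not the production pattern used in this article. The sample is one of the short logs from the article.

```python
import re

# Simplified Form 1 pattern (an illustrative assumption): a header line,
# followed by a line that starts with the exception type.
FORM1 = re.compile(
    r":(?P<level>ERROR|WARN) .*?\((?P<file>\S+?\.java):(?P<line>\d+)\)"
    r"(?P<custom>[^\n]*)\n"
    r"(?P<type>[\w.$]*(?:Exception|Error))(?P<desc>[^\n]*)"
)

sample = (
    "2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 "
    "(TimedSupervisorTask.java:79) - task supervisor threw an exception\n"
    "java.lang.OutOfMemoryError: Java heap space\n"
)

m = FORM1.search(sample)
fields = m.groupdict()
# The custom description and the exception description together form the
# detailed description of the exception.
detail = (fields["custom"] + fields["desc"]).strip()

print(fields["file"], fields["line"], fields["type"])
print(detail)
```

Named groups are used here only to make the mapping between regex groups and extracted fields easier to read; positional groups work just as well.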

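Form 2

Similar to Form 1. If no line consists solely of the exception type, take the exception type that follows the last "Caused by:" together with its description.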
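One way to pick the last "Caused by:" is to rely on greedy matching: under `re.DOTALL`, a leading `.*` consumes as much of the log as it can, so the "Caused by:" that follows it is the final one. The sketch below assumes a simplified pattern and an abbreviated version of one of the article's logs; the names `LAST_CAUSE`, `type`, and `desc` are illustrative.

```python
import re

# Illustrative assumption: the greedy ".*" (with re.DOTALL so that it also
# crosses newlines) backtracks from the end of the text, so the first
# "Caused by: " it settles on is the last one in the log.
LAST_CAUSE = re.compile(
    r".*Caused by: (?P<type>[^:\s]+?(?:Exception|Error))(?P<desc>[^\n]*)",
    re.DOTALL,
)

# Abbreviated sample with two "Caused by:" lines.
sample = (
    "2021-10-23 14:03:00,785:ERROR kafka-async-consumer-1 "
    "(ConsumeSupport.java:104) - kafka消费失败\n"
    "\tat java.lang.Thread.run(Thread.java:748)\n"
    "Caused by: java.sql.BatchUpdateException: "
    "Duplicate entry 'SF2110230005145' for key 'PRIMARY'\n"
    "\t... 61 more\n"
    "Caused by: com.mysql.jdbc.exceptions.jdbc4."
    "MySQLIntegrityConstraintViolationException: "
    "Duplicate entry 'SF2110230005145' for key 'PRIMARY'\n"
    "\t... 71 more\n"
)

m = LAST_CAUSE.search(sample)
last_type = m.group("type")
last_desc = m.group("desc")
print(last_type)
```

On very long logs a greedy `.*` can cause noticeable backtracking; searching for all "Caused by:" occurrences and taking the last one is a possible alternative.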

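Form 3

If neither Form 1 nor Form 2 matches, try Form 3. In this form, the exception type and its description are embedded in the custom exception-related description.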
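When the exception type sits inside the message, Chinese text may be directly adjacent to the class name. A minimal sketch of how a character class can keep that text out of the type, assuming a simplified pattern (the name `INLINE_TYPE` and the group names are illustrative); the sample is one of the single-line logs from the article.

```python
import re

# Illustrative assumption: the type may not contain ':', '>', whitespace or
# CJK characters (\u4e00-\u9fa5), so adjacent Chinese text stays in "custom".
INLINE_TYPE = re.compile(
    r"\((?P<file>\S+?\.java):(?P<line>\d+)\)(?P<custom>[^\n]*?)"
    r"(?P<type>[^:\s>\u4e00-\u9fa5]*(?:Exception|Error))(?P<desc>[^\n]*)"
)

sample = (
    "2022-01-08 13:46:30,668:ERROR http-nio-9524-exec-5 "
    "(WaitSettleDealServiceImpl.java:147) - 添加退订单到账单失败"
    "com.cmall.commons.service.exception.HttpMessageException: "
    "[411]退货单添加失败,该分组已经结束对账,无法添加退货单"
)

m = INLINE_TYPE.search(sample)
inline = m.groupdict()
print(inline["type"])
```

The lazy `[^\n]*?` in front of the type makes the regex stop at the first position where a valid type can start, which is why the Chinese prefix ends up in `custom` rather than in `type`.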

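Form 4

If none of the first three forms matches, fall back to this last form. There is no exception type; only the log level "ERROR" marks the entry as an exception log.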
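A minimal sketch of this fallback, assuming a simplified pattern that captures only the level, source file, line number, and message. The name `LEVEL_ONLY` and the group names are illustrative. This sketch accepts only ERROR, following the description above, whereas the article's own fourth pattern accepts both ERROR and WARN.

```python
import re

# Illustrative assumption: with no exception type available, only the header
# line is parsed, and the text after " - " becomes the whole description.
LEVEL_ONLY = re.compile(
    r":(?P<level>ERROR) .*?\((?P<file>\S+?\.java):(?P<line>\d+)\)"
    r"\s*-?\s*(?P<message>[^\n]*)"
)

sample = (
    "2021-10-18 09:22:23,849:ERROR [-]  (DicountAssembler.java:266)"
    " - task supervisor threw an exception ...."
)

m = LEVEL_ONLY.search(sample)
fallback = m.groupdict()
print(fallback["file"], fallback["line"], fallback["message"])
```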

Implementation

#!/usr/bin/env python
#-*- coding:utf-8 -*-
import re
log_list = [
'''
2021-10-18 09:22:41,079:ERROR http-nio-9330-exec-4 (DirectJDKLog.java:181) - Servlet.service() for servlet [dispatcherServlet] in context with path [/finance] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause
java.lang.NullPointerException
  at java.util.Comparator.lambda$comparing$77a9974f$1(Comparator.java:469) ~[?:1.8.0_202]
  at java.util.TreeMap.put(TreeMap.java:552) ~[?:1.8.0_202]
''',
'''
2021-10-18 09:22:55,222:WARN kafka-async-consumer-2 (FeignClientsErrorDecoder.java:43) - read Exception failed!
com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
    at [Source: java.io.InputStreamReader@743333a3; line: 1, column: 0]
  at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:270) ~[jackson-databind-2.8.4.jar!/:2.8.4]
''',
'''
2021-10-18 09:22:52,975:ERROR [,] parallel-2 (AccessLogWebFilter.java:60) - [accessId=616ccc49ff642e00010a4e8c] 发生网关内部错误
org.springframework.web.server.ResponseStatusException: 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT35S"; nested exception is org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S
  at org.springframework.cloud.gateway.filter.NettyRoutingFilter.lambda$filter$5(NettyRoutingFilter.java:211) ~[spring-cloud-gateway-core-2.1.3.RELEASE.jar!/:2.1.3.RELEASE]
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
Caused by: org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S
    at
''',
'''
2021-10-18 09:22:41,905:WARN http-nio-8080-exec-60 (VehicleOeImpl.java:1000) - 批量更新第三方价格失败1---->
org.springframework.jdbc.BadSqlGrammarException:
### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline
### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline
### The error occurred while setting parameters
### SQL:
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
; bad SQL grammar []; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
  at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:91)
  at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:434)
  ... 70 more
''',
'''
2021-10-17 18:39:33,066:ERROR http-nio-10062-exec-34 TID: 962118fb93d345bc92af98499ad0f771.3235.16344671730621817 (DirectJDKLog.java:181) - Servlet.service() for servlet [dispatcherServlet] in context with path [/orders/seller] threw exception [Request processing failed; nested exception is com.cmall.commons.service.exception.HttpMessageException: [400]标名为空] with root cause
Exception: 标名为空
  at com.cmall.commons.utils.Assert.fail(Assert.java:553) ~[icec-cloud-commons-0.4.5.jar!/:?]
  at com.cmall.commons.utils.Assert.notBlank(Assert.java:112) ~[icec-cloud-commons-0.4.5.jar!/:?]
''',
'''
2021-10-18 09:22:23,849:ERROR http-nio-10030-exec-2 TID: ed41cdfb8d5d4953a713285802c56032.80.16345201436864709 (DicountAssembler.java:266) - 查询商品类优惠结果失败:DiscountProductRequest(companyId=IYl6MgdkiG9KoBMYUJo, userLoginId=5c385563ad996c47bf5f7ccd, provinceGeoId=CN-43, cityGeoId=284)
feign.FeignException: status 500 reading DiscountProductClient#listDiscountProducts(DiscountProductRequest); content:
{"timestamp":1634520143844,"status":500,"error":"Internal Server Error","exception":"com.cmall.commons.service.exception.HttpMessageException","message":"优惠前的不含税价格不能为空","path":"/discountPromotion/listDiscountProducts"}
  at feign.FeignException.errorStatus(FeignException.java:62) ~[feign-core-9.3.1.jar!/:?]
  at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
''',
'''2021-10-18 09:22:23,849:ERROR [-]  (DicountAssembler.java:266) - task supervisor threw an exception ....''',
'''
2021-10-18 09:13:13,940:ERROR kafka-async-consumer-9 (ConsumeSupport.java:104) - kafka消费失败, cid:616cca245cc0a90001ead690, message:{"jsonMessageType":"com.cmall.ec.cloud.scheduletask.values.kafka.command.DistributedDelayCommand","id":"616cca24d52d1d00010531dd","usage":"EVENT","service":"schedule-task-service","topic":"prod-quote-command-delay","timeStamp":1634519588985,"delayTimes":16983,"producerTaskId":"B21101807668","inquiryId":"B21101807668","resolveBatchId":"616cca019cf7b70001c35477","resolveIds":["616cca00d52d1d000138be9a"],"type":"AUTO","retryCount":2,"producer":"DistributedDelayCommand"}, cause:java.lang.reflect.InvocationTargetException
  at sun.reflect.GeneratedMethodAccessor4946.invoke(Unknown Source)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at java.lang.Thread.run(Thread.java:748)
Caused by: com.cmall.messagebus.exception.MessageBaseException: 系统自动报价失败-->org.springframework.dao.DuplicateKeyException:
### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
; SQL []; Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
  at com.cmall.ec.cloud.quotation.handler.QuotationAllocationHandler.intelligentAutoQuoteDispatcher(QuotationAllocationHandler.java:80)
  ... 9 more
''',
'''
2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 (TimedSupervisorTask.java:79) - task supervisor threw an exception
java.lang.OutOfMemoryError: Java heap space
''',
'''2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 (TimedSupervisorTask.java:79) - task supervisor threw an exception
java.lang.OutOfMemoryError: Java heap space
at''',
'''
2021-10-23 14:03:00,785:ERROR kafka-async-consumer-1 (ConsumeSupport.java:104) - kafka消费失败, cid:6173a593f90d4200010ce3fe, message:{"jsonMessageType":"com.cmall.ec.cloud.events.order.OrderSent","id":"6173a593cb769e0001402a3b","usage":"EVENT","service":"orders-service","topic":"prod-order","timeStamp":1634968979938,"orderId":"S2110230001835","shipmentType":"LOGISTICS","logisticsCompany":"快送","shipmentNum":"","shipGroupId":"6173a59355c35d00014c390c","remark":"","productStoreId":"SZQD0001","userLoginId":"5e0aa6649cf5260001336437","username":"许庆杰","items":[{"productId":"00001","quantity":1}],"isLogisticsFeePayOnLine":true}, cause:java.lang.reflect.InvocationTargetException
  at sun.reflect.GeneratedMethodAccessor1909.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
Caused by: com.baomidou.mybatisplus.exceptions.MybatisPlusException: Error: Cannot execute insertBatch Method. Cause
  at com.baomidou.mybatisplus.service.impl.ServiceImpl.insertBatch(ServiceImpl.java:137)
  at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
  ... 9 more
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error flushing statements.  Cause: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
### Cause: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
  at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
  ... 51 more
Caused by: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
  at java.lang.reflect.Method.invoke(Method.java:498)
  ... 52 more
Caused by: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
  at sun.reflect.GeneratedConstructorAccessor1540.newInstance(Unknown Source)
  ... 61 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
  at sun.reflect.GeneratedConstructorAccessor1537.newInstance(Unknown Source)
  ... 71 more
''',
'''
2021-10-25 16:29:15,853:ERROR reactor-http-epoll-3 (CompositeLog.java:122) - 500 Server Error for HTTP POST "/job-service/api/registry"
reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
''',
'''2021-11-06 07:09:04,781:WARN http-nio-10011-exec-85 (AdminService.java:97) - 任务执行失败, code:500, reason:java.lang.OutOfMemoryError: Java heap space''',
'''2022-01-08 13:46:30,668:ERROR http-nio-9524-exec-5 (WaitSettleDealServiceImpl.java:147) - 添加退订单到账单失败com.cmall.commons.service.exception.HttpMessageException: [411]退货单添加失败,该分组已经结束对账,无法添加退货单'''
]
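# Tried in order, first match wins. Raw strings keep the regex escapes intact.
exception_match_pattern_list = [
    # Form 1: the exception type starts the line right after the header
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*)\n([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)(.*?)(\s+at\s|$)',
    # Form 2: the exception type after the last "Caused by: "
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\).*([^\n]*Caused by: )([^:\s>]*?Exception|[^:\s]*?Error)(.*?)(\s+at\s*|$)',
    # Form 3: the exception type embedded in the header line's message
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)([^\n]*?)([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)([^\n]*?)(\s+at\s|$)',
    # Form 4: no exception type; only the level marks the entry
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*?)\n*?\s*?([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)*?(.*?)(\s+at\s|$)'
]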
for log_index, log in enumerate(log_list):
    matched = False
    for pattern_index, flag_pattern in enumerate(exception_match_pattern_list):
        # re.DOTALL lets '.' cross newlines so a pattern can span the stack trace
        match_result = re.findall(flag_pattern, log, re.DOTALL)
        if match_result:
            # Only the first match in each log entry is reported
            print('Matched pattern %s,' % (pattern_index + 1), 'result:', match_result[0])
            matched = True
            break
    if not matched:
        print('Log entry %s does not match any pattern' % (log_index + 1))

Extraction results

Matched pattern 1, result: ('ERROR', 'DirectJDKLog.java', '181', ' - Servlet.service() for servlet [dispatcherServlet] in context with path [/finance] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause', 'java.lang.NullPointerException', '', '\n\tat ')
Matched pattern 1, result: ('WARN', 'FeignClientsErrorDecoder.java', '43', ' - read Exception failed!', 'com.fasterxml.jackson.databind.JsonMappingException', ': No content to map due to end-of-input', '\n    at ')
Matched pattern 1, result: ('ERROR', 'AccessLogWebFilter.java', '60', ' - [accessId=616ccc49ff642e00010a4e8c] 发生网关内部错误', 'org.springframework.web.server.ResponseStatusException', ': 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT35S"; nested exception is org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S', '\n\tat ')
Matched pattern 1, result: ('WARN', 'VehicleOeImpl.java', '1000', ' - 批量更新第三方价格失败1---->', 'org.springframework.jdbc.BadSqlGrammarException', ':\n### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty\n### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline\n### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline\n### The error occurred while setting parameters\n### SQL:\n### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty\n; bad SQL grammar []; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty', '\n\tat ')
Matched pattern 1, result: ('ERROR', 'DirectJDKLog.java', '181', ' - Servlet.service() for servlet [dispatcherServlet] in context with path [/orders/seller] threw exception [Request processing failed; nested exception is com.cmall.commons.service.exception.HttpMessageException: [400]标名为空] with root cause', 'Exception', ': 标名为空', '\n\tat ')
Matched pattern 1, result: ('ERROR', 'DicountAssembler.java', '266', ' - 查询商品类优惠结果失败:DiscountProductRequest(companyId=IYl6MgdkiG9KoBMYUJo, userLoginId=5c385563ad996c47bf5f7ccd, provinceGeoId=CN-43, cityGeoId=284)', 'feign.FeignException', ': status 500 reading DiscountProductClient#listDiscountProducts(DiscountProductRequest); content:\n{"timestamp":1634520143844,"status":500,"error":"Internal Server Error","exception":"com.cmall.commons.service.exception.HttpMessageException","message":"优惠前的不含税价格不能为空","path":"/discountPromotion/listDiscountProducts"}', '\n\tat ')
Matched pattern 4, result: ('ERROR', 'DicountAssembler.java', '266', '', '', ' - task supervisor threw an exception ....', '')
Matched pattern 2, result: ('ERROR', 'ConsumeSupport.java', '104', 'Caused by: ', 'com.cmall.messagebus.exception.MessageBaseException', ": 系统自动报价失败-->org.springframework.dao.DuplicateKeyException:\n### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'\n### The error may involve defaultParameterMap\n### The error occurred while setting parameters\n### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'\n; SQL []; Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'", '\n\tat ')
Matched pattern 1, result: ('ERROR', 'TimedSupervisorTask.java', '79', ' - task supervisor threw an exception', 'java.lang.OutOfMemoryError', ': Java heap space', '')
Matched pattern 1, result: ('ERROR', 'TimedSupervisorTask.java', '79', ' - task supervisor threw an exception', 'java.lang.OutOfMemoryError', ': Java heap space\nat', '')
Matched pattern 2, result: ('ERROR', 'ConsumeSupport.java', '104', 'Caused by: ', 'com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException', ": Duplicate entry 'SF2110230005145' for key 'PRIMARY'", '\n\tat ')
Matched pattern 1, result: ('ERROR', 'CompositeLog.java', '122', ' - 500 Server Error for HTTP POST "/job-service/api/registry"', 'reactor.netty.http.client.PrematureCloseException', ': Connection prematurely closed BEFORE response', '')
Matched pattern 3, result: ('WARN', 'AdminService.java', '97', ' - 任务执行失败, code:500, reason:', 'java.lang.OutOfMemoryError', ': Java heap space', '')
Matched pattern 3, result: ('ERROR', 'WaitSettleDealServiceImpl.java', '147', ' - 添加退订单到账单失败', 'com.cmall.commons.service.exception.HttpMessageException', ': [411]退货单添加失败,该分组已经结束对账,无法添加退货单', '')
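
Since the extracted fields are meant to become the body of an alert notification, a result tuple still has to be turned into readable text. The sketch below shows one possible way to do that. The helper name `build_alert_text` and the field labels are hypothetical and do not come from the original implementation. It assumes the seven-item tuple layout shown above, where the second pattern stores the literal "Caused by: " in the fourth slot.

```python
def build_alert_text(match_tuple):
    """Format one seven-item result tuple as alert text (hypothetical helper)."""
    level, file_name, line_no, custom, exc_type, exc_desc, _tail = match_tuple
    # The second pattern captures the literal "Caused by: " in this slot,
    # which carries no information for the alert, so it is dropped.
    if custom.strip() == "Caused by:":
        custom = ""
    # Merge the custom description and the exception description into one
    # detail string, trimming separators such as " - " and ": ".
    parts = [part.strip(" -:\n") for part in (custom, exc_desc)]
    detail = "; ".join(part for part in parts if part)
    return "\n".join([
        "Level: %s" % level,
        "Location: %s:%s" % (file_name, line_no),
        "Exception: %s" % (exc_type or "N/A"),
        "Detail: %s" % detail,
    ])


example = ('ERROR', 'TimedSupervisorTask.java', '79',
           ' - task supervisor threw an exception',
           'java.lang.OutOfMemoryError', ': Java heap space', '')
print(build_alert_text(example))
```

For a Form 4 result the exception type is empty, so the sketch prints "N/A" in its place and the detail consists of the message alone.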