Python Regular Expressions in Practice: Parsing Java Logs

Requirements

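Production monitoring and alerting requires parsing Java logs and extracting the relevant information, which then forms the body of the alert notification.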

Extraction Approach

How should the information be extracted, and which parts of a log entry matter? After analyzing a large number of production logs in different shapes, the author identified the four forms below and defined extraction logic for each.

Form 1

[Figure: sample log entry of form 1, with the parts to extract highlighted]

In the figure above, the highlighted parts are the main content to extract: the file where the exception occurred, the line number, the custom exception message, the exception type, and the exception description. The custom message and the exception description are later merged into a single detailed description of the exception.
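
As a minimal sketch of form 1, the snippet below applies the first pattern from the implementation section, written as a raw string, to the author's OutOfMemoryError sample. The names `FORM1` and `sample` are illustrative and do not appear in the original script.

```python
import re

# First pattern from the implementation section, copied as a raw string.
FORM1 = re.compile(
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*)\n'
    r'([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)(.*?)(\s+at\s|$)',
    re.DOTALL,
)

sample = (
    '2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 (TimedSupervisorTask.java:79)'
    ' - task supervisor threw an exception\n'
    'java.lang.OutOfMemoryError: Java heap space\n'
)

# The seventh group only marks where the description ends, so it is discarded.
level, file_name, line_no, message, exc_type, exc_desc, _ = FORM1.search(sample).groups()
print(file_name, line_no)   # file and line reported in the log header
print(exc_type)             # exception type found on its own line
print(message + exc_desc)   # custom message and exception description combined
```

The last group stops the description at the first stack frame (`at ...`) or at the end of the entry, so the stack trace itself never ends up in the alert text.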

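Form 2

[Figure: sample log entry of form 2]

Similar to form 1, but when no exception type sits on a line of its own, take the exception type that follows the last "Caused by:", together with its description.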
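
The "last Caused by" rule can be sketched with the second pattern from the implementation section. The sample here is an abridged version of one of the author's logs (the JSON message body and most stack frames are left out), and `FORM2` and `sample` are illustrative names.

```python
import re

# Second pattern from the implementation section, copied as a raw string.
FORM2 = re.compile(
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\).*([^\n]*Caused by: )'
    r'([^:\s>]*?Exception|[^:\s]*?Error)(.*?)(\s+at\s*|$)',
    re.DOTALL,
)

# Abridged from the author's sample: two chained causes instead of five.
sample = (
    '2021-10-23 14:03:00,785:ERROR kafka-async-consumer-1 (ConsumeSupport.java:104)'
    ' - kafka消费失败, cause:java.lang.reflect.InvocationTargetException\n'
    '    at java.lang.Thread.run(Thread.java:748)\n'
    "Caused by: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'\n"
    '    at sun.reflect.GeneratedConstructorAccessor1540.newInstance(Unknown Source)\n'
    'Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException:'
    " Duplicate entry 'SF2110230005145' for key 'PRIMARY'\n"
    '    at sun.reflect.GeneratedConstructorAccessor1537.newInstance(Unknown Source)\n'
)

groups = FORM2.search(sample).groups()
exc_type, exc_desc = groups[4], groups[5]
print(exc_type)   # type taken from the last "Caused by: " line
print(exc_desc)   # its description, cut off before the next stack frame
```

Because `.*` is greedy and `re.DOTALL` lets it cross line breaks, the engine backtracks from the end of the entry and settles on the last "Caused by: ". Note that in this pattern the fourth group holds the literal "Caused by: " prefix rather than the custom message.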

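Form 3

[Figure: sample log entry of form 3]

When neither form 1 nor form 2 matches, try form 3. Here the exception type and its description are embedded in the custom exception message itself.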
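
A small sketch of form 3, using the third pattern from the implementation section and the author's single-line WARN sample. `FORM3` and `sample` are illustrative names.

```python
import re

# Third pattern from the implementation section, copied as a raw string.
FORM3 = re.compile(
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)([^\n]*?)'
    r'([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)([^\n]*?)(\s+at\s|$)',
    re.DOTALL,
)

sample = (
    '2021-11-06 07:09:04,781:WARN http-nio-10011-exec-85 (AdminService.java:97)'
    ' - 任务执行失败, code:500, reason:java.lang.OutOfMemoryError: Java heap space'
)

level, file_name, line_no, message, exc_type, exc_desc, _ = FORM3.search(sample).groups()
print(repr(message))   # text of the header line before the exception type
print(exc_type)        # exception type embedded in the message
print(exc_desc)        # remainder of the same line
```

The `[^\n]*?` groups keep the whole match on the header line, and excluding `\u4e00-\u9fa5` from the type prevents Chinese message text that directly precedes the class name from being captured as part of it.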

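Form 4

[Figure: sample log entry of form 4]

When none of the first three forms matches, fall back to this last form. It carries no exception type; only the log level "ERROR" marks the entry as an exception log.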
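
A sketch of the fallback, using the fourth pattern from the implementation section on the author's sample that has no exception type. `FORM4` and `sample` are illustrative names. The snippet uses `Match.groups(default='')` because the optional exception-type group does not take part in the match; `re.findall`, which the author's script uses, reports such a group as an empty string, while `Match.groups()` would return `None` by default.

```python
import re

# Fourth pattern from the implementation section, copied as a raw string.
FORM4 = re.compile(
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*?)\n*?\s*?'
    r'([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)*?(.*?)(\s+at\s|$)',
    re.DOTALL,
)

sample = (
    '2021-10-18 09:22:23,849:ERROR [-]  (DicountAssembler.java:266)'
    ' - task supervisor threw an exception ....'
)

# default='' mirrors what re.findall returns for a group that did not participate.
groups = FORM4.search(sample).groups(default='')
level, file_name, line_no, message, exc_type, exc_desc, _ = groups
print(repr(exc_type))   # empty: this form has no exception type
print(exc_desc)         # the whole message lands in the description slot
```

Since the regular expressions are case-sensitive, the lowercase word "exception" in the message does not count as an exception type, which is why the third pattern does not match this entry.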

Implementation

#!/usr/bin/env python
#-*- coding:utf-8 -*-

import re

log_list = [
'''
2021-10-18 09:22:41,079:ERROR http-nio-9330-exec-4 (DirectJDKLog.java:181) - Servlet.service() for servlet [dispatcherServlet] in context with path [/finance] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause
java.lang.NullPointerException
    at java.util.Comparator.lambda$comparing$77a9974f$1(Comparator.java:469) ~[?:1.8.0_202]
    at java.util.TreeMap.put(TreeMap.java:552) ~[?:1.8.0_202]
''',
'''
2021-10-18 09:22:55,222:WARN kafka-async-consumer-2 (FeignClientsErrorDecoder.java:43) - read Exception failed!

com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
    at [Source: java.io.InputStreamReader@743333a3; line: 1, column: 0]
    at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:270) ~[jackson-databind-2.8.4.jar!/:2.8.4]
''',
'''
2021-10-18 09:22:52,975:ERROR [,] parallel-2 (AccessLogWebFilter.java:60) - [accessId=616ccc49ff642e00010a4e8c] 发生网关内部错误
org.springframework.web.server.ResponseStatusException: 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT35S"; nested exception is org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S
    at org.springframework.cloud.gateway.filter.NettyRoutingFilter.lambda$filter$5(NettyRoutingFilter.java:211) ~[spring-cloud-gateway-core-2.1.3.RELEASE.jar!/:2.1.3.RELEASE]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
Caused by: org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S
    at
''',
'''
2021-10-18 09:22:41,905:WARN http-nio-8080-exec-60 (VehicleOeImpl.java:1000) - 批量更新第三方价格失败1---->
org.springframework.jdbc.BadSqlGrammarException:
### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline
### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline
### The error occurred while setting parameters
### SQL:
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
; bad SQL grammar []; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
    at org.springframework.jdbc.support.SQLExceptionSubclassTranslator.doTranslate(SQLExceptionSubclassTranslator.java:91)
    at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:73)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:434)
    ... 70 more
''',
'''
2021-10-17 18:39:33,066:ERROR http-nio-10062-exec-34 TID: 962118fb93d345bc92af98499ad0f771.3235.16344671730621817 (DirectJDKLog.java:181) - Servlet.service() for servlet [dispatcherServlet] in context with path [/orders/seller] threw exception [Request processing failed; nested exception is com.cmall.commons.service.exception.HttpMessageException: [400]标名为空] with root cause
Exception: 标名为空
    at com.cmall.commons.utils.Assert.fail(Assert.java:553) ~[icec-cloud-commons-0.4.5.jar!/:?]
    at com.cmall.commons.utils.Assert.notBlank(Assert.java:112) ~[icec-cloud-commons-0.4.5.jar!/:?]
''',
'''
2021-10-18 09:22:23,849:ERROR http-nio-10030-exec-2 TID: ed41cdfb8d5d4953a713285802c56032.80.16345201436864709 (DicountAssembler.java:266) - 查询商品类优惠结果失败:DiscountProductRequest(companyId=IYl6MgdkiG9KoBMYUJo, userLoginId=5c385563ad996c47bf5f7ccd, provinceGeoId=CN-43, cityGeoId=284)
feign.FeignException: status 500 reading DiscountProductClient#listDiscountProducts(DiscountProductRequest); content:
{"timestamp":1634520143844,"status":500,"error":"Internal Server Error","exception":"com.cmall.commons.service.exception.HttpMessageException","message":"优惠前的不含税价格不能为空","path":"/discountPromotion/listDiscountProducts"}
    at feign.FeignException.errorStatus(FeignException.java:62) ~[feign-core-9.3.1.jar!/:?]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
''',
'''2021-10-18 09:22:23,849:ERROR [-]  (DicountAssembler.java:266) - task supervisor threw an exception ....''',
'''
2021-10-18 09:13:13,940:ERROR kafka-async-consumer-9 (ConsumeSupport.java:104) - kafka消费失败, cid:616cca245cc0a90001ead690, message:{"jsonMessageType":"com.cmall.ec.cloud.scheduletask.values.kafka.command.DistributedDelayCommand","id":"616cca24d52d1d00010531dd","usage":"EVENT","service":"schedule-task-service","topic":"prod-quote-command-delay","timeStamp":1634519588985,"delayTimes":16983,"producerTaskId":"B21101807668","inquiryId":"B21101807668","resolveBatchId":"616cca019cf7b70001c35477","resolveIds":["616cca00d52d1d000138be9a"],"type":"AUTO","retryCount":2,"producer":"DistributedDelayCommand"}, cause:java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor4946.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.cmall.messagebus.exception.MessageBaseException: 系统自动报价失败-->org.springframework.dao.DuplicateKeyException:
### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
### The error may involve defaultParameterMap
### The error occurred while setting parameters
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
; SQL []; Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'
    at com.cmall.ec.cloud.quotation.handler.QuotationAllocationHandler.intelligentAutoQuoteDispatcher(QuotationAllocationHandler.java:80)
    ... 9 more
''',
'''
2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 (TimedSupervisorTask.java:79) - task supervisor threw an exception
java.lang.OutOfMemoryError: Java heap space
''',
'''2021-10-16 14:37:19,951:ERROR DiscoveryClient-1 (TimedSupervisorTask.java:79) - task supervisor threw an exception
java.lang.OutOfMemoryError: Java heap space
at''',
'''
2021-10-23 14:03:00,785:ERROR kafka-async-consumer-1 (ConsumeSupport.java:104) - kafka消费失败, cid:6173a593f90d4200010ce3fe, message:{"jsonMessageType":"com.cmall.ec.cloud.events.order.OrderSent","id":"6173a593cb769e0001402a3b","usage":"EVENT","service":"orders-service","topic":"prod-order","timeStamp":1634968979938,"orderId":"S2110230001835","shipmentType":"LOGISTICS","logisticsCompany":"快送","shipmentNum":"","shipGroupId":"6173a59355c35d00014c390c","remark":"","productStoreId":"SZQD0001","userLoginId":"5e0aa6649cf5260001336437","username":"许庆杰","items":[{"productId":"00001","quantity":1}],"isLogisticsFeePayOnLine":true}, cause:java.lang.reflect.InvocationTargetException
    at sun.reflect.GeneratedMethodAccessor1909.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.baomidou.mybatisplus.exceptions.MybatisPlusException: Error: Cannot execute insertBatch Method. Cause
    at com.baomidou.mybatisplus.service.impl.ServiceImpl.insertBatch(ServiceImpl.java:137)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    ... 9 more
Caused by: org.apache.ibatis.exceptions.PersistenceException:
### Error flushing statements.  Cause: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
### Cause: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
    at org.apache.ibatis.exceptions.ExceptionFactory.wrapException(ExceptionFactory.java:30)
    ... 51 more
Caused by: org.apache.ibatis.executor.BatchExecutorException: com.cmall.ec.cloud.service.dao.mapper.ShipFeeMapper.insert (batch index #1) failed. Cause: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
    at java.lang.reflect.Method.invoke(Method.java:498)
    ... 52 more
Caused by: java.sql.BatchUpdateException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
    at sun.reflect.GeneratedConstructorAccessor1540.newInstance(Unknown Source)
    ... 61 more
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'SF2110230005145' for key 'PRIMARY'
    at sun.reflect.GeneratedConstructorAccessor1537.newInstance(Unknown Source)
    ... 71 more
''',
'''
2021-10-25 16:29:15,853:ERROR reactor-http-epoll-3 (CompositeLog.java:122) - 500 Server Error for HTTP POST "/job-service/api/registry"
reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
''',
'''2021-11-06 07:09:04,781:WARN http-nio-10011-exec-85 (AdminService.java:97) - 任务执行失败, code:500, reason:java.lang.OutOfMemoryError: Java heap space''',
'''2022-01-08 13:46:30,668:ERROR http-nio-9524-exec-5 (WaitSettleDealServiceImpl.java:147) - 添加退订单到账单失败com.cmall.commons.service.exception.HttpMessageException: [411]退货单添加失败,该分组已经结束对账,无法添加退货单'''
]

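# Patterns are tried in this order; raw strings avoid invalid-escape warnings for \s, \d and \(.
exception_match_pattern_list = [
    # Form 1: the exception type sits on its own line right after the log header.
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*)\n([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)(.*?)(\s+at\s|$)',
    # Form 2: no such line, so take the type after the last "Caused by: ".
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\).*([^\n]*Caused by: )([^:\s>]*?Exception|[^:\s]*?Error)(.*?)(\s+at\s*|$)',
    # Form 3: the exception type is embedded in the message on the header line.
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)([^\n]*?)([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)([^\n]*?)(\s+at\s|$)',
    # Form 4: no exception type at all; only the log level marks the entry.
    r':(ERROR|WARN) .+\s\(([^\s]+?\.java):(\d+)\)(.*?)\n*?\s*?([^:\s>\u4e00-\u9fa5]*Exception|[^:\s>\u4e00-\u9fa5]*Error)*?(.*?)(\s+at\s|$)'
]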

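for log_index, log in enumerate(log_list):
    # Try the patterns in priority order and stop at the first one that matches.
    for pattern_index, flag_pattern in enumerate(exception_match_pattern_list):
        # re.DOTALL lets '.' match newlines, so one pattern can span the whole entry.
        match_result = re.findall(flag_pattern, log, re.DOTALL)
        if match_result:
            # Prints "matched pattern N, result: <groups of the first match>".
            print('匹配第%s个Pattern' % (pattern_index+1), '匹配结果:', match_result[0])
            break
    else:
        # Runs only when no pattern matched: "log N does not match any regular expression".
        print('第%s条日志,不匹配任何正则表达式' % (log_index + 1))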

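Extraction Results

Each output line starts with "匹配第N个Pattern 匹配结果:", that is, "matched pattern N, result:", followed by the tuple of captured groups.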

匹配第1个Pattern 匹配结果: ('ERROR', 'DirectJDKLog.java', '181', ' - Servlet.service() for servlet [dispatcherServlet] in context with path [/finance] threw exception [Request processing failed; nested exception is java.lang.NullPointerException] with root cause', 'java.lang.NullPointerException', '', '\n\tat ')
匹配第1个Pattern 匹配结果: ('WARN', 'FeignClientsErrorDecoder.java', '43', ' - read Exception failed!', 'com.fasterxml.jackson.databind.JsonMappingException', ': No content to map due to end-of-input', '\n    at ')
匹配第1个Pattern 匹配结果: ('ERROR', 'AccessLogWebFilter.java', '60', ' - [accessId=616ccc49ff642e00010a4e8c] 发生网关内部错误', 'org.springframework.web.server.ResponseStatusException', ': 504 GATEWAY_TIMEOUT "Response took longer than timeout: PT35S"; nested exception is org.springframework.cloud.gateway.support.TimeoutException: Response took longer than timeout: PT35S', '\n\tat ')
匹配第1个Pattern 匹配结果: ('WARN', 'VehicleOeImpl.java', '1000', ' - 批量更新第三方价格失败1---->', 'org.springframework.jdbc.BadSqlGrammarException', ':\n### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty\n### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline\n### The error may involve com.cmall.ec.webapp.maindata.web.dao.vehicleOe.ThirdPartyOeMapper.updateList-Inline\n### The error occurred while setting parameters\n### SQL:\n### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty\n; bad SQL grammar []; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query was empty', '\n\tat ')
匹配第1个Pattern 匹配结果: ('ERROR', 'DirectJDKLog.java', '181', ' - Servlet.service() for servlet [dispatcherServlet] in context with path [/orders/seller] threw exception [Request processing failed; nested exception is com.cmall.commons.service.exception.HttpMessageException: [400]标名为空] with root cause', 'Exception', ': 标名为空', '\n\tat ')
匹配第1个Pattern 匹配结果: ('ERROR', 'DicountAssembler.java', '266', ' - 查询商品类优惠结果失败:DiscountProductRequest(companyId=IYl6MgdkiG9KoBMYUJo, userLoginId=5c385563ad996c47bf5f7ccd, provinceGeoId=CN-43, cityGeoId=284)', 'feign.FeignException', ': status 500 reading DiscountProductClient#listDiscountProducts(DiscountProductRequest); content:\n{"timestamp":1634520143844,"status":500,"error":"Internal Server Error","exception":"com.cmall.commons.service.exception.HttpMessageException","message":"优惠前的不含税价格不能为空","path":"/discountPromotion/listDiscountProducts"}', '\n\tat ')
匹配第4个Pattern 匹配结果: ('ERROR', 'DicountAssembler.java', '266', '', '', ' - task supervisor threw an exception ....', '')
匹配第2个Pattern 匹配结果: ('ERROR', 'ConsumeSupport.java', '104', 'Caused by: ', 'com.cmall.messagebus.exception.MessageBaseException', ": 系统自动报价失败-->org.springframework.dao.DuplicateKeyException:\n### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'\n### The error may involve defaultParameterMap\n### The error occurred while setting parameters\n### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'\n; SQL []; Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '616cca225cc0a90001e1d0d2' for key 'PRIMARY'", '\n\tat ')
匹配第1个Pattern 匹配结果: ('ERROR', 'TimedSupervisorTask.java', '79', ' - task supervisor threw an exception', 'java.lang.OutOfMemoryError', ': Java heap space', '')
匹配第1个Pattern 匹配结果: ('ERROR', 'TimedSupervisorTask.java', '79', ' - task supervisor threw an exception', 'java.lang.OutOfMemoryError', ': Java heap space\nat', '')
匹配第2个Pattern 匹配结果: ('ERROR', 'ConsumeSupport.java', '104', 'Caused by: ', 'com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException', ": Duplicate entry 'SF2110230005145' for key 'PRIMARY'", '\n\tat ')
匹配第1个Pattern 匹配结果: ('ERROR', 'CompositeLog.java', '122', ' - 500 Server Error for HTTP POST "/job-service/api/registry"', 'reactor.netty.http.client.PrematureCloseException', ': Connection prematurely closed BEFORE response', '')
匹配第3个Pattern 匹配结果: ('WARN', 'AdminService.java', '97', ' - 任务执行失败, code:500, reason:', 'java.lang.OutOfMemoryError', ': Java heap space', '')
匹配第3个Pattern 匹配结果: ('ERROR', 'WaitSettleDealServiceImpl.java', '147', ' - 添加退订单到账单失败', 'com.cmall.commons.service.exception.HttpMessageException', ': [411]退货单添加失败,该分组已经结束对账,无法添加退货单', '')
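
The requirement is to use the extracted fields as the body of an alert, with the custom message and the exception description merged into one detailed description. The sketch below shows one way a result tuple could be turned into such a payload. The function name `build_alert`, the dictionary keys, the `'; '` separator and the `'N/A'` placeholder are all assumptions made for illustration; the original article does not show this step.

```python
def build_alert(groups):
    """Turn one 7-item result tuple into a small alert payload (hypothetical format)."""
    level, file_name, line_no, message, exc_type, exc_desc, _ = groups
    # Pattern 2 stores the literal 'Caused by: ' in the fourth slot; it is not a message.
    if message.strip() == 'Caused by:':
        message = ''
    # Trim the ' - ' and ': ' separators left over from the log layout.
    parts = [part.strip(' -:\n\t') for part in (message, exc_desc)]
    return {
        'level': level,
        'location': '%s:%s' % (file_name, line_no),
        'exception': exc_type or 'N/A',  # form 4 has no exception type
        'detail': '; '.join(part for part in parts if part),
    }


# One of the result tuples printed above.
row = ('ERROR', 'TimedSupervisorTask.java', '79',
       ' - task supervisor threw an exception',
       'java.lang.OutOfMemoryError', ': Java heap space', '')
alert = build_alert(row)
print(alert['location'], alert['exception'], alert['detail'])
```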

Original: https://www.cnblogs.com/shouke/p/15807375.html
Author: 授客
Title: Python 正则表达式实战之Java日志解析

