I originally published this article on another site (a certain "sdn") and am reposting it here.
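A Complete Python Implementation of the Unix 'tail' Command
In this article we look at how to write a Python library that behaves like the Unix 'tail -f' command. The library watches a file for changes and runs a registered action whenever a new line is appended. We also fix a bug that shows up in specific scenarios, so that the code stays robust in real use.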
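1. The Original Version and Its Problems
The original Python 'tail' library was written by a Sri Lankan programmer and published on GitHub:
https://github.com/kasun/python-tail
Author of the second version: http://www.cnblogs.com/bufferfly/p/4878688.html
In real use, however, some problems appeared. In particular, when the monitored file is emptied during monitoring, or renamed by log rotation, the library loses track of it.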
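To make the two failure cases concrete, here is a minimal sketch of how truncation and rotation can be told apart. The helper names file_identity and classify_change are hypothetical and not part of the library. Note also that the code in this article only compares file sizes; the inode comparison below is an extra assumption of this sketch, and it relies on a filesystem that reports meaningful inode numbers.

```python
import os


def file_identity(path):
    """Return (inode, size) for path, or None if the file is currently missing."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_ino, st.st_size)


def classify_change(previous, current):
    """Compare two file_identity() results and name what happened in between."""
    if current is None:
        return "missing"
    if previous is None:
        return "appeared"
    prev_ino, prev_size = previous
    cur_ino, cur_size = current
    if cur_ino != prev_ino:
        # A different inode under the same name: the old file was renamed
        # away (for example by logrotate) and a new one was created.
        return "rotated"
    if cur_size < prev_size:
        # Same file, but shorter than before: it was emptied or truncated.
        return "truncated"
    if cur_size > prev_size:
        return "grown"
    return "unchanged"
```

A follower would call file_identity once per loop iteration and reopen the file whenever the result is "rotated" or "truncated".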
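2. Problem Analysis and Fix
In the original code, the line
_size = os.path.getsize(self.tailed_file)
can raise an exception, because the monitored file may briefly not exist (for example, between the moment log rotation renames it and the moment a new file is created). The exception terminates the monitoring program, so it can no longer monitor continuously. To fix this, we wrap the file-size lookup in exception handling and retry while the file does not exist.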
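The retry idea can be sketched on its own as follows. This is an illustration, not the library's exact helper: the function name wait_for_size and the retry_interval and max_wait parameters are assumptions of this sketch. The upper bound on the total wait is also an addition here; the version in the library keeps retrying indefinitely.

```python
import os
import time


def wait_for_size(path, retry_interval=0.1, max_wait=5.0):
    """Return the size of path in bytes, retrying while the file is missing.

    Re-raises the last OSError if the file is still unavailable after
    max_wait seconds.
    """
    deadline = time.time() + max_wait
    while True:
        try:
            return os.path.getsize(path)
        except OSError:
            # The file may be in the middle of a rotation; wait and try again.
            if time.time() >= deadline:
                raise
            time.sleep(retry_interval)
```

Bounding the wait means a permanently deleted file eventually surfaces as an error instead of leaving the monitor spinning forever; whether that is preferable depends on how the monitor is supervised.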
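3. The Improved Code
Below is the improved code. It includes the exception handling and retry logic, and shows how to register a callback function and start following a file.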
# -*- coding: utf-8 -*-
################################################################################
#
# Copyright (c) 2015 XX.com, Inc. All Rights Reserved
#
################################################################################
################################################################################
# This module provide ...
# Third libs those depend on:
################################################################################
"""
Compiler Python 2.7.10
Authors: xingxinghuo1000@163.com
Date: 2017-07-31
Desc: A Python lib similar to the tail command. A callback function can be
      registered, and it is triggered once for every line that is read.
Modify:
  1. Fixed a bug: the line `_size = os.path.getsize(...)` could raise an
     exception reporting that the file does not exist; added try and a wait.
"""
"""SYS LIBS
"""
import os
import re
import sys
import time

"""THIRD LIBS
"""
try:
    # import the third libs there.
    pass
except ImportError as e:
    print(e)
    os._exit(-1)

"""CUSTOM libs
Strongly recommend using abs path when using custom libs.
"""

# Good behaviors.
# It means refusing called like from xxx import *
# When `__all__` is []
__all__ = []

# Python 2 only: make utf-8 the default string encoding.
if sys.version_info[0] == 2:
    reload(sys)
    sys.setdefaultencoding('utf-8')


def send_error(msg):
    """ Send error to email.
    """
    print(msg)


#********************************************************
#* Global defines start.                                *
#********************************************************

#********************************************************
#* Global defines end.                                  *
#********************************************************


class Tail(object):
    """
    Python-Tail - Unix tail follow implementation in Python.

    python-tail can be used to monitor changes to a file.

    Example:
        import tail

        # Create a tail instance
        t = tail.Tail('file-to-be-followed')

        # Register a callback function to be called when a new line is
        # found in the followed file. If no callback function is
        # registered, new lines would be printed to standard out.
        t.register_callback(callback_function)

        # Follow the file with 5 seconds as sleep time between iterations.
        # If sleep time is not provided, 0.01 second is used as the default.
        t.follow(s=5)
    """

    def __init__(self, tailed_file):
        ''' Initiate a Tail instance.
            Check for file validity, assigns callback function to standard out.

            Arguments:
                tailed_file - File to be followed. '''
        self.check_file_validity(tailed_file)
        self.tailed_file = tailed_file
        self.callback = sys.stdout.write
        self.try_count = 0

        self.file_ = open(self.tailed_file, "r")
        self.size = os.path.getsize(self.tailed_file)
        # Go to the end of file
        self.file_.seek(0, 2)

    def reload_tailed_file(self):
        """ Reload the tailed file after it has been emptied by
            `echo "" > tailed file`, or rotated by logrotate.
            Return True on success, False if the file cannot be opened yet.
        """
        try:
            new_file = open(self.tailed_file, "r")
            self.size = os.path.getsize(self.tailed_file)
        except (IOError, OSError):
            return False
        # Release the handle to the old (emptied or renamed) file.
        self.file_.close()
        self.file_ = new_file
        # Go to the head of file
        self.file_.seek(0, 0)
        return True

    def wait_file_get_size(self, file_):
        """ Return the size of file_, retrying while it does not exist.
        """
        while True:
            try:
                return os.path.getsize(file_)
            except OSError:
                print("Error when getsize of tailed_file")
                time.sleep(0.1)

    def follow(self, s=0.01):
        """ Do a tail follow. If a callback function is registered it is
            called with every new line. Else printed to standard out.

            Arguments:
                s - Number of seconds to wait between each iteration;
                    Defaults to 0.01.
        """
        while True:
            _size = self.wait_file_get_size(self.tailed_file)
            if _size < self.size:
                # The file shrank: it was emptied or replaced, so reopen it.
                while self.try_count < 10:
                    if not self.reload_tailed_file():
                        self.try_count += 1
                    else:
                        # reload_tailed_file() has already refreshed self.size
                        self.try_count = 0
                        break
                    time.sleep(0.1)

                if self.try_count == 10:
                    raise TailError("Open %s failed after try 10 times" % self.tailed_file)
            else:
                self.size = _size

            curr_position = self.file_.tell()
            line = self.file_.readline()
            if not line:
                # Nothing new yet.
                self.file_.seek(curr_position)
            elif not line.endswith("\n"):
                # Partial line: rewind and wait for the rest of it.
                self.file_.seek(curr_position)
            else:
                self.callback(line)
            time.sleep(s)

    def register_callback(self, func):
        """ Overrides default callback function to provided function.
        """
        self.callback = func

    def check_file_validity(self, file_):
        """ Check whether the given file exists, is readable and is a file.
        """
        if not os.access(file_, os.F_OK):
            raise TailError("File '%s' does not exist" % (file_))
        if not os.access(file_, os.R_OK):
            raise TailError("File '%s' not readable" % (file_))
        if os.path.isdir(file_):
            raise TailError("File '%s' is a directory" % (file_))


class TailError(Exception):
    """ Custom error type.
    """

    def __init__(self, msg):
        """ Init.
        """
        self.message = msg

    def __str__(self):
        """ str.
        """
        return self.message


if __name__ == '__main__':
    t = Tail(sys.argv[1])

    def print_msg(msg):
        # Each line arrives with its trailing newline; strip it before printing.
        print(msg.rstrip("\n"))

    t.register_callback(print_msg)
    t.follow()

# vim: set ts=4 sw=4 sts=4 tw=100 et:
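One detail of follow() worth isolating is that a line is only handed to the callback once it ends with a newline; a half-written line is left in place by seeking back to where it started. The sketch below shows just that step. The helper read_complete_lines is hypothetical and not part of the library, and unlike follow() it drains every complete line in one call instead of reading one line per iteration.

```python
def read_complete_lines(file_obj):
    """Return every complete line currently available from file_obj.

    A trailing fragment without a newline is left unread: the file position
    is moved back to the start of the fragment, so the next call sees it
    again once the writer has finished the line.
    """
    lines = []
    while True:
        position = file_obj.tell()
        line = file_obj.readline()
        if not line:
            # End of file: nothing more to read for now.
            break
        if not line.endswith("\n"):
            # Partial line: rewind and wait for the rest of it.
            file_obj.seek(position)
            break
        lines.append(line)
    return lines
```

Without the rewind, a log line that the writer flushes in two pieces would reach the callback as two separate, broken lines.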
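4. Summary
With the changes above, our Python 'tail' library monitors file changes more reliably and keeps working even when the file is emptied or rotated. That makes it a dependable tool for scenarios that need real-time monitoring of file updates.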