【Hard Python】【第四章-日志】2、日志消费者Handler的实现

上篇文章里说完了日志实例Logger和日志管理Manager,现在该提到Handler了。Handler是日志信息的消费者,单个Logger实例注册多个Handler,每生成一个LogRecord,就会被合法的Handler消费,在不同地方打印出日志信息。

要研究Handler,首先需要看下基类的实现:

1
2
3
4
5
6
7
8
9
10
class Handler(Filterer):
def handle(self, record):
rv = self.filter(record)
if rv:
self.acquire()
try:
self.emit(record)
finally:
self.release()
return rv

Handler会通过自带的重入锁限制日志记录被串行处理。Handler也是继承Filterer,首先会通过filter过滤日志是否满足Handler的要求,如果合法,然后调用emit方法处理日志。

emit方法在基类是NotImplemented,需要子类加以实现。因此接下来我们具体抽几个例子来看。

StreamHandler

StreamHandler打印日志的实现比较简单:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
class StreamHandler(Handler):
terminator = '\n'

def emit(self, record):
try:
msg = self.format(record)
stream = self.stream
# issue 35046: merged two stream.writes into one.
stream.write(msg + self.terminator)
self.flush()
except RecursionError: # See issue 36272
raise
except Exception:
self.handleError(record)

def flush(self):
self.acquire()
try:
if self.stream and hasattr(self.stream, "flush"):
self.stream.flush()
finally:
self.release()

StreamHandler默认的streamsys.stderr。只要我们在basicConfig中给到的stream实例实现了writeflush方法,就可以被放到StreamHandler里头处理日志。日志首先通过format方法被Handler实例的formatter格式化,然后再被输出。

sys.stderr为例,我们来看标准错误流在python中的源码实现。首先需要追溯sys.stderr的来源,我们节选一部分代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
// pylifecycle.c
static PyStatus
init_sys_streams(PyThreadState *tstate)
{
fd = fileno(stderr);
std = create_stdio(config, iomod, fd, 1, "<stderr>",
config->stdio_encoding,
L"backslashreplace");
if (PySys_SetObject("__stderr__", std) < 0) {
Py_DECREF(std);
goto error;
}
if (_PySys_SetObjectId(&PyId_stderr, std) < 0) {
Py_DECREF(std);
goto error;
}
Py_DECREF(std);
}


static PyObject*
create_stdio(const PyConfig *config, PyObject* io,
int fd, int write_mode, const char* name,
const wchar_t* encoding, const wchar_t* errors)
{
buf = _PyObject_CallMethodId(io, &PyId_open, "isiOOOO",
fd, mode, buffering,
Py_None, Py_None, /* encoding, errors */
Py_None, Py_False); /* newline, closefd */
stream = _PyObject_CallMethodId(io, &PyId_TextIOWrapper, "OOOsOO",
buf, encoding_str, errors_str,
newline, line_buffering, write_through);
}

pylifecycle.c中,sys.stderr被设置,最终调用了create_stdio设置了stream实例。其中,会调用io模块的open以及TextIOWrapper方法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
// _iomodule.c
static PyObject *
_io_open_impl(PyObject *module, PyObject *file, const char *mode,
int buffering, const char *encoding, const char *errors,
const char *newline, int closefd, PyObject *opener)
{
/* Create the Raw file stream */
{
PyObject *RawIO_class = (PyObject *)&PyFileIO_Type;
#ifdef MS_WINDOWS
const PyConfig *config = _Py_GetConfig();
if (!config->legacy_windows_stdio && _PyIO_get_console_type(path_or_fd) != '\0') {
RawIO_class = (PyObject *)&PyWindowsConsoleIO_Type;
encoding = "utf-8";
}
#endif
raw = PyObject_CallFunction(RawIO_class, "OsOO",
path_or_fd, rawmode,
closefd ? Py_True : Py_False,
opener);
}

/* wraps into a buffered file */
{
PyObject *Buffered_class;

if (updating)
Buffered_class = (PyObject *)&PyBufferedRandom_Type;
else if (creating || writing || appending)
Buffered_class = (PyObject *)&PyBufferedWriter_Type; // hits here
else if (reading)
Buffered_class = (PyObject *)&PyBufferedReader_Type;
else {
PyErr_Format(PyExc_ValueError,
"unknown mode: '%s'", mode);
goto error;
}

buffer = PyObject_CallFunction(Buffered_class, "Oi", raw, buffering);
}
if (buffer == NULL)
goto error;
result = buffer;

/* if binary, returns the buffered file */
if (binary) {
Py_DECREF(modeobj);
return result;
}
}

所以总的来看,在windows中,stderr实例的创建流程大致是这样:

  • 创建了一个PyWindowsConsoleIO_Type的实例
    • PyWindowsConsoleIO_Type实质是封装了和windows命令行终端交互的方法
  • 创建了一个PyBufferedWriter_Type实例去实现缓冲输出流
    • 该实例的raw属性即为PyWindowsConsoleIO_Type的实例
  • 创建了一个TextIOWrapper实例封装PyBufferedWriter_Type实例,得到stderrstream实例

当调用stderr.writestderr.flush时,最终会调用BufferedWriterwriteflush。在write过程中会将数据写入缓冲区,而在writeflush过程中都会调用raw.write方法尝试将缓冲区数据刷掉,并写入windows命令行。有兴趣的同学,可以深入研究bufferedio.c里面的内容。

FileHandler

FileHandler继承了StreamHandler,可以将日志记录打印到文件里面,其代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
class FileHandler(StreamHandler):
def __init__(self, filename, mode='a', encoding=None, delay=False, errors=None):
filename = os.fspath(filename)
self.baseFilename = os.path.abspath(filename)
self.mode = mode
self.encoding = encoding
if "b" not in mode:
self.encoding = io.text_encoding(encoding)
self.errors = errors
self.delay = delay
self._builtin_open = open
if delay:
Handler.__init__(self)
self.stream = None
else:
StreamHandler.__init__(self, self._open())

def _open(self):
open_func = self._builtin_open
return open_func(self.baseFilename, self.mode,
encoding=self.encoding, errors=self.errors)

def emit(self, record):
if self.stream is None:
self.stream = self._open()
StreamHandler.emit(self, record)

FileHandlerStreamHandler底层提供了一个内置open方法文件流。我们知道这样的文件流也会有write以及flush方法,所以显而易见日志会输出到对应的文件当中。

TimedRotatingFileHandler

TimedRotatingFileHandler也是常用的日志Handler之一,它可以实现按规定的时间间隔打印各个时段的日志到不同文件中。其继承链如下:

1
2
3
4
5
6
7
8
# pprint.pprint(TimedRotatingFileHandler.__mro__)
(<class 'logging.handlers.TimedRotatingFileHandler'>,
<class 'logging.handlers.BaseRotatingHandler'>,
<class 'logging.FileHandler'>,
<class 'logging.StreamHandler'>,
<class 'logging.Handler'>,
<class 'logging.Filterer'>,
<class 'object'>)

首先我们来看BaseRotatingHandler,它继承了FileHandler,其实现如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
class BaseRotatingHandler(logging.FileHandler):
def emit(self, record):
try:
if self.shouldRollover(record):
self.doRollover()
logging.FileHandler.emit(self, record)
except Exception:
self.handleError(record)

def rotate(self, source, dest):
if not callable(self.rotator):
if os.path.exists(source):
os.rename(source, dest)
else:
self.rotator(source, dest)

BaseRotatingHandler在打印日志记录前,首先会通过shouldRollover方法根据日志记录信息判断是否要翻篇,如果需要的话就调用doRollOver方法翻篇,最后再调用FileHandleremit方法。shouldRolloverdoRollover方法都需要子类自己实现。

同时,BaseRotatingHandler也提供了rotate方法调用rotator或者采用默认重命名的方式执行翻篇操作。

接下来我们再看TimedRotatingFileHandler的具体实现:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
class TimedRotatingFileHandler(BaseRotatingHandler):
def __init__(self, filename, when='h', interval=1, backupCount=0,
encoding=None, delay=False, utc=False, atTime=None,
errors=None):
if os.path.exists(filename):
t = os.stat(filename)[ST_MTIME]
else:
t = int(time.time())
self.rolloverAt = self.computeRollover(t)

def computeRollover(self, currentTime):
result = currentTime + self.interval
if self.when == 'MIDNIGHT' or self.when.startswith('W'):
# 这里忽略掉计算凌晨跟一周内特定日的情况
pass
return result

def shouldRollover(self, record):
t = int(time.time())
if t >= self.rolloverAt:
return 1
return 0

def doRollover(self):
if self.stream:
self.stream.close()
self.stream = None
# get the time that this sequence started at and make it a TimeTuple
currentTime = int(time.time())
dstNow = time.localtime(currentTime)[-1]
t = self.rolloverAt - self.interval
if self.utc:
timeTuple = time.gmtime(t)
else:
timeTuple = time.localtime(t)
dstThen = timeTuple[-1]
if dstNow != dstThen:
if dstNow:
addend = 3600
else:
addend = -3600
timeTuple = time.localtime(t + addend)
dfn = self.rotation_filename(self.baseFilename + "." +
time.strftime(self.suffix, timeTuple))
if os.path.exists(dfn):
os.remove(dfn)
self.rotate(self.baseFilename, dfn)
if self.backupCount > 0:
for s in self.getFilesToDelete():
os.remove(s)
if not self.delay:
self.stream = self._open()
newRolloverAt = self.computeRollover(currentTime)
while newRolloverAt <= currentTime:
newRolloverAt = newRolloverAt + self.interval
#If DST changes and midnight or weekly rollover, adjust for this.
if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
dstAtRollover = time.localtime(newRolloverAt)[-1]
if dstNow != dstAtRollover:
if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
addend = -3600
else: # DST bows out before next rollover, so we need to add an hour
addend = 3600
newRolloverAt += addend
self.rolloverAt = newRolloverAt

在初始化的过程中,首先会解析时间设定计算时间间隔interval,然后通过computeRollover初始化rolloverAt,也就是翻篇的时间点。在emit的时候,也会直接通过当前时间判断是否超过翻篇时间,超过的话,就会执行doRollover

doRollover总共做了这几件事情:

  • 关闭当前文件流
  • 根据当前时间生成需要归档的日志文件名
  • 执行翻篇,将当前日志文件重命名,归档标注为特定时间
  • 删除冗余的旧文件
  • 如果不是lazy-load,直接调用_open生成新的文件
  • 计算新的翻篇时间点

可以看到,baseFilename其实是个中间态的文件名,实质翻篇操作会是重命名文件,修改为带特定时间信息的文件名。

总结

logging.handlers里,除了上述三种handler外,还有许许多多的handler定义。有兴趣的同学可以自行挖掘!

版权声明
本文为博客HiKariのTechLab原创文章,转载请标明出处,谢谢~~~