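Getting started
Exception handling usually involves try, except, and finally. But even without any of that syntax, an exception raised while Python is executing is still caught by the virtual machine: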
>>> 1/0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero
The division-by-zero exception
1 / 0
The corresponding instructions:
0 LOAD_CONST 0 (1)
2 LOAD_CONST 1 (0)
4 BINARY_TRUE_DIVIDE
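A listing like this can be reproduced with the dis module. The following is a minimal sketch; opcode names depend on the interpreter version (newer CPython releases emit a generic BINARY_OP instead of BINARY_TRUE_DIVIDE), so no expected output is shown.

```python
import dis

# Compile the expression on its own; "eval" mode produces a code object
# that evaluates a single expression.
code = compile("1 / 0", "<example>", "eval")

# Collect opcode names rather than relying on one version's exact listing.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)
```

Compiling does not raise: the division only happens when the code object is executed.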
The division instruction is implemented as:
TARGET(BINARY_TRUE_DIVIDE) {
PyObject *divisor = POP(); // 0, the operand pushed last
PyObject *dividend = TOP(); // 1
PyObject *quotient = PyNumber_TrueDivide(dividend, divisor);
Py_DECREF(dividend);
Py_DECREF(divisor);
SET_TOP(quotient);
if (quotient == NULL)
goto error;
DISPATCH();
}
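The operand order can be checked with a toy value stack. This is only an illustration in Python, not interpreter code; the list named stack stands in for the frame's value stack.

```python
# A toy value stack: the left operand is pushed first.
stack = []
stack.append(1)   # LOAD_CONST 1 -> the dividend
stack.append(0)   # LOAD_CONST 0 -> the divisor, now on top

divisor = stack.pop()   # mirrors POP(): removes the top entry
dividend = stack[-1]    # mirrors TOP(): reads the new top without popping

try:
    stack[-1] = dividend / divisor   # mirrors SET_TOP(quotient)
except ZeroDivisionError as exc:
    error = exc                      # the C code would `goto error` here
else:
    error = None

print(dividend, divisor, type(error).__name__)
```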
PyNumber_TrueDivide checks the operand types and dispatches to different code paths. For two ints, the function that finally runs is:
static PyObject * long_true_divide(PyObject *v, PyObject *w)
{
PyLongObject *a, *b, *x;
Py_ssize_t a_size, b_size, shift, extra_bits, diff, x_size, x_bits;
digit mask, low;
int inexact, negate, a_is_small, b_is_small;
double dx, result;
CHECK_BINOP(v, w);
a = (PyLongObject *)v;
b = (PyLongObject *)w;
a_size = Py_ABS(Py_SIZE(a));
b_size = Py_ABS(Py_SIZE(b));
negate = (Py_SIZE(a) < 0) ^ (Py_SIZE(b) < 0);
if (b_size == 0) {
PyErr_SetString(PyExc_ZeroDivisionError,
"division by zero");
goto error;
}
...
}
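When the divisor turns out to be 0, PyErr_SetString is called to raise the exception. In Python's object system, exceptions are themselves objects, and pyerrors.h declares many of them: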
[pyerrors.h]
...
PyAPI_DATA(PyObject *) PyExc_SystemExit;
PyAPI_DATA(PyObject *) PyExc_TypeError;
PyAPI_DATA(PyObject *) PyExc_UnboundLocalError;
PyAPI_DATA(PyObject *) PyExc_UnicodeError;
PyAPI_DATA(PyObject *) PyExc_UnicodeEncodeError;
PyAPI_DATA(PyObject *) PyExc_UnicodeDecodeError;
PyAPI_DATA(PyObject *) PyExc_UnicodeTranslateError;
PyAPI_DATA(PyObject *) PyExc_ValueError;
PyAPI_DATA(PyObject *) PyExc_ZeroDivisionError;
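The same point is visible from Python code: each of these C globals is exposed as a builtin class, and a caught exception is an instance of that class. A small sketch (the name caught is chosen for this example):

```python
# Exception types are ordinary classes, and classes are objects too.
print(type(ZeroDivisionError))     # the metaclass of the exception type
print(ZeroDivisionError.__mro__)   # its inheritance chain

try:
    1 / 0
except ZeroDivisionError as exc:
    caught = exc   # keep a reference; `exc` is unbound after the block

print(type(caught).__name__, caught.args)
```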
Recording the exception
After PyErr_SetString, the call chain runs PyErr_SetString -> PyErr_SetObject -> PyErr_Restore, and PyErr_Restore is where Python stores the exception:
void PyErr_Restore(PyObject *type, PyObject *value, PyObject *traceback)
{
PyThreadState *tstate = PyThreadState_GET();
PyObject *oldtype, *oldvalue, *oldtraceback;
if (traceback != NULL && !PyTraceBack_Check(traceback)) {
/* XXX Should never happen -- fatal error instead? */
/* Well, it could be None. */
Py_DECREF(traceback);
traceback = NULL;
}
// save the previous exception information
oldtype = tstate->curexc_type;
oldvalue = tstate->curexc_value;
oldtraceback = tstate->curexc_traceback;
// set the current exception information
tstate->curexc_type = type;
tstate->curexc_value = value;
tstate->curexc_traceback = traceback;
// discard the previous exception information
Py_XDECREF(oldtype);
Py_XDECREF(oldvalue);
Py_XDECREF(oldtraceback);
}
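In the end, curexc_type holds PyExc_ZeroDivisionError and curexc_value holds the string "division by zero". PyThreadState_GET returns the currently active thread, and the exception information is stored in its thread state object.
The sys module provides an interface for reading the exception information that the VM keeps in the thread state: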
import sys
try:
    1/0
except Exception:
    print(sys.exc_info()[0])  # the exception type, first recorded in tstate->curexc_type
    print(sys.exc_info()[1])  # the exception value, first recorded in tstate->curexc_value
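sys.exc_info() reports the exception currently being handled, so it only returns something useful inside a handler. A short sketch of the difference (the variable names before and after are chosen for this example):

```python
import sys

before = sys.exc_info()   # no exception is being handled yet

try:
    1 / 0
except ZeroDivisionError:
    exc_type, exc_value, exc_tb = sys.exc_info()

after = sys.exc_info()    # the handler has finished

print(before, exc_type, after)
```

By the time the handler runs, the value is an exception instance rather than the bare message string, and a traceback object is attached.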
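Catching exceptions in the bytecode loop
In the division implementation we saw the exception being recorded in the thread state. Once the instruction has run, how does the evaluation loop around that giant switch block deal with it?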
PyObject* _Py_HOT_FUNCTION _PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
for (;;) {
switch (opcode) {
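// the giant switch over opcodes
}
error:
assert(why == WHY_NOT);
why = WHY_EXCEPTION; // tell the VM that an exception occurred
// create the traceback object
PyTraceBack_Here(f);
if (tstate->c_tracefunc != NULL)
call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj,
tstate, f);
...
// the handler lookup is elided above; leave the loop if the exception is still pending
if (why != WHY_NOT)
break;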
}
}
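After leaving the switch, WHY_NOT means the VM is in a normal state and no error has occurred; once why is set to WHY_EXCEPTION, an exception was raised while executing bytecode. Having noticed the exception, the VM enters its exception-handling flow. This flow involves the linked list of PyFrameObject objects, which is why the printed exception information appears as a chain. It also involves a traceback object, which records information about the frame chain; the VM uses it to show the state of every frame in the chain at the time of the exception.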
[traceback.c]
int PyTraceBack_Here(PyFrameObject *frame)
{
PyObject *exc, *val, *tb, *newtb;
// fetch the exception state, including the traceback object, saved in the thread state
PyErr_Fetch(&exc, &val, &tb);
// create a new traceback object
newtb = (PyObject *)newtracebackobject((PyTracebackObject *)tb, frame);
if (newtb == NULL) {
_PyErr_ChainExceptions(exc, val, tb);
return -1;
}
PyErr_Restore(exc, val, newtb);
Py_XDECREF(tb);
return 0;
}
void PyErr_Fetch(PyObject **p_type, PyObject **p_value, PyObject **p_traceback)
{
PyThreadState *tstate = PyThreadState_GET();
*p_type = tstate->curexc_type;
*p_value = tstate->curexc_value;
*p_traceback = tstate->curexc_traceback;
tstate->curexc_type = NULL;
tstate->curexc_value = NULL;
tstate->curexc_traceback = NULL;
}
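The interplay of PyErr_Restore and PyErr_Fetch can be modelled in a few lines of Python. This is a toy sketch: ToyThreadState, err_restore, and err_fetch are hypothetical names invented for illustration, and reference counting is left out entirely.

```python
class ToyThreadState:
    """Hypothetical stand-in for the three curexc_* slots of PyThreadState."""

    def __init__(self):
        self.curexc_type = None
        self.curexc_value = None
        self.curexc_traceback = None


def err_restore(tstate, exc_type, value, traceback):
    """Like PyErr_Restore: overwrite the slots, dropping whatever was there."""
    tstate.curexc_type = exc_type
    tstate.curexc_value = value
    tstate.curexc_traceback = traceback


def err_fetch(tstate):
    """Like PyErr_Fetch: hand the triple to the caller and clear the slots."""
    triple = (tstate.curexc_type, tstate.curexc_value, tstate.curexc_traceback)
    tstate.curexc_type = tstate.curexc_value = tstate.curexc_traceback = None
    return triple


tstate = ToyThreadState()
err_restore(tstate, ZeroDivisionError, "division by zero", None)
fetched = err_fetch(tstate)
print(fetched, tstate.curexc_type)
```

Fetching clears the slots, which is why PyTraceBack_Here has to call PyErr_Restore again after building the new traceback object.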
PyTracebackObject is kept in the thread state object. The struct is defined as follows:
[traceback.h]
typedef struct _traceback {
PyObject_HEAD
struct _traceback *tb_next;
struct _frame *tb_frame;
int tb_lasti;
int tb_lineno;
} PyTracebackObject;
Its tb_next member shows that, like PyFrameObject, this object forms a linked list. The function that creates traceback objects confirms this:
[traceback.c]
static PyTracebackObject * newtracebackobject(PyTracebackObject *next, PyFrameObject *frame)
{
PyTracebackObject *tb;
if ((next != NULL && !PyTraceBack_Check(next)) ||
frame == NULL || !PyFrame_Check(frame)) {
PyErr_BadInternalCall();
return NULL;
}
// allocate the object
tb = PyObject_GC_New(PyTracebackObject, &PyTraceBack_Type);
if (tb != NULL) {
Py_XINCREF(next);
tb->tb_next = next; // build the linked list
Py_XINCREF(frame);
tb->tb_frame = frame;
tb->tb_lasti = frame->f_lasti;
tb->tb_lineno = PyFrame_GetLineNumber(frame);
PyObject_GC_Track(tb);
}
return tb;
}
next is exactly the traceback object fetched from the thread state earlier. tb->tb_lineno is the source line number where the exception occurred.
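The linked list is reachable from Python through the __traceback__ attribute of a caught exception. A minimal sketch that walks it (the functions inner, outer, and collect are made up for this example):

```python
def inner():
    return 1 / 0


def outer():
    return inner()


def collect():
    """Return the function name recorded by each traceback entry."""
    try:
        outer()
    except ZeroDivisionError as exc:
        names = []
        tb = exc.__traceback__
        while tb is not None:   # follow tb_next like a linked list
            names.append(tb.tb_frame.f_code.co_name)
            tb = tb.tb_next
        return names


chain = collect()
print(chain)   # ['collect', 'outer', 'inner']
```

The head of the list belongs to the frame that caught the exception and the tail to the frame that raised it, because each frame the exception passes through prepends a new entry whose tb_next is the existing list.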
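Once the VM has noticed the exception and created the traceback object, it looks for an except clause that can handle it. If none is found, the VM leaves the currently active frame and falls back along the frame chain to the previous frame.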
PyObject* _Py_HOT_FUNCTION _PyEval_EvalFrameDefault(PyFrameObject *f, int throwflag)
{
for (;;) {
switch (opcode) {
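// the giant switch over opcodes
}
error:
assert(why == WHY_NOT);
why = WHY_EXCEPTION; // tell the VM that an exception occurred
// create the traceback object
PyTraceBack_Here(f);
if (tstate->c_tracefunc != NULL)
call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj,
tstate, f);
...
// the handler lookup is elided above; leave the loop if the exception is still pending
if (why != WHY_NOT)
break;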
}
if (why != WHY_RETURN)
retval = NULL; // retval tells the previous frame that an exception occurred
exit_eval_frame:
if (PyDTrace_FUNCTION_RETURN_ENABLED())
dtrace_function_return(f);
Py_LeaveRecursiveCall();
f->f_executing = 0;
// make the previous frame the active frame in the thread state, completing the frame unwinding
tstate->frame = f->f_back;
return _Py_CheckFunctionResult(NULL, retval, "PyEval_EvalFrameEx");
}
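As the code shows, if the developer provides no handler at all, the frame's return value is set with retval = NULL, and resetting the active frame in the current thread state object completes the frame unwinding.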
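Seen from Python code, this unwinding means that every frame without a handler is abandoned at the point of the failing call. A small sketch (the level1 to level3 functions and the log list are invented for this example):

```python
log = []


def level3():
    log.append("level3 start")
    1 / 0
    log.append("level3 end")   # never reached


def level2():
    log.append("level2 start")
    level3()
    log.append("level2 end")   # never reached: this frame is abandoned


def level1():
    try:
        level2()
    except ZeroDivisionError:
        log.append("level1 handled")


level1()
print(log)   # ['level2 start', 'level3 start', 'level1 handled']
```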
Exception control-flow structure
What happens when the Python code does provide a handler?
try:
    raise Exception("this is an exception")
except Exception as e:
    print(e)
finally:
    print("finally code")
Its code object fields and bytecode:
co_consts : ('this is an exception', None, 'finally code')
co_names : ('Exception', 'e', 'print')
co_stacksize : 23
1 0 SETUP_FINALLY 60 (to 62)
2 SETUP_EXCEPT 12 (to 16)
2 4 LOAD_NAME 0 (Exception)
6 LOAD_CONST 0 ('this is an exception')
8 CALL_FUNCTION 1
10 RAISE_VARARGS 1
12 POP_BLOCK
14 JUMP_FORWARD 42 (to 58)
3 >> 16 DUP_TOP
18 LOAD_NAME 0 (Exception)
20 COMPARE_OP 10 (exception match)
22 POP_JUMP_IF_FALSE 56
24 POP_TOP
26 STORE_NAME 1 (e)
28 POP_TOP
30 SETUP_FINALLY 14 (to 46)
4 32 LOAD_NAME 2 (print)
34 LOAD_NAME 1 (e)
36 CALL_FUNCTION 1
38 POP_TOP
40 POP_BLOCK
42 POP_EXCEPT
44 LOAD_CONST 1 (None)
>> 46 LOAD_CONST 1 (None)
48 STORE_NAME 1 (e)
50 DELETE_NAME 1 (e)
52 END_FINALLY
54 JUMP_FORWARD 2 (to 58)
>> 56 END_FINALLY
>> 58 POP_BLOCK
60 LOAD_CONST 1 (None)
6 >> 62 LOAD_NAME 2 (print)
64 LOAD_CONST 2 ('finally code')
66 CALL_FUNCTION 1
68 POP_TOP
70 END_FINALLY
72 LOAD_CONST 1 (None)
74 RETURN_VALUE
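To inspect the same code object on another interpreter, the snippet can be compiled and passed to dis. This is a sketch only: the instruction set changes between CPython versions (SETUP_EXCEPT was removed in 3.8, and 3.11 and later use exception tables instead of block-setup instructions), so the listing will differ from the one above.

```python
import dis

source = '''
try:
    raise Exception("this is an exception")
except Exception as e:
    print(e)
finally:
    print("finally code")
'''

# Compile without executing, then look at the resulting code object.
code = compile(source, "<example>", "exec")
print(code.co_names)
print(code.co_stacksize)
dis.dis(code)
```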
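The first two instructions, SETUP_FINALLY and SETUP_EXCEPT, each set up a PyTryBlock and record the handler address to jump to. Exception("this is an exception") is evaluated by the CALL_FUNCTION instruction, which constructs an exception object (the construction itself is not covered here) and pushes it onto the runtime stack. The RAISE_VARARGS instruction then takes this exception object off the stack: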
TARGET(RAISE_VARARGS) {
PyObject *cause = NULL, *exc = NULL;
switch (oparg) {
case 2:
cause = POP(); /* cause */
case 1:
exc = POP(); /* exc */
case 0: /* Fallthrough */
if (do_raise(exc, cause)) {
why = WHY_EXCEPTION;
goto fast_block_end;
}
break;
default:
PyErr_SetString(PyExc_SystemError,
"bad RAISE_VARARGS oparg");
break;
}
goto error;
}
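The three oparg cases correspond to the three forms of the raise statement. A sketch of each form follows; the helper capture and the function names are invented for this example, and the mapping to oparg values is for the interpreter version discussed here.

```python
def reraise():
    try:
        1 / 0
    except ZeroDivisionError:
        raise   # oparg 0: re-raise the active exception


def raise_plain():
    raise ValueError("plain")   # oparg 1: only the exception


def raise_from():
    try:
        1 / 0
    except ZeroDivisionError as err:
        raise ValueError("chained") from err   # oparg 2: exception and cause


def capture(func):
    """Call func and return the exception it raises."""
    try:
        func()
    except Exception as exc:
        return exc


results = [capture(f) for f in (reraise, raise_plain, raise_from)]
for exc in results:
    print(type(exc).__name__, repr(exc.__cause__))
```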
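Here oparg is 1, so the exception object taken from the stack is assigned to exc, and do_raise is called to record it in the thread state. Either directly (when do_raise returns nonzero) or by way of the error label, why ends up as WHY_EXCEPTION and control reaches the fast_block_end label: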
fast_block_end:
/* Unwind stacks if a (pseudo) exception occurred */
while (why != WHY_NOT && f->f_iblock > 0) {
/* Peek at the current block. */
f->f_iblock--;
UNWIND_BLOCK(b);
if (why == WHY_EXCEPTION && (b->b_type == SETUP_EXCEPT
|| b->b_type == SETUP_FINALLY)) {
PyObject *exc, *val, *tb;
...
PyErr_Fetch(&exc, &val, &tb);
...
Py_INCREF(tb);
PUSH(tb);
PUSH(val);
PUSH(exc);
why = WHY_NOT;
JUMPTO(handler);
break;
}
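Under this label, the VM first takes the PyTryBlock from the current PyFrameObject. It then calls PyErr_Fetch to obtain the most recent exception and traceback stored in the current thread state object, pushes tb, val, and exc onto the runtime stack, and sets why back to WHY_NOT: everything the handler needs is in place, so the VM can continue executing normally. JUMPTO(handler) then hands control to the code the programmer wrote, here the except clause at offset 16. Had the try body finished without raising, execution would instead have reached the JUMP_FORWARD at offset 14: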
TARGET(JUMP_FORWARD) {
JUMPBY(oparg);
FAST_DISPATCH();
}
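The JUMP_FORWARD at offset 14 has argument 42, so it jumps to offset 58 (the offset of the following instruction, 16, plus 42) and skips the except clause. Inside the except clause, POP_JUMP_IF_FALSE at offset 22 jumps to offset 56 when the exception does not match. In that case the exception information has to be put back into the thread state and why reset, so that Python re-enters the exception state. END_FINALLY does this: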
PREDICTED(END_FINALLY);
TARGET(END_FINALLY) {
PyObject *status = POP();
if (PyLong_Check(status)) {
...
}
else if (PyExceptionClass_Check(status)) {
PyObject *exc = POP();
PyObject *tb = POP();
PyErr_Restore(status, exc, tb);
why = WHY_EXCEPTION;
goto fast_block_end;
}
else if (status != Py_None) {
...
}
Py_DECREF(status);
DISPATCH();
}
PyErr_Restore writes the exception information back into the thread state.
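The observable effect of this re-raise is that a non-matching except clause is skipped, the finally clause still runs, and the exception keeps propagating. A small sketch (run and log are names chosen for this example):

```python
log = []


def run():
    try:
        raise KeyError("missing")
    except ValueError:               # does not match KeyError
        log.append("except ran")
    finally:
        log.append("finally ran")    # still executed while unwinding


try:
    run()
except KeyError as exc:
    log.append("propagated " + type(exc).__name__)

print(log)   # ['finally ran', 'propagated KeyError']
```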
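On the paths where no exception is pending any more (the try body succeeded, or the except clause handled the exception), execution reaches POP_BLOCK at offset 58: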
PREDICTED(POP_BLOCK);
TARGET(POP_BLOCK) {
PyTryBlock *b = PyFrame_BlockPop(f);
UNWIND_BLOCK(b);
DISPATCH();
}
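The cooperation between why and the block stack can be sketched as a toy model in Python. Everything here is a simplification made for illustration: unwind is a hypothetical function, blocks are plain (type, handler) tuples, and the value-stack cleanup done by UNWIND_BLOCK is ignored.

```python
WHY_NOT, WHY_EXCEPTION = "WHY_NOT", "WHY_EXCEPTION"


def unwind(block_stack, why):
    """Pop blocks until one can take over the pending exception.

    Returns (why, handler): handler is the jump target of the block that
    took over, or None when the block stack ran out.
    """
    while why != WHY_NOT and block_stack:
        b_type, handler = block_stack.pop()
        if why == WHY_EXCEPTION and b_type in ("SETUP_EXCEPT", "SETUP_FINALLY"):
            return WHY_NOT, handler   # resume normal execution at handler
    return why, None                  # still failing: leave this frame


# Blocks pushed by the two SETUP_* instructions in the listing above.
blocks = [("SETUP_FINALLY", 62), ("SETUP_EXCEPT", 16)]
first = unwind(blocks, WHY_EXCEPTION)    # the except block is innermost
second = unwind(blocks, WHY_EXCEPTION)   # e.g. after a re-raise at END_FINALLY
third = unwind(blocks, WHY_EXCEPTION)    # nothing left to catch it
print(first, second, third)
```

The first call lands in the except clause, the second in the finally clause, and the third leaves why set to WHY_EXCEPTION, which is the case where the frame returns NULL to its caller.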
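Summary
To sum up, the two most important pieces of Python's exception handling are the VM state represented by why and the PyTryBlock objects stored in f_blockstack of the PyFrameObject. The variable why tells whether an exception is currently pending in the VM, while the PyTryBlock objects tell whether the programmer set up except and finally blocks for it. The VM handles exceptions through the combined action of why and PyTryBlock.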