One of the most important mantras of software development is “don’t repeat yourself.”
That is, any time you are faced with a problem of creating highly repetitive
code (or cutting and pasting source code), it often pays to look for a more elegant
solution. In Python, such problems are often solved under the category of “metaprogramming.” In
a nutshell, metaprogramming is about creating functions and classes whose main goal is to
manipulate code (e.g., modifying, generating, or wrapping existing code). The
main features for this include decorators, class decorators, and metaclasses. However,
a variety of other useful topics—including signature objects, execution of code with exec(),
and inspecting the internals of functions and classes—enter the picture. The main
purpose of this chapter is to explore various metaprogramming techniques and to give
examples of how they can be used to customize the behavior of Python to your own whims.
You want to put a wrapper layer around a function that adds extra processing (e.g., logging, timing, etc.).
If you ever need to wrap a function with extra code, define a decorator function. For example:
import time
from functools import wraps

def timethis(func):
    '''
    Decorator that reports the execution time.
    '''
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end - start)
        return result
    return wrapper
Here is an example of using the decorator:
>>> @timethis
... def countdown(n):
...     '''
...     Counts down
...     '''
...     while n > 0:
...         n -= 1
...
>>> countdown(100000)
countdown 0.008917808532714844
>>> countdown(10000000)
countdown 0.87188299392912
>>>
A decorator is a function that accepts a function as input and returns a new function as output. Whenever you write code like this:
@timethis
def countdown(n):
    ...
it’s the same as if you had performed these separate steps:
def countdown(n):
    ...

countdown = timethis(countdown)
As an aside, built-in decorators such as @staticmethod, @classmethod, and
@property work in the same way. For example, these two code fragments are equivalent:
class A:
    @classmethod
    def method(cls):
        pass

class B:
    # Equivalent definition of a class method
    def method(cls):
        pass
    method = classmethod(method)
The code inside a decorator typically involves creating a new function
that accepts any arguments using *args and **kwargs, as shown with
the wrapper() function in this recipe. Inside this function, you
place a call to the original input function and return its result.
However, you also place whatever extra code you want to add (e.g., timing). The
newly created function wrapper is returned as a result and takes the
place of the original function.
It’s critical to emphasize that decorators generally do not alter
the calling signature or return value of the function being wrapped.
The use of *args and **kwargs is there to make sure that any input
arguments can be accepted. The return value of a decorator is almost always
the result of calling func(*args, **kwargs), where func is the original
unwrapped function.
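In its most stripped-down form, the pattern described above looks like the following sketch (the names here are placeholders, not code from the recipe); the decorated function accepts the same arguments and returns the same result as the original:

```python
from functools import wraps

def passthrough(func):
    # A do-nothing wrapper: it accepts any arguments and passes the
    # result straight through, leaving the call behavior unchanged
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@passthrough
def add(x, y):
    return x + y

print(add(2, 3))   # 5
```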
When first learning about decorators, it is usually very easy to get
started with some simple examples, such as the one shown. However, if
you are going to write decorators for real, there are some subtle
details to consider. For example, the use of the @wraps(func) decorator in
the solution is an easy-to-forget but important technicality related
to preserving function metadata, which is described in the next
recipe. The recipes that follow fill in details that will be
important if you wish to write decorator functions
of your own.
You’ve written a decorator, but when you apply it to a function, important metadata such as the name, doc string, annotations, and calling signature are lost.
Whenever you define a decorator, you should always remember to apply the
@wraps decorator from the functools library to the underlying
wrapper function. For example:
import time
from functools import wraps

def timethis(func):
    '''
    Decorator that reports the execution time.
    '''
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end - start)
        return result
    return wrapper
Here is an example of using the decorator and examining the resulting function metadata:
>>> @timethis
... def countdown(n:int):
...     '''
...     Counts down
...     '''
...     while n > 0:
...         n -= 1
...
>>> countdown(100000)
countdown 0.008917808532714844
>>> countdown.__name__
'countdown'
>>> countdown.__doc__
'\n\tCounts down\n\t'
>>> countdown.__annotations__
{'n': <class 'int'>}
>>>
Copying decorator metadata is an important part of writing decorators.
If you forget to use @wraps, you’ll find that the decorated function
loses all sorts of useful information. For instance, if omitted,
the metadata in the last example would look like this:
>>> countdown.__name__
'wrapper'
>>> countdown.__doc__
>>> countdown.__annotations__
{}
>>>
An important feature of the @wraps decorator is that it makes the
wrapped function available to you in the __wrapped__ attribute.
For example, if you want to access the wrapped function directly,
you could do this:
>>> countdown.__wrapped__(100000)
>>>
The presence of the __wrapped__ attribute also makes decorated
functions properly expose the underlying signature of the wrapped
function. For example:
>>> from inspect import signature
>>> print(signature(countdown))
(n:int)
>>>
One common question that sometimes arises is how to make a
decorator that directly copies the calling signature of the original
function being wrapped (as opposed to using *args and **kwargs).
In general, this is difficult to implement without resorting to
some trick involving the generation of code strings and exec().
Frankly, you’re usually best off using @wraps and relying on
the fact that the underlying function signature can be propagated
by access to the underlying __wrapped__ attribute. See Recipe 9.16
for more information about signatures.
A decorator has been applied to a function, but you want to “undo” it, gaining access to the original unwrapped function.
Assuming that the decorator has been implemented properly using
@wraps (see Recipe 9.2), you can usually gain access to the
original function by accessing the __wrapped__ attribute. For
example:
>>> @somedecorator
... def add(x, y):
...     return x + y
...
>>> orig_add = add.__wrapped__
>>> orig_add(3, 4)
7
>>>
Gaining direct access to the unwrapped function behind a decorator
can be useful for debugging, introspection, and other operations
involving functions. However, this recipe only works if the
implementation of a decorator properly copies metadata using
@wraps from the functools module or sets the __wrapped__ attribute
directly.
If multiple decorators have been applied to a function, the behavior
of accessing __wrapped__ is currently undefined and should probably
be avoided. In Python 3.3, it bypasses all of the layers. For example, suppose you have code like this:
from functools import wraps

def decorator1(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('Decorator 1')
        return func(*args, **kwargs)
    return wrapper

def decorator2(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('Decorator 2')
        return func(*args, **kwargs)
    return wrapper

@decorator1
@decorator2
def add(x, y):
    return x + y
Here is what happens when you call the decorated function and
the original function through __wrapped__:
>>> add(2, 3)
Decorator 1
Decorator 2
5
>>> add.__wrapped__(2, 3)
5
>>>
However, this behavior has been reported as a bug (see http://bugs.python.org/issue17482) and may be changed to expose the proper decorator chain in a future release.
Last, but not least, be aware that not all decorators utilize @wraps,
and thus, they may not work as described. In particular, the built-in decorators @staticmethod and @classmethod create descriptor objects that
don’t follow this convention (instead, they store the original
function in a __func__ attribute). Your mileage may vary.
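As a quick illustration of the __func__ convention just mentioned, the original function behind @staticmethod and @classmethod can be reached through the descriptor stored in the class dictionary (a small sketch, not part of the recipe):

```python
class A:
    @staticmethod
    def spam():
        return 42

    @classmethod
    def grok(cls):
        return cls.__name__

# The descriptor objects sit in the class dictionary; the original
# functions are reachable through their __func__ attribute
print(A.__dict__['spam'].__func__())    # 42
print(A.__dict__['grok'].__func__(A))   # A
```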
Let’s illustrate the process of accepting arguments with an example. Suppose you want to write a decorator that adds logging to a function, but allows the user to specify the logging level and other details as arguments. Here is how you might define the decorator:
from functools import wraps
import logging

def logged(level, name=None, message=None):
    '''
    Add logging to a function. level is the logging
    level, name is the logger name, and message is the
    log message. If name and message aren't specified,
    they default to the function's module and name.
    '''
    def decorate(func):
        logname = name if name else func.__module__
        log = logging.getLogger(logname)
        logmsg = message if message else func.__name__

        @wraps(func)
        def wrapper(*args, **kwargs):
            log.log(level, logmsg)
            return func(*args, **kwargs)
        return wrapper
    return decorate

# Example use
@logged(logging.DEBUG)
def add(x, y):
    return x + y

@logged(logging.CRITICAL, 'example')
def spam():
    print('Spam!')
On first glance, the implementation looks tricky, but the idea is relatively
simple. The outermost function logged() accepts the desired arguments and simply
makes them available to the inner functions of the decorator. The inner function
decorate() accepts a function and puts a wrapper around it as normal. The
key part is that the wrapper is allowed to use the arguments passed to logged().
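The three-level structure just described can be sketched generically (all names here are placeholders, not part of the recipe):

```python
from functools import wraps

def decorator(x, y, z):               # outermost: receives the decorator arguments
    def decorate(func):               # middle: receives the function to wrap
        @wraps(func)
        def wrapper(*args, **kwargs): # innermost: runs on each call
            # x, y, and z remain visible here through the closure
            return func(*args, **kwargs)
        return wrapper
    return decorate

@decorator(1, 2, 3)
def f(a):
    return a * 2

print(f(10))   # 20
```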
Writing a decorator that takes arguments is tricky because of the underlying calling sequence involved. Specifically, if you have code like this:
@decorator(x, y, z)
def func(a, b):
    pass
The decoration process evaluates as follows:
def func(a, b):
    pass

func = decorator(x, y, z)(func)
Carefully observe that the result of decorator(x, y, z) must be a callable
which, in turn, takes a function as input and wraps it. See Recipe 9.7 for another example of a decorator taking arguments.
You want to write a decorator function that wraps a function, but has user adjustable attributes that can be used to control the behavior of the decorator at runtime.
Here is a solution that expands on the last recipe by introducing
accessor functions that change internal variables through the use
of nonlocal variable declarations. The accessor functions are
then attached to the wrapper function as function attributes.
from functools import wraps, partial
import logging

# Utility decorator to attach a function as an attribute of obj
def attach_wrapper(obj, func=None):
    if func is None:
        return partial(attach_wrapper, obj)
    setattr(obj, func.__name__, func)
    return func

def logged(level, name=None, message=None):
    '''
    Add logging to a function. level is the logging
    level, name is the logger name, and message is the
    log message. If name and message aren't specified,
    they default to the function's module and name.
    '''
    def decorate(func):
        logname = name if name else func.__module__
        log = logging.getLogger(logname)
        logmsg = message if message else func.__name__

        @wraps(func)
        def wrapper(*args, **kwargs):
            log.log(level, logmsg)
            return func(*args, **kwargs)

        # Attach setter functions
        @attach_wrapper(wrapper)
        def set_level(newlevel):
            nonlocal level
            level = newlevel

        @attach_wrapper(wrapper)
        def set_message(newmsg):
            nonlocal logmsg
            logmsg = newmsg

        return wrapper
    return decorate

# Example use
@logged(logging.DEBUG)
def add(x, y):
    return x + y

@logged(logging.CRITICAL, 'example')
def spam():
    print('Spam!')
Here is an interactive session that shows the various attributes being changed after definition:
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> add(2, 3)
DEBUG:__main__:add
5
>>> # Change the log message
>>> add.set_message('Add called')
>>> add(2, 3)
DEBUG:__main__:Add called
5
>>> # Change the log level
>>> add.set_level(logging.WARNING)
>>> add(2, 3)
WARNING:__main__:Add called
5
>>>
The key to this recipe lies in the accessor functions [e.g.,
set_message() and set_level()] that get attached to the wrapper as
attributes. Each of these accessors allows internal parameters to be
adjusted through the use of nonlocal assignments.
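The nonlocal mechanism that makes these accessors work can be seen in isolation in a small sketch (unrelated to the logging code):

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count      # rebind the variable in the enclosing scope
        count += 1
        return count
    return increment

c = make_counter()
print(c(), c(), c())   # 1 2 3
```

Without the nonlocal declaration, the assignment inside increment() would create a new local variable instead of updating the enclosing one.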
An amazing feature of this recipe is that the accessor functions
will propagate through multiple levels of decoration (if all
of your decorators utilize @functools.wraps). For example,
suppose you introduced an additional decorator, such as the @timethis
decorator from Recipe 9.2, and wrote code like this:
@timethis
@logged(logging.DEBUG)
def countdown(n):
    while n > 0:
        n -= 1
You’ll find that the accessor methods still work:
>>> countdown(10000000)
DEBUG:__main__:countdown
countdown 0.8198461532592773
>>> countdown.set_level(logging.WARNING)
>>> countdown.set_message("Counting down to zero")
>>> countdown(10000000)
WARNING:__main__:Counting down to zero
countdown 0.8225970268249512
>>>
You’ll also find that it all still works exactly the same way if the decorators are composed in the opposite order, like this:
@logged(logging.DEBUG)
@timethis
def countdown(n):
    while n > 0:
        n -= 1
Although it’s not shown, accessor functions to return the value of various settings could also be written just as easily by adding extra code such as this:
...
@attach_wrapper(wrapper)
def get_level():
    return level

# Alternative
wrapper.get_level = lambda: level
...
One extremely subtle facet of this recipe is the choice to use accessor functions in the first place. For example, you might consider an alternative formulation solely based on direct access to function attributes like this:
...
@wraps(func)
def wrapper(*args, **kwargs):
    wrapper.log.log(wrapper.level, wrapper.logmsg)
    return func(*args, **kwargs)

# Attach adjustable attributes
wrapper.level = level
wrapper.logmsg = logmsg
wrapper.log = log
...
This approach would work to a point, but only if it were the topmost decorator.
If you had another decorator applied on top (such as
the @timethis example), it would shadow the underlying attributes and make
them unavailable for modification. The use of accessor functions avoids
this limitation.
Last, but not least, the solution shown in this recipe might serve as an alternative to decorators defined as classes, as shown in Recipe 9.9.
You would like to write a single decorator that can be used without
arguments, such as @decorator, or with optional arguments,
such as @decorator(x,y,z). However, there seems to be
no straightforward way to do it due to differences in
calling conventions between simple decorators and decorators
taking arguments.
Here is a variant of the logging code shown in Recipe 9.5 that defines such a decorator:
from functools import wraps, partial
import logging

def logged(func=None, *, level=logging.DEBUG, name=None, message=None):
    if func is None:
        return partial(logged, level=level, name=name, message=message)

    logname = name if name else func.__module__
    log = logging.getLogger(logname)
    logmsg = message if message else func.__name__

    @wraps(func)
    def wrapper(*args, **kwargs):
        log.log(level, logmsg)
        return func(*args, **kwargs)
    return wrapper

# Example use
@logged
def add(x, y):
    return x + y

@logged(level=logging.CRITICAL, name='example')
def spam():
    print('Spam!')
As you can see from the example, the decorator can be used in both
a simple form (i.e., @logged) or with optional arguments supplied (i.e.,
@logged(level=logging.CRITICAL, name='example')).
The problem addressed by this recipe is really one of programming consistency. When using decorators, most programmers are used to applying them without any arguments at all or with arguments, as shown in the example. Technically speaking, a decorator where all arguments are optional could be applied, like this:
@logged()
def add(x, y):
    return x + y
However, this is not a form that’s especially common, and might lead to common usage errors if programmers forget to add the extra parentheses. The recipe simply makes the decorator work with or without parentheses in a consistent way.
To understand how the code works, you need to have a firm understanding of how decorators get applied to functions and their calling conventions. For a simple decorator such as this:
# Example use
@logged
def add(x, y):
    return x + y
The calling sequence is as follows:
def add(x, y):
    return x + y

add = logged(add)
In this case, the function to be wrapped is simply passed to logged as
the first argument. Thus, in the solution, the first argument of logged()
is the function being wrapped. All of the other arguments must have default
values.
For a decorator taking arguments such as this:
@logged(level=logging.CRITICAL, name='example')
def spam():
    print('Spam!')
The calling sequence is as follows:
def spam():
    print('Spam!')

spam = logged(level=logging.CRITICAL, name='example')(spam)
On the initial invocation of logged(), the function to be wrapped is
not passed. Thus, in the decorator, it has to be optional. This, in
turn, forces the other arguments to be specified by keyword.
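The bare * in the argument list is what enforces the keyword-only requirement; here is a tiny illustration using a hypothetical stub (not the recipe's actual code):

```python
def logged_stub(func=None, *, level=0, name=None):
    # Arguments after the bare * can only be supplied by keyword
    return (func, level, name)

print(logged_stub(level=10))       # (None, 10, None)

try:
    logged_stub(None, 10)          # positional use of level is rejected
except TypeError as e:
    print('TypeError:', e)
```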
Furthermore, when arguments are passed, a decorator is supposed to
return a function that accepts the function and wraps it (see Recipe 9.5). To do this, the solution uses a clever trick involving
functools.partial. Specifically, it simply returns a partially
applied version of itself where all arguments are fixed except for the
function to be wrapped. See Recipe 7.8 for more details about using partial().
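If partial() is unfamiliar, here is a small standalone illustration (the power() function is just an example, unrelated to the recipe); it fixes some arguments of a function and returns a new callable waiting for the rest:

```python
from functools import partial

def power(base, exp):
    return base ** exp

# Fix exp=2, leaving base still to be supplied
square = partial(power, exp=2)

print(square(5))   # 25
print(square(9))   # 81
```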
You want to optionally enforce type checking of function arguments as a kind of assertion or contract.
The aim of this recipe is to have a means of enforcing type contracts on the input arguments to a function. Before showing the solution code, here is a short example that illustrates the idea:
>>> @typeassert(int, int)
... def add(x, y):
...     return x + y
...
>>> add(2, 3)
5
>>> add(2, 'hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "contract.py", line 33, in wrapper
TypeError: Argument y must be <class 'int'>
>>>
Now, here is an implementation of the @typeassert decorator:
from inspect import signature
from functools import wraps

def typeassert(*ty_args, **ty_kwargs):
    def decorate(func):
        # If in optimized mode, disable type checking
        if not __debug__:
            return func

        # Map function argument names to supplied types
        sig = signature(func)
        bound_types = sig.bind_partial(*ty_args, **ty_kwargs).arguments

        @wraps(func)
        def wrapper(*args, **kwargs):
            bound_values = sig.bind(*args, **kwargs)
            # Enforce type assertions across supplied arguments
            for name, value in bound_values.arguments.items():
                if name in bound_types:
                    if not isinstance(value, bound_types[name]):
                        raise TypeError(
                            'Argument {} must be {}'.format(name, bound_types[name])
                            )
            return func(*args, **kwargs)
        return wrapper
    return decorate
You will find that this decorator is rather flexible, allowing types to be specified for all or a subset of a function’s arguments. Moreover, types can be specified by position or by keyword. Here is an example:
>>> @typeassert(int, z=int)
... def spam(x, y, z=42):
...     print(x, y, z)
...
>>> spam(1, 2, 3)
1 2 3
>>> spam(1, 'hello', 3)
1 hello 3
>>> spam(1, 'hello', 'world')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "contract.py", line 33, in wrapper
TypeError: Argument z must be <class 'int'>
>>>
This recipe is an advanced decorator example that introduces a number of important and useful concepts.
First, one aspect of decorators is that they only get applied once,
at the time of function definition. In certain cases, you may want to
disable the functionality added by a decorator. To do this, simply have
your decorator function return the function unwrapped. In the solution,
the following code fragment returns the function unmodified if the value
of the global __debug__ variable is set to False (as is the case
when Python executes in optimized mode with the -O or -OO options
to the interpreter):
...
def decorate(func):
    # If in optimized mode, disable type checking
    if not __debug__:
        return func
...
Next, a tricky part of writing this decorator is that it involves examining and
working with the argument signature of the function being wrapped. Your tool of choice here should be the
inspect.signature() function. Simply stated,
it allows you to extract signature information from a callable. For example:
>>> from inspect import signature
>>> def spam(x, y, z=42):
...     pass
...
>>> sig = signature(spam)
>>> print(sig)
(x, y, z=42)
>>> sig.parameters
mappingproxy(OrderedDict([('x', <Parameter at 0x10077a050 'x'>),
('y', <Parameter at 0x10077a158 'y'>), ('z', <Parameter at 0x10077a1b0 'z'>)]))
>>> sig.parameters['z'].name
'z'
>>> sig.parameters['z'].default
42
>>> sig.parameters['z'].kind
<_ParameterKind: 'POSITIONAL_OR_KEYWORD'>
>>>
In the first part of our decorator, we use the bind_partial() method
of signatures to perform a partial binding of the supplied types to
argument names. Here is an example of what happens:
>>> bound_types = sig.bind_partial(int, z=int)
>>> bound_types
<inspect.BoundArguments object at 0x10069bb50>
>>> bound_types.arguments
OrderedDict([('x', <class 'int'>), ('z', <class 'int'>)])
>>>
In this partial binding, you will notice that missing arguments
are simply ignored (i.e., there is no binding for argument y).
However, the most important part of the binding is the creation
of the ordered dictionary bound_types.arguments. This dictionary
maps the argument names to the supplied values in the same
order as the function signature. In the case of our decorator,
this mapping contains the type assertions that we’re going to enforce.
In the actual wrapper function made by the decorator, the sig.bind()
method is used. bind() is like bind_partial() except that it does
not allow for missing arguments. So, here is what happens:
>>> bound_values = sig.bind(1, 2, 3)
>>> bound_values.arguments
OrderedDict([('x', 1), ('y', 2), ('z', 3)])
>>>
Using this mapping, it is relatively easy to enforce the required assertions.
>>> for name, value in bound_values.arguments.items():
...     if name in bound_types.arguments:
...         if not isinstance(value, bound_types.arguments[name]):
...             raise TypeError()
...
>>>
A somewhat subtle aspect of the solution is that the assertions do not
get applied to unsupplied arguments with default values.
For example, this code works, even though the default value of items is
of the “wrong” type:
>>> @typeassert(int, list)
... def bar(x, items=None):
...     if items is None:
...         items = []
...     items.append(x)
...     return items
...
>>> bar(2)
[2]
>>> bar(2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "contract.py", line 33, in wrapper
TypeError: Argument items must be <class 'list'>
>>> bar(4, [1, 2, 3])
[1, 2, 3, 4]
>>>
A final point of design discussion might be the use of decorator arguments versus function annotations. For example, why not write the decorator to look at annotations like this?
@typeassert
def spam(x:int, y, z:int = 42):
    print(x, y, z)
One possible reason for not using annotations is that each argument to a function
can only have a single annotation assigned. Thus, if the annotations are used
for type assertions, they can’t really be used for anything else. Likewise, the
@typeassert decorator won’t work with functions that use annotations for a
different purpose. By using decorator arguments, as shown in the solution,
the decorator becomes a lot more general purpose and can be used with any function
whatsoever—even functions that use annotations.
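For comparison, here is a rough sketch of what such an annotation-driven variant might look like; this is not the recipe's solution, merely an illustration of the trade-off being discussed (the name typeassert_ann is hypothetical):

```python
from inspect import signature
from functools import wraps

def typeassert_ann(func):
    # Hypothetical variant: pull the expected types out of the annotations
    sig = signature(func)
    types = {name: p.annotation
             for name, p in sig.parameters.items()
             if p.annotation is not p.empty}

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            if name in types and not isinstance(value, types[name]):
                raise TypeError('Argument {} must be {}'.format(name, types[name]))
        return func(*args, **kwargs)
    return wrapper
```

Such a decorator consumes the annotations entirely for type checking, which is precisely why the recipe opts for decorator arguments instead.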
More information about function signature objects can be found in PEP 362, as well as the documentation for the inspect module. Recipe 9.16 also
has an additional example.
You want to define a decorator inside a class definition and apply it to other functions or methods.
Defining a decorator inside a class is straightforward, but you first need to sort out the manner in which the decorator will be applied. Specifically, you must decide whether it will be applied as an instance method or as a class method. Here is an example that illustrates the difference:
from functools import wraps

class A:
    # Decorator as an instance method
    def decorator1(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print('Decorator 1')
            return func(*args, **kwargs)
        return wrapper

    # Decorator as a class method
    @classmethod
    def decorator2(cls, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print('Decorator 2')
            return func(*args, **kwargs)
        return wrapper
Here is an example of how the two decorators would be applied:
# As an instance method
a = A()

@a.decorator1
def spam():
    pass

# As a class method
@A.decorator2
def grok():
    pass
If you look carefully, you’ll notice that one is applied from an instance a and the other
is applied from the class A.
Defining decorators in a class might look odd at first glance, but there
are examples of this in the standard library. In particular, the
built-in @property decorator is actually a class with getter(),
setter(), and deleter() methods that each act as a decorator.
For example:
class Person:
    # Create a property instance
    first_name = property()

    # Apply decorator methods
    @first_name.getter
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value
The key reason why it’s defined in this way is that the various
decorator methods are manipulating state on the associated property
instance. So, if you ever had a problem where decorators needed to
record or combine information behind the scenes, it’s a sensible
approach.
A common confusion when writing decorators in classes is getting
tripped up by the proper use of the extra self or cls arguments in the
decorator code itself. Although the outermost decorator function,
such as decorator1() or decorator2(), needs to provide a self or
cls argument (since they’re part of a class), the wrapper function
created inside doesn’t generally need to include an extra argument.
This is why the wrapper() function created in both decorators doesn’t
include a self argument. The only time you would ever need this
argument is in situations where you actually needed to access parts
of an instance in the wrapper. Otherwise, you just don’t have to worry
about it.
A final subtle facet of having decorators defined in a class concerns
their potential use with inheritance. For example, suppose you want to apply
one of the decorators defined in class A to methods defined in a
subclass B. To do that, you would need to write code like this:
class B(A):
    @A.decorator2
    def bar(self):
        pass
In particular, the decorator in question has to be defined as a class
method and you have to explicitly use the name of the superclass A
when applying it. You can’t use a name such as @B.decorator2, because
at the time of method definition, class B has not yet been
created.
You want to wrap functions with a decorator, but the result is going to be a callable instance. You need your decorator to work both inside and outside class definitions.
To define a decorator as an instance, you need to make sure it implements
the __call__() and __get__() methods. For example, this code defines a class that puts
a simple profiling layer around another function:
import types
from functools import wraps

class Profiled:
    def __init__(self, func):
        wraps(func)(self)
        self.ncalls = 0

    def __call__(self, *args, **kwargs):
        self.ncalls += 1
        return self.__wrapped__(*args, **kwargs)

    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return types.MethodType(self, instance)
To use this class, you use it like a normal decorator, either inside or outside of a class:
@Profiled
def add(x, y):
    return x + y

class Spam:
    @Profiled
    def bar(self, x):
        print(self, x)
Here is an interactive session that shows how these functions work:
>>> add(2, 3)
5
>>> add(4, 5)
9
>>> add.ncalls
2
>>> s = Spam()
>>> s.bar(1)
<__main__.Spam object at 0x10069e9d0> 1
>>> s.bar(2)
<__main__.Spam object at 0x10069e9d0> 2
>>> s.bar(3)
<__main__.Spam object at 0x10069e9d0> 3
>>> Spam.bar.ncalls
3
Defining a decorator as a class is usually straightforward. However, there are some rather subtle details that deserve more explanation, especially if you plan to apply the decorator to instance methods.
First, the use of the functools.wraps() function serves the same
purpose here as it does in normal decorators—namely to copy important
metadata from the wrapped function to the callable instance.
Second, it is common
to overlook the __get__() method shown in the solution. If you omit
the __get__() and keep all of the other code the same, you’ll
find that bizarre things happen when you try to invoke decorated
instance methods. For example:
>>> s = Spam()
>>> s.bar(3)
Traceback (most recent call last):
...
TypeError: bar() missing 1 required positional argument: 'x'
The reason it breaks is that whenever functions implementing methods
are looked up in a class, their __get__() method is invoked as part
of the descriptor protocol, which is described in Recipe 8.9. In
this case, the purpose of __get__() is to create a bound method
object (which ultimately supplies the self argument to the method).
Here is an example that illustrates the underlying mechanics:
>>> s = Spam()
>>> def grok(self, x):
...     pass
...
>>> grok.__get__(s, Spam)
<bound method Spam.grok of <__main__.Spam object at 0x100671e90>>
>>>
In this recipe, the __get__() method is there to make sure bound method
objects get created properly. types.MethodType() creates a bound method
manually for use here. Bound methods only get created if an instance is
being used. If the method is accessed on a class, the instance argument to
__get__() is set to None and the Profiled instance itself
is just returned. This makes it possible for someone to extract its ncalls attribute,
as shown.
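The manual creation of a bound method with types.MethodType() can be seen on its own in a small sketch (the Widget class here is just a stand-in):

```python
import types

class Widget:
    pass

def grok(self, x):
    return (self, x)

w = Widget()
bound = types.MethodType(grok, w)   # manually bind grok to the instance w
result = bound(10)                  # self is supplied automatically
print(result[0] is w, result[1])    # True 10
```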
If you want to avoid some of this mess, you might consider
an alternative formulation of the decorator using closures and nonlocal
variables, as described in Recipe 9.5. For example:
from functools import wraps

def profiled(func):
    ncalls = 0
    @wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal ncalls
        ncalls += 1
        return func(*args, **kwargs)
    wrapper.ncalls = lambda: ncalls
    return wrapper

# Example
@profiled
def add(x, y):
    return x + y
This example works in almost exactly the same way except that
access to ncalls is now provided through a function attached as
a function attribute. For example:
>>> add(2, 3)
5
>>> add(4, 5)
9
>>> add.ncalls()
2
>>>
Applying decorators to class and static methods is straightforward, but make
sure that your decorators are applied before @classmethod or @staticmethod.
For example:
import time
from functools import wraps

# A simple decorator
def timethis(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        r = func(*args, **kwargs)
        end = time.time()
        print(end - start)
        return r
    return wrapper

# Class illustrating application of the decorator to different kinds of methods
class Spam:
    @timethis
    def instance_method(self, n):
        print(self, n)
        while n > 0:
            n -= 1

    @classmethod
    @timethis
    def class_method(cls, n):
        print(cls, n)
        while n > 0:
            n -= 1

    @staticmethod
    @timethis
    def static_method(n):
        print(n)
        while n > 0:
            n -= 1
The resulting class and static methods should operate normally, but have the extra timing:
>>> s = Spam()
>>> s.instance_method(1000000)
<__main__.Spam object at 0x1006a6050> 1000000
0.11817407608032227
>>> Spam.class_method(1000000)
<class '__main__.Spam'> 1000000
0.11334395408630371
>>> Spam.static_method(1000000)
1000000
0.11740279197692871
>>>
If you get the order of decorators wrong, you’ll get an error. For example, if you use the following:
class Spam:
    ...
    @timethis
    @staticmethod
    def static_method(n):
        print(n)
        while n > 0:
            n -= 1
Then the static method will crash:
>>> Spam.static_method(1000000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "timethis.py", line 6, in wrapper
    start = time.time()
TypeError: 'staticmethod' object is not callable
>>>
The problem here is that @classmethod and @staticmethod don’t
actually create objects that are directly callable. Instead, they create special
descriptor objects, as described in Recipe 8.9. Thus, if you try to use them like
functions in another decorator, the decorator will crash. Making
sure that these decorators appear first in the decorator list
fixes the problem.
One situation where this recipe is of critical importance is in defining class and static methods in abstract base classes, as described in Recipe 8.12. For example, if you want to define an abstract class method, you can use this code:
from abc import ABCMeta, abstractmethod

class A(metaclass=ABCMeta):
    @classmethod
    @abstractmethod
    def method(cls):
        pass
In this code, the order of @classmethod and @abstractmethod matters. If you flip
the two decorators around, everything breaks.
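As a quick sketch of the correct ordering in action (the class names here are illustrative): with @classmethod applied on top of @abstractmethod, the abstractness is preserved, so a subclass must supply the class method before it can be instantiated.

```python
from abc import ABCMeta, abstractmethod

class A(metaclass=ABCMeta):
    @classmethod
    @abstractmethod
    def method(cls):
        pass

class B(A):
    @classmethod
    def method(cls):
        return 'implemented'

print(B.method())   # implemented
B()                 # fine: B is concrete
# A() would raise TypeError because method is still abstract
```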
You want to write a decorator that adds an extra argument to the calling signature of the wrapped function. However, the added argument can’t interfere with the existing calling conventions of the function.
Extra arguments can be injected into the calling signature using keyword-only arguments. Consider the following decorator:
from functools import wraps

def optional_debug(func):
    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper
Here is an example of how the decorator works:
>>> @optional_debug
... def spam(a, b, c):
...     print(a, b, c)
...
>>> spam(1, 2, 3)
1 2 3
>>> spam(1, 2, 3, debug=True)
Calling spam
1 2 3
>>>
Adding arguments to the signature of wrapped functions is not the most common example of using decorators. However, it might be a useful technique in avoiding certain kinds of code replication patterns. For example, if you have code like this:
def a(x, debug=False):
    if debug:
        print('Calling a')
    ...

def b(x, y, z, debug=False):
    if debug:
        print('Calling b')
    ...

def c(x, y, debug=False):
    if debug:
        print('Calling c')
    ...
You can refactor it into the following:
@optional_debug
def a(x):
    ...

@optional_debug
def b(x, y, z):
    ...

@optional_debug
def c(x, y):
    ...
The implementation of this recipe relies on the fact that keyword-only
arguments are easy to add to functions that also accept *args and
**kwargs parameters. By using a keyword-only argument, it gets
singled out as a special case and removed from subsequent calls that
only use the remaining positional and keyword arguments.
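To see this mechanism in isolation, here is a minimal sketch (the show() function is purely illustrative). The keyword-only argument is captured by the wrapper's own parameter and never lands in *args or **kwargs:

```python
def show(*args, debug=False, **kwargs):
    # 'debug' is consumed by the keyword-only parameter; args and kwargs
    # receive only the remaining arguments
    return args, kwargs

print(show(1, 2, debug=True, color='red'))    # ((1, 2), {'color': 'red'})
```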
One tricky part here concerns a potential name clash between
the added argument and the arguments of the function being wrapped.
For example, if the @optional_debug decorator was applied to a
function that already had a debug argument, then it would break.
If that’s a concern, an extra check could be added:
from functools import wraps
import inspect

def optional_debug(func):
    if 'debug' in inspect.signature(func).parameters:
        raise TypeError('debug argument already defined')

    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper
A final refinement to this recipe concerns the proper management of function signatures. An astute programmer will realize that the signature of wrapped functions is wrong. For example:
>>> @optional_debug
... def add(x, y):
...     return x + y
...
>>> import inspect
>>> print(inspect.signature(add))
(x, y)
>>>
This can be fixed by making the following modification:
from functools import wraps
import inspect

def optional_debug(func):
    if 'debug' in inspect.signature(func).parameters:
        raise TypeError('debug argument already defined')

    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)

    sig = inspect.signature(func)
    parms = list(sig.parameters.values())
    parms.append(inspect.Parameter('debug',
                                   inspect.Parameter.KEYWORD_ONLY,
                                   default=False))
    wrapper.__signature__ = sig.replace(parameters=parms)
    return wrapper
With this change, the signature of the wrapper will now correctly
reflect the presence of the debug argument. For example:
>>> @optional_debug
... def add(x, y):
...     return x + y
...
>>> print(inspect.signature(add))
(x, y, *, debug=False)
>>> add(2, 3)
5
>>>
See Recipe 9.16 for more information about function signatures.
You want to inspect or rewrite portions of a class definition to alter its behavior, but without using inheritance or metaclasses.
This might be a perfect use for a class decorator. For
example, here is a class decorator that rewrites the
__getattribute__ special method to perform logging.
def log_getattribute(cls):
    # Get the original implementation
    orig_getattribute = cls.__getattribute__

    # Make a new definition
    def new_getattribute(self, name):
        print('getting:', name)
        return orig_getattribute(self, name)

    # Attach to the class and return
    cls.__getattribute__ = new_getattribute
    return cls

# Example use
@log_getattribute
class A:
    def __init__(self, x):
        self.x = x
    def spam(self):
        pass
Here is what happens if you try to use the class in the solution:
>>> a = A(42)
>>> a.x
getting: x
42
>>> a.spam()
getting: spam
>>>
Class decorators can often be used as a straightforward alternative to other more advanced techniques involving mixins or metaclasses. For example, an alternative implementation of the solution might involve inheritance, as in the following:
class LoggedGetattribute:
    def __getattribute__(self, name):
        print('getting:', name)
        return super().__getattribute__(name)

# Example:
class A(LoggedGetattribute):
    def __init__(self, x):
        self.x = x
    def spam(self):
        pass
This works, but to understand it, you have to have some awareness of
the method resolution order, super(), and other aspects of
inheritance, as described in Recipe 8.7. In some sense, the class
decorator solution is much more direct in how it operates, and it
doesn’t introduce new dependencies into the inheritance hierarchy. As
it turns out, it’s also just a bit faster, due to not relying on the
super() function.
If you are applying multiple class decorators to a class, the application order might matter. For example, a decorator that replaces a method with an entirely new implementation would probably need to be applied before a decorator that simply wraps an existing method with some extra logic.
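As an illustrative sketch of this ordering issue (the decorator names here are invented), consider one class decorator that replaces spam() outright and another that merely wraps whatever spam() currently exists:

```python
def replace_spam(cls):
    # Installs a brand-new implementation of spam()
    cls.spam = lambda self: 'new spam'
    return cls

def log_spam(cls):
    # Wraps whatever spam() currently exists with logging
    orig = cls.spam
    def spam(self):
        print('calling spam')
        return orig(self)
    cls.spam = spam
    return cls

@log_spam          # Applied second: wraps the replacement
@replace_spam      # Applied first: installs the implementation
class A:
    def spam(self):
        return 'old spam'

a = A()
print(a.spam())    # prints 'calling spam', then 'new spam'
```

Flipping the two decorators would wrap the original spam() first and then discard that wrapper when the replacement is installed, silently losing the logging.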
See Recipe 8.13 for another example of class decorators in action.
You want to change the way in which instances are created in order to implement singletons, caching, or other similar features.
As Python programmers know, if you define a class, you call it like a function to create instances. For example:
class Spam:
    def __init__(self, name):
        self.name = name

a = Spam('Guido')
b = Spam('Diana')
If you want to customize this step, you can do it by defining a
metaclass and reimplementing its __call__() method in some way.
To illustrate, suppose that you didn’t want anyone creating instances
at all:
class NoInstances(type):
    def __call__(self, *args, **kwargs):
        raise TypeError("Can't instantiate directly")

# Example
class Spam(metaclass=NoInstances):
    @staticmethod
    def grok(x):
        print('Spam.grok')
In this case, users can call the defined static method, but it’s impossible to create an instance in the normal way. For example:
>>> Spam.grok(42)
Spam.grok
>>> s = Spam()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "example1.py", line 7, in __call__
    raise TypeError("Can't instantiate directly")
TypeError: Can't instantiate directly
>>>
Now, suppose you want to implement the singleton pattern (i.e., a class where only one instance is ever created). That is also relatively straightforward, as shown here:
class Singleton(type):
    def __init__(self, *args, **kwargs):
        self.__instance = None
        super().__init__(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super().__call__(*args, **kwargs)
        return self.__instance

# Example
class Spam(metaclass=Singleton):
    def __init__(self):
        print('Creating Spam')
In this case, only one instance ever gets created. For example:
>>> a = Spam()
Creating Spam
>>> b = Spam()
>>> a is b
True
>>> c = Spam()
>>> a is c
True
>>>
Finally, suppose you want to create cached instances, as described in Recipe 8.25. Here’s a metaclass that implements it:
import weakref

class Cached(type):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__cache = weakref.WeakValueDictionary()

    def __call__(self, *args):
        if args in self.__cache:
            return self.__cache[args]
        obj = super().__call__(*args)
        self.__cache[args] = obj
        return obj

# Example
class Spam(metaclass=Cached):
    def __init__(self, name):
        print('Creating Spam({!r})'.format(name))
        self.name = name
Here’s an example showing the behavior of this class:
>>> a = Spam('Guido')
Creating Spam('Guido')
>>> b = Spam('Diana')
Creating Spam('Diana')
>>> c = Spam('Guido')       # Cached
>>> a is b
False
>>> a is c                  # Cached value returned
True
>>>
Using a metaclass to implement various instance creation patterns can often be a much more elegant approach than other solutions not involving metaclasses. For example, if you didn’t use a metaclass, you might have to hide the classes behind some kind of extra factory function. For example, to get a singleton, you might use a hack such as the following:
class _Spam:
    def __init__(self):
        print('Creating Spam')

_spam_instance = None

def Spam():
    global _spam_instance
    if _spam_instance is not None:
        return _spam_instance
    _spam_instance = _Spam()
    return _spam_instance
Although the solution involving metaclasses involves a much more advanced concept, the resulting code feels cleaner and less hacked together.
See Recipe 8.25 for more information on creating cached instances, weak references, and other details.
You want to automatically record the order in which attributes and methods are defined inside a class body so that you can use it in various operations (e.g., serializing, mapping to databases, etc.).
Capturing information about the body of a class definition
is easily accomplished through the use of a metaclass. Here is
an example of a metaclass that uses an OrderedDict to capture
definition order of descriptors:
from collections import OrderedDict

# A set of descriptors for various types
class Typed:
    _expected_type = type(None)
    def __init__(self, name=None):
        self._name = name

    def __set__(self, instance, value):
        if not isinstance(value, self._expected_type):
            raise TypeError('Expected ' + str(self._expected_type))
        instance.__dict__[self._name] = value

class Integer(Typed):
    _expected_type = int

class Float(Typed):
    _expected_type = float

class String(Typed):
    _expected_type = str

# Metaclass that uses an OrderedDict for class body
class OrderedMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        d = dict(clsdict)
        order = []
        for name, value in clsdict.items():
            if isinstance(value, Typed):
                value._name = name
                order.append(name)
        d['_order'] = order
        return type.__new__(cls, clsname, bases, d)

    @classmethod
    def __prepare__(cls, clsname, bases):
        return OrderedDict()
In this metaclass, the definition order of descriptors is captured by using
an OrderedDict during the execution of the class body. The resulting order
of names is then extracted from the dictionary and stored into a class attribute
_order. This can then be used by methods of the class
in various ways. For example, here is a simple class that
uses the ordering to implement a method for serializing the instance
data as a line of CSV data:
class Structure(metaclass=OrderedMeta):
    def as_csv(self):
        return ','.join(str(getattr(self, name)) for name in self._order)

# Example use
class Stock(Structure):
    name = String()
    shares = Integer()
    price = Float()

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price
Here is an interactive session illustrating the use of the
Stock class in the example:
>>> s = Stock('GOOG', 100, 490.1)
>>> s.name
'GOOG'
>>> s.as_csv()
'GOOG,100,490.1'
>>> t = Stock('AAPL', 'a lot', 610.23)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "dupmethod.py", line 34, in __init__
TypeError: Expected <class 'int'>
>>>
The entire key to this recipe is the __prepare__() method,
which is defined in the OrderedMeta metaclass. This method is invoked
immediately at the start of a class definition with the
class name and base classes. It must then return a mapping
object to use when processing the class body. By returning
an OrderedDict instead of a normal dictionary, the resulting
definition order is easily captured.
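As an aside, class bodies in Python 3.6 and later already execute in an insertion-ordered namespace, so definition order can often be captured without defining __prepare__() at all. A minimal sketch (the Point class is just for illustration):

```python
class OrderedMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        d = dict(clsdict)
        # In Python 3.6+, clsdict already preserves definition order
        d['_order'] = [name for name in clsdict if not name.startswith('_')]
        return super().__new__(cls, clsname, bases, d)

class Point(metaclass=OrderedMeta):
    x = 0
    y = 0
    def move(self):
        pass

print(Point._order)    # ['x', 'y', 'move']
```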
It is possible to extend this functionality even further if you are willing to make your own dictionary-like objects. For example, consider this variant of the solution that rejects duplicate definitions:
from collections import OrderedDict

class NoDupOrderedDict(OrderedDict):
    def __init__(self, clsname):
        self.clsname = clsname
        super().__init__()

    def __setitem__(self, name, value):
        if name in self:
            raise TypeError('{} already defined in {}'.format(name, self.clsname))
        super().__setitem__(name, value)

class OrderedMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        d = dict(clsdict)
        d['_order'] = [name for name in clsdict if name[0] != '_']
        return type.__new__(cls, clsname, bases, d)

    @classmethod
    def __prepare__(cls, clsname, bases):
        return NoDupOrderedDict(clsname)
Here’s what happens if you use this metaclass and make a class with duplicate entries:
>>> class A(metaclass=OrderedMeta):
...     def spam(self):
...         pass
...     def spam(self):
...         pass
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in A
  File "dupmethod2.py", line 25, in __setitem__
    raise TypeError('{} already defined in {}'.format(name, self.clsname))
TypeError: spam already defined in A
>>>
A final important part of this recipe concerns the treatment of the
modified dictionary in the metaclass __new__() method. Even though
the class was defined using an alternative dictionary, you still have to
convert this dictionary to a proper dict instance when making the
final class object. This is the purpose of the d = dict(clsdict) statement.
Being able to capture definition order is a subtle but important feature for certain kinds of applications. For instance, in an object relational mapper, classes might be written in a manner similar to that shown in the example:
class Stock(Model):
    name = String()
    shares = Integer()
    price = Float()
Underneath the covers, the code might want to capture the definition
order to map objects to tuples or rows in a database table (e.g.,
similar to the functionality of the as_csv() method in the example).
The solution shown is very straightforward and often simpler than
alternative approaches (which typically involve maintaining hidden
counters within the descriptor classes).
You want to define a metaclass that allows class definitions to supply optional arguments, possibly to control or configure aspects of processing during type creation.
When defining classes, Python allows a metaclass to be specified using
the metaclass keyword argument in the class statement. For
example, with abstract base classes:
from abc import ABCMeta, abstractmethod

class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxsize=None):
        pass
    @abstractmethod
    def write(self, data):
        pass
However, in custom metaclasses, additional keyword arguments can be supplied, like this:
class Spam(metaclass=MyMeta, debug=True, synchronize=True):
    ...
To support such keyword arguments in a metaclass, make sure you define them on the __prepare__(), __new__(), and __init__() methods using
keyword-only arguments, like this:
class MyMeta(type):
    # Optional
    @classmethod
    def __prepare__(cls, name, bases, *, debug=False, synchronize=False):
        # Custom processing
        ...
        return super().__prepare__(name, bases)

    # Required
    def __new__(cls, name, bases, ns, *, debug=False, synchronize=False):
        # Custom processing
        ...
        return super().__new__(cls, name, bases, ns)

    # Required
    def __init__(self, name, bases, ns, *, debug=False, synchronize=False):
        # Custom processing
        ...
        super().__init__(name, bases, ns)
Adding optional keyword arguments to a metaclass requires that you
understand all of the steps involved in class creation, because the
extra arguments are passed to every method involved. The
__prepare__() method is called first and used to create the class
namespace prior to the body of any class definition being processed.
Normally, this method simply returns a dictionary or other mapping
object. The __new__() method is used to instantiate the resulting
type object. It is called after the class body has been fully
executed. The __init__() method is called last and used to perform
any additional initialization steps.
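Here is a minimal runnable sketch of such a metaclass (the debug behavior shown is purely for illustration):

```python
class MyMeta(type):
    def __new__(cls, name, bases, ns, *, debug=False, synchronize=False):
        # Extra keyword arguments arrive here as keyword-only parameters
        if debug:
            print('Creating class:', name)
        return super().__new__(cls, name, bases, ns)

    def __init__(self, name, bases, ns, *, debug=False, synchronize=False):
        # Must accept the same extra keywords as __new__()
        super().__init__(name, bases, ns)

class Spam(metaclass=MyMeta, debug=True):
    pass
```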
When writing metaclasses, it is somewhat common to only define a
__new__() or __init__() method, but not both. However, if extra
keyword arguments are going to be accepted, then both methods must be
provided and given compatible signatures. The default __prepare__()
method accepts any set of keyword arguments, but ignores them. You
only need to define it yourself if the extra arguments
would somehow affect management of the class namespace
creation.
The use of keyword-only arguments in this recipe reflects the fact that such arguments will only be supplied by keyword during class creation.
The specification of keyword arguments to configure a metaclass might be viewed as an alternative to using class variables for a similar purpose. For example:
class Spam(metaclass=MyMeta):
    debug = True
    synchronize = True
    ...
The advantage to supplying such parameters as an argument is that they
don’t pollute the class namespace with extra names that only pertain
to class creation and not the subsequent execution of statements in the
class. In addition, they are available to the __prepare__() method,
which runs prior to processing any statements in the class body.
Class variables, on the other hand, would only be accessible in the __new__()
and __init__() methods of a metaclass.
You’ve written a function or method that uses *args and **kwargs,
so that it can be general purpose, but you would also like to check
the passed arguments to see if they match a specific function calling
signature.
For any problem where you want to manipulate function calling
signatures, you should use the signature features found in the inspect
module. Two classes, Signature and Parameter, are of particular
interest here. Here is an interactive example of creating a function
signature:
>>> from inspect import Signature, Parameter
>>> # Make a signature for a func(x, y=42, *, z=None)
>>> parms = [Parameter('x', Parameter.POSITIONAL_OR_KEYWORD),
...          Parameter('y', Parameter.POSITIONAL_OR_KEYWORD, default=42),
...          Parameter('z', Parameter.KEYWORD_ONLY, default=None)]
>>> sig = Signature(parms)
>>> print(sig)
(x, y=42, *, z=None)
>>>
Once you have a signature object, you can easily bind it to *args and **kwargs
using the signature’s bind() method, as shown in this simple example:
>>> def func(*args, **kwargs):
...     bound_values = sig.bind(*args, **kwargs)
...     for name, value in bound_values.arguments.items():
...         print(name, value)
...
>>> # Try various examples
>>> func(1, 2, z=3)
x 1
y 2
z 3
>>> func(1)
x 1
>>> func(1, z=3)
x 1
z 3
>>> func(y=2, x=1)
x 1
y 2
>>> func(1, 2, 3, 4)
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.3/inspect.py", line 1972, in _bind
    raise TypeError('too many positional arguments')
TypeError: too many positional arguments
>>> func(y=2)
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.3/inspect.py", line 1961, in _bind
    raise TypeError(msg) from None
TypeError: 'x' parameter lacking default value
>>> func(1, y=2, x=3)
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.3/inspect.py", line 1985, in _bind
    '{arg!r}'.format(arg=param.name))
TypeError: multiple values for argument 'x'
>>>
As you can see, the binding of a signature to the passed arguments enforces all of the usual function calling rules concerning required arguments, defaults, duplicates, and so forth.
Here is a more concrete example of enforcing function signatures.
In this code, a base class has defined an extremely general-purpose
version of __init__(), but subclasses are expected to supply an
expected signature.
from inspect import Signature, Parameter

def make_sig(*names):
    parms = [Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
             for name in names]
    return Signature(parms)

class Structure:
    __signature__ = make_sig()
    def __init__(self, *args, **kwargs):
        bound_values = self.__signature__.bind(*args, **kwargs)
        for name, value in bound_values.arguments.items():
            setattr(self, name, value)

# Example use
class Stock(Structure):
    __signature__ = make_sig('name', 'shares', 'price')

class Point(Structure):
    __signature__ = make_sig('x', 'y')
Here is an example of how the Stock class works:
>>> import inspect
>>> print(inspect.signature(Stock))
(name, shares, price)
>>> s1 = Stock('ACME', 100, 490.1)
>>> s2 = Stock('ACME', 100)
Traceback (most recent call last):
  ...
TypeError: 'price' parameter lacking default value
>>> s3 = Stock('ACME', 100, 490.1, shares=50)
Traceback (most recent call last):
  ...
TypeError: multiple values for argument 'shares'
>>>
The use of functions involving *args and **kwargs is very
common when writing general-purpose libraries, decorators, and
proxies. However, one
downside of such functions is that if you want to implement your own
argument checking, it can quickly become an unwieldy mess. As an
example, see Recipe 8.11. The use of a signature
object simplifies this.
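Related to bind(), the bind_partial() method performs the same matching but does not insist that every mandatory argument be supplied, which can be handy for validating a partial set of arguments. A small sketch:

```python
from inspect import Signature, Parameter

# Signature for a hypothetical func(x, y=42)
parms = [Parameter('x', Parameter.POSITIONAL_OR_KEYWORD),
         Parameter('y', Parameter.POSITIONAL_OR_KEYWORD, default=42)]
sig = Signature(parms)

# bind() would fail here because 'x' is missing, but bind_partial()
# happily accepts any subset of the arguments
bound = sig.bind_partial(y=10)
print(bound.arguments)    # only the supplied arguments appear
```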
In the last example of the solution, it might make sense to create signature objects through the use of a custom metaclass. Here is an alternative implementation that shows how to do this:
from inspect import Signature, Parameter

def make_sig(*names):
    parms = [Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
             for name in names]
    return Signature(parms)

class StructureMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        clsdict['__signature__'] = make_sig(*clsdict.get('_fields', []))
        return super().__new__(cls, clsname, bases, clsdict)

class Structure(metaclass=StructureMeta):
    _fields = []
    def __init__(self, *args, **kwargs):
        bound_values = self.__signature__.bind(*args, **kwargs)
        for name, value in bound_values.arguments.items():
            setattr(self, name, value)

# Example
class Stock(Structure):
    _fields = ['name', 'shares', 'price']

class Point(Structure):
    _fields = ['x', 'y']
When defining custom signatures, it is often useful to store the signature
in a special attribute __signature__, as shown. If you do this, code
that uses the inspect module to perform introspection will see the signature
and report it as the calling convention. For example:
>>> import inspect
>>> print(inspect.signature(Stock))
(name, shares, price)
>>> print(inspect.signature(Point))
(x, y)
>>>
Your program consists of a large class hierarchy and you would like to enforce certain kinds of coding conventions (or perform diagnostics) to help maintain programmer sanity.
If you want to monitor the definition of classes, you can often do it
by defining a metaclass. A basic metaclass is usually defined by
inheriting from type and redefining its __new__() method or
__init__() method. For example:
class MyMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        # clsname is name of class being defined
        # bases is tuple of base classes
        # clsdict is class dictionary
        return super().__new__(cls, clsname, bases, clsdict)
Alternatively, if __init__() is defined:
class MyMeta(type):
    def __init__(self, clsname, bases, clsdict):
        super().__init__(clsname, bases, clsdict)
        # clsname is name of class being defined
        # bases is tuple of base classes
        # clsdict is class dictionary
To use a metaclass, you would generally incorporate it into a top-level base class from which other objects inherit. For example:
class Root(metaclass=MyMeta):
    pass

class A(Root):
    pass

class B(Root):
    pass
A key feature of a metaclass is that it allows you to examine the
contents of a class at the time of definition. Inside the redefined
__init__() method, you are free to inspect the class dictionary,
base classes, and more. Moreover, once a metaclass has been specified
for a class, it gets inherited by all of the subclasses. Thus, a
sneaky framework builder can specify a metaclass for one of the
top-level classes in a large hierarchy and capture the definition of
all classes under it.
As a concrete albeit whimsical example, here is a metaclass that rejects any class definition containing methods with mixed-case names (perhaps as a means for annoying Java programmers):
class NoMixedCaseMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        for name in clsdict:
            if name.lower() != name:
                raise TypeError('Bad attribute name: ' + name)
        return super().__new__(cls, clsname, bases, clsdict)

class Root(metaclass=NoMixedCaseMeta):
    pass

class A(Root):
    def foo_bar(self):      # Ok
        pass

class B(Root):
    def fooBar(self):       # TypeError
        pass
As a more advanced and useful example, here is a metaclass that checks the definition of redefined methods to make sure they have the same calling signature as the original method in the superclass.
from inspect import signature
import logging

class MatchSignaturesMeta(type):
    def __init__(self, clsname, bases, clsdict):
        super().__init__(clsname, bases, clsdict)
        sup = super(self, self)
        for name, value in clsdict.items():
            if name.startswith('_') or not callable(value):
                continue
            # Get the previous definition (if any) and compare the signatures
            prev_dfn = getattr(sup, name, None)
            if prev_dfn:
                prev_sig = signature(prev_dfn)
                val_sig = signature(value)
                if prev_sig != val_sig:
                    logging.warning('Signature mismatch in %s. %s != %s',
                                    value.__qualname__, prev_sig, val_sig)

# Example
class Root(metaclass=MatchSignaturesMeta):
    pass

class A(Root):
    def foo(self, x, y):
        pass
    def spam(self, x, *, z):
        pass

# Class with redefined methods, but slightly different signatures
class B(A):
    def foo(self, a, b):
        pass
    def spam(self, x, z):
        pass
If you run this code, you will get output such as the following:
WARNING:root:Signature mismatch in B.spam. (self, x, *, z) != (self, x, z)
WARNING:root:Signature mismatch in B.foo. (self, x, y) != (self, a, b)

Such warnings might be useful in catching subtle program bugs. For example, code that relies on keyword argument passing to a method will break if a subclass changes the argument names.
In large object-oriented programs, it can sometimes be useful to put class definitions under the control of a metaclass. The metaclass can observe class definitions and be used to alert programmers to potential problems that might go unnoticed (e.g., using slightly incompatible method signatures).
One might argue that such errors would be better caught by program analysis tools or IDEs. To be sure, such tools are useful. However, if you’re creating a framework or library that’s going to be used by others, you often don’t have any control over the rigor of their development practices. Thus, for certain kinds of applications, it might make sense to put a bit of extra checking in a metaclass if such checking would result in a better user experience.
The choice of redefining __new__() or __init__() in a metaclass
depends on how you want to work with the resulting class. __new__()
is invoked prior to class creation and is typically used when a
metaclass wants to alter the class definition in some way (by changing
the contents of the class dictionary). The __init__() method is
invoked after a class has been created, and is useful if you want to
write code that works with the fully formed class object. In the last
example, this is essential since it is using the super() function to
search for prior definitions. This only works once the class instance
has been created and the underlying method resolution order has
been set.
The last example also illustrates the use of Python’s function signature
objects. Essentially, the metaclass takes each callable definition
in a class, searches for a prior definition (if any), and then
simply compares their calling signatures using inspect.signature().
Last, but not least, the line of code that uses super(self, self) is
not a typo. When working with a metaclass, it’s important to realize
that the self is actually a class object. So, that statement is
actually being used to find definitions located further up the class
hierarchy that make up the parents of self.
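A quick sketch outside of any metaclass shows what super(cls, cls) resolves to (the classes here are purely illustrative):

```python
class Root:
    def ping(self):
        return 'Root.ping'

class A(Root):
    def ping(self):
        return 'A.ping'

# Inside the metaclass, self is the class object (here, A). super(A, A)
# looks up attributes starting *after* A in A's MRO, so it finds the
# version of ping() defined in Root, not the one redefined in A.
sup = super(A, A)
print(sup.ping is Root.ping)     # True
```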
You’re writing code that ultimately needs to create a new class
object. You’ve thought about emitting class source code to a
string and using a function such as exec() to evaluate it, but you’d
prefer a more elegant solution.
You can use the function types.new_class() to instantiate
new class objects. All you need to do is provide the name of the
class, tuple of parent classes, keyword arguments, and a callback that
populates the class dictionary with members. For
example:
# stock.py
# Example of making a class manually from parts

# Methods
def __init__(self, name, shares, price):
    self.name = name
    self.shares = shares
    self.price = price

def cost(self):
    return self.shares * self.price

cls_dict = {
    '__init__': __init__,
    'cost': cost,
}

# Make a class
import types

Stock = types.new_class('Stock', (), {}, lambda ns: ns.update(cls_dict))
Stock.__module__ = __name__
This makes a normal class object that works just like you expect:
>>> s = Stock('ACME', 50, 91.1)
>>> s
<stock.Stock object at 0x1006a9b10>
>>> s.cost()
4555.0
>>>
A subtle facet of the solution is the assignment to Stock.__module__
after the call to types.new_class(). Whenever a class is defined,
its __module__ attribute contains the name of the module in which it
was defined. This name is used to produce the output made by methods
such as __repr__(). It’s also used by various libraries, such as
pickle. Thus, in order for the class you make to be “proper,” you
need to make sure this attribute is set accordingly.
If the class you want to create involves a different metaclass, it
would be specified in the third argument to types.new_class(). For
example:
>>> import abc
>>> Stock = types.new_class('Stock', (), {'metaclass': abc.ABCMeta},
...                         lambda ns: ns.update(cls_dict))
...
>>> Stock.__module__ = __name__
>>> Stock
<class '__main__.Stock'>
>>> type(Stock)
<class 'abc.ABCMeta'>
>>>
The third argument may also contain other keyword arguments. For example, a class definition like this
class Spam(Base, debug=True, typecheck=False):
    ...
would translate to a new_class() call similar to this:
Spam = types.new_class('Spam', (Base,),
                       {'debug': True, 'typecheck': False},
                       lambda ns: ns.update(cls_dict))
The fourth argument to new_class() is the most mysterious, but it is a
function that receives the mapping object being used for the class
namespace as input. This is normally a dictionary, but it’s actually
whatever object gets returned by the __prepare__() method, as
described in Recipe 9.14. This function should add new
entries to the namespace using the update() method (as shown) or
other mapping operations.
Being able to manufacture new class objects can be useful in certain contexts.
One of the more familiar examples involves the collections.namedtuple()
function. For example:
>>> Stock = collections.namedtuple('Stock', ['name', 'shares', 'price'])
>>> Stock
<class '__main__.Stock'>
>>>
namedtuple() uses exec() instead of the technique shown here.
However, here is a simple variant that creates a class directly:
import operator
import types
import sys

def named_tuple(classname, fieldnames):
    # Populate a dictionary of field property accessors
    cls_dict = {name: property(operator.itemgetter(n))
                for n, name in enumerate(fieldnames)}

    # Make a __new__ function and add to the class dict
    def __new__(cls, *args):
        if len(args) != len(fieldnames):
            raise TypeError('Expected {} arguments'.format(len(fieldnames)))
        return tuple.__new__(cls, args)

    cls_dict['__new__'] = __new__

    # Make the class
    cls = types.new_class(classname, (tuple,), {},
                          lambda ns: ns.update(cls_dict))

    # Set the module to that of the caller
    cls.__module__ = sys._getframe(1).f_globals['__name__']
    return cls
The last part of this code uses a so-called “frame hack” involving
sys._getframe() to obtain the module name of the caller. Another example of frame hacking appears in
Recipe 2.15.
The following example shows how the preceding code works:
>>> Point = named_tuple('Point', ['x', 'y'])
>>> Point
<class '__main__.Point'>
>>> p = Point(4, 5)
>>> len(p)
2
>>> p.x
4
>>> p.y
5
>>> p.x = 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>> print('%s %s' % p)
4 5
>>>
One important aspect of the technique used in this recipe is its proper support for metaclasses. You might be inclined to create a class directly by instantiating a metaclass directly. For example:
Stock = type('Stock', (), cls_dict)
The problem is that this approach skips certain critical steps, such as
invocation of the metaclass __prepare__() method. By using
types.new_class() instead, you ensure that all of the necessary
initialization steps get carried out. For instance, the callback
function that’s given as the fourth argument to types.new_class()
receives the mapping object that’s returned by the __prepare__()
method.
If you only want to carry out the preparation step, use
types.prepare_class(). For example:
import types

metaclass, kwargs, ns = types.prepare_class('Stock', (), {'metaclass': type})
This finds the appropriate metaclass and invokes its __prepare__() method.
The metaclass, remaining keyword arguments, and prepared namespace are then
returned.
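For instance, with the default metaclass and no extra keywords, the returned pieces look like this:

```python
import types

metaclass, kwargs, ns = types.prepare_class('Stock', (), {'metaclass': type})
print(metaclass)     # <class 'type'>
print(kwargs)        # {} -- no extra keyword arguments remained
print(dict(ns))      # {} -- an empty namespace, ready to be populated
```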
For more information, see PEP 3115, as well as the Python documentation.
You want to initialize parts of a class definition once at the time a class is defined, not when instances are created.
Performing initialization or setup actions at the time of class definition is a classic use of metaclasses. Essentially, a metaclass is triggered at the point of a definition, at which point you can perform additional steps.
Here is an example that uses this idea to create classes
similar to named tuples from the collections module:
import operator

class StructTupleMeta(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        for n, name in enumerate(cls._fields):
            setattr(cls, name, property(operator.itemgetter(n)))

class StructTuple(tuple, metaclass=StructTupleMeta):
    _fields = []
    def __new__(cls, *args):
        if len(args) != len(cls._fields):
            raise ValueError('{} arguments required'.format(len(cls._fields)))
        return super().__new__(cls, args)
This code allows simple tuple-based data structures to be defined, like this:
class Stock(StructTuple):
    _fields = ['name', 'shares', 'price']

class Point(StructTuple):
    _fields = ['x', 'y']
Here’s how they work:
>>> s = Stock('ACME', 50, 91.1)
>>> s
('ACME', 50, 91.1)
>>> s[0]
'ACME'
>>> s.name
'ACME'
>>> s.shares * s.price
4555.0
>>> s.shares = 23
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>>
In this recipe, the StructTupleMeta class takes the listing of
attribute names in the _fields class attribute and turns them
into property methods that access a particular tuple slot. The
operator.itemgetter() function creates an accessor function and
the property() function turns it into a property.
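The same itemgetter-plus-property combination can be tried outside a metaclass. Here is a small standalone sketch (the Pair class is illustrative, not part of the recipe):

```python
from operator import itemgetter

# itemgetter(n) returns a callable that fetches element n;
# property() turns that callable into a read-only attribute.
class Pair(tuple):
    first = property(itemgetter(0))
    second = property(itemgetter(1))

p = Pair(('a', 'b'))
print(p.first)    # a
print(p.second)   # b
```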
The trickiest part of this recipe is knowing when the different
initialization steps occur. The __init__() method in
StructTupleMeta is only called once for each class that is
defined. The cls argument is the class that has just been defined.
Essentially, the code is using the _fields class variable to take
the newly defined class and add some new parts to it.
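A tiny sketch makes the timing concrete (the Meta and Widget names are invented for illustration): the metaclass's __init__() runs once, when the class statement executes, and never again when instances are created.

```python
# A metaclass's __init__() runs at class-definition time, not at
# instance-creation time.
class Meta(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        print('defining', cls.__name__)

class Widget(metaclass=Meta):   # prints: defining Widget
    pass

w1 = Widget()   # no further output
w2 = Widget()
```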
The StructTuple class serves as a common base class for users to
inherit from. The __new__() method in that class is responsible for
making new instances. The use of __new__() here is a bit unusual,
but is partly related to the fact that we’re modifying the calling
signature of tuples so that we can create instances with code that
uses a normal-looking calling convention like this:
s = Stock('ACME', 50, 91.1)     # OK
s = Stock(('ACME', 50, 91.1))   # Error
Unlike __init__(), the __new__() method gets triggered before an
instance is created. Since tuples are immutable, it’s not possible
to make any changes to them once they have been created. An __init__()
function gets triggered too late in the instance creation process to do what
we want. That’s why __new__() has been defined.
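A stripped-down sketch of the same idea, independent of the recipe (the Point class here is illustrative): because a tuple's contents must be supplied at creation time, __new__() is the only place to repackage arguments.

```python
# __new__() is needed for immutable types: the tuple contents must be
# supplied before the object exists, so __init__() would be too late.
class Point(tuple):
    def __new__(cls, x, y):
        # Repackage the separate arguments into the single iterable
        # that tuple construction expects
        return super().__new__(cls, (x, y))

p = Point(3, 4)
print(p)             # (3, 4)
print(p[0] + p[1])   # 7
```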
Although this is a short recipe, careful study will reward the reader with a deep insight about how Python classes are defined, how instances are created, and the points at which different methods of metaclasses and classes are invoked.
PEP 422 may provide an alternative means for performing the task described in this recipe. However, as of this writing, it has not been adopted or accepted. Nevertheless, it might be worth a look in case you’re working with a version of Python newer than Python 3.3.
You’ve learned about function argument annotations and you have a thought that you might be able to use them to implement multiple-dispatch (method overloading) based on types. However, you’re not quite sure what’s involved (or if it’s even a good idea).
This recipe is based on a simple observation—namely, that since Python allows arguments to be annotated, perhaps it might be possible to write code like this:
class Spam:
    def bar(self, x:int, y:int):
        print('Bar 1:', x, y)
    def bar(self, s:str, n:int = 0):
        print('Bar 2:', s, n)

s = Spam()
s.bar(2, 3)       # Prints Bar 1: 2 3
s.bar('hello')    # Prints Bar 2: hello 0
Here is the start of a solution that does just that, using a combination of metaclasses and descriptors:
# multiple.py
import inspect
import types

class MultiMethod:
    '''
    Represents a single multimethod.
    '''
    def __init__(self, name):
        self._methods = {}
        self.__name__ = name

    def register(self, meth):
        '''
        Register a new method as a multimethod
        '''
        sig = inspect.signature(meth)

        # Build a type signature from the method's annotations
        types = []
        for name, parm in sig.parameters.items():
            if name == 'self':
                continue
            if parm.annotation is inspect.Parameter.empty:
                raise TypeError(
                    'Argument {} must be annotated with a type'.format(name)
                )
            if not isinstance(parm.annotation, type):
                raise TypeError(
                    'Argument {} annotation must be a type'.format(name)
                )
            if parm.default is not inspect.Parameter.empty:
                self._methods[tuple(types)] = meth
            types.append(parm.annotation)

        self._methods[tuple(types)] = meth

    def __call__(self, *args):
        '''
        Call a method based on type signature of the arguments
        '''
        types = tuple(type(arg) for arg in args[1:])
        meth = self._methods.get(types, None)
        if meth:
            return meth(*args)
        else:
            raise TypeError('No matching method for types {}'.format(types))

    def __get__(self, instance, cls):
        '''
        Descriptor method needed to make calls work in a class
        '''
        if instance is not None:
            return types.MethodType(self, instance)
        else:
            return self

class MultiDict(dict):
    '''
    Special dictionary to build multimethods in a metaclass
    '''
    def __setitem__(self, key, value):
        if key in self:
            # If key already exists, it must be a multimethod or callable
            current_value = self[key]
            if isinstance(current_value, MultiMethod):
                current_value.register(value)
            else:
                mvalue = MultiMethod(key)
                mvalue.register(current_value)
                mvalue.register(value)
                super().__setitem__(key, mvalue)
        else:
            super().__setitem__(key, value)

class MultipleMeta(type):
    '''
    Metaclass that allows multiple dispatch of methods
    '''
    def __new__(cls, clsname, bases, clsdict):
        return type.__new__(cls, clsname, bases, dict(clsdict))

    @classmethod
    def __prepare__(cls, clsname, bases):
        return MultiDict()
To use this class, you write code like this:
class Spam(metaclass=MultipleMeta):
    def bar(self, x:int, y:int):
        print('Bar 1:', x, y)
    def bar(self, s:str, n:int = 0):
        print('Bar 2:', s, n)

# Example: overloaded __init__
import time

class Date(metaclass=MultipleMeta):
    def __init__(self, year: int, month: int, day: int):
        self.year = year
        self.month = month
        self.day = day

    def __init__(self):
        t = time.localtime()
        self.__init__(t.tm_year, t.tm_mon, t.tm_mday)
Here is an interactive session that verifies that it works:
>>> s = Spam()
>>> s.bar(2, 3)
Bar 1: 2 3
>>> s.bar('hello')
Bar 2: hello 0
>>> s.bar('hello', 5)
Bar 2: hello 5
>>> s.bar(2, 'hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "multiple.py", line 42, in __call__
    raise TypeError('No matching method for types {}'.format(types))
TypeError: No matching method for types (<class 'int'>, <class 'str'>)

>>> # Overloaded __init__
>>> d = Date(2012, 12, 21)
>>> # Get today's date
>>> e = Date()
>>> e.year
2012
>>> e.month
12
>>> e.day
3
>>>
Honestly, there might be too much magic going on in this recipe to make it applicable to real-world code. However, it does dive into some of the inner workings of metaclasses and descriptors, and reinforces some of their concepts. Thus, even though you might not apply this recipe directly, some of its underlying ideas might influence other programming techniques involving metaclasses, descriptors, and function annotations.
The main idea in the implementation is relatively simple. The
MultipleMeta metaclass uses its __prepare__() method to
supply a custom class dictionary as an instance of MultiDict. Unlike
a normal dictionary, MultiDict checks to see whether entries
already exist when items are set. If so, the duplicate entries get
merged together inside an instance of MultiMethod.
Instances of MultiMethod collect methods by building a mapping
from type signatures to functions. During construction, function
annotations are used to collect these signatures and build the
mapping. This takes place in the MultiMethod.register() method.
One critical part of this mapping is that for multimethods, types
must be specified on all of the arguments or else an error occurs.
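The signature-building step can be tried on its own. Here is a standalone sketch of how inspect.signature() exposes the annotations (the bar() function is just an example):

```python
import inspect

# Pull each parameter's annotation out of the signature, skipping
# self, exactly as MultiMethod.register() does.
def bar(self, x: int, y: str):
    pass

sig = inspect.signature(bar)
types = [p.annotation for name, p in sig.parameters.items()
         if name != 'self']
print(types)   # [<class 'int'>, <class 'str'>]
```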
To make MultiMethod instances emulate a callable, the __call__()
method is implemented. This method builds a type tuple from all of
the arguments except self, looks up the method in the internal map,
and invokes the appropriate method. The __get__() method is required to
make MultiMethod instances operate correctly inside class
definitions. In the implementation, it’s being used to create proper
bound methods. For example:
>>> b = s.bar
>>> b
<bound method Spam.bar of <__main__.Spam object at 0x1006a46d0>>
>>> b.__self__
<__main__.Spam object at 0x1006a46d0>
>>> b.__func__
<__main__.MultiMethod object at 0x1006a4d50>
>>> b(2, 3)
Bar 1: 2 3
>>> b('hello')
Bar 2: hello 0
>>>
To be sure, there are a lot of moving parts to this recipe. However, it’s all a little unfortunate considering how many limitations there are. For one, the solution doesn’t work with keyword arguments. For example:
>>> s.bar(x=2, y=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __call__() got an unexpected keyword argument 'y'

>>> s.bar(s='hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __call__() got an unexpected keyword argument 's'
>>>
There might be some way to add such support, but it would require
a completely different approach to method mapping. The problem
is that the keyword arguments don’t arrive in any kind of particular
order. When mixed up with positional arguments, you simply get a
jumbled mess of arguments that you have to somehow sort out
in the __call__() method.
This recipe is also severely limited in its support for inheritance. For example, something like this doesn’t work:
class A:
    pass

class B(A):
    pass

class C:
    pass

class Spam(metaclass=MultipleMeta):
    def foo(self, x: A):
        print('Foo 1:', x)
    def foo(self, x: C):
        print('Foo 2:', x)
The reason it fails is that the x:A annotation fails to match instances
that are subclasses (such as instances of B). For example:
>>> s = Spam()
>>> a = A()
>>> s.foo(a)
Foo 1: <__main__.A object at 0x1006a5310>
>>> c = C()
>>> s.foo(c)
Foo 2: <__main__.C object at 0x1007a1910>
>>> b = B()
>>> s.foo(b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "multiple.py", line 44, in __call__
    raise TypeError('No matching method for types {}'.format(types))
TypeError: No matching method for types (<class '__main__.B'>,)
>>>
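The subclass limitation is not fundamental. A lookup that falls back to isinstance() checks would handle it, at the cost of a slower search. Here is a hedged sketch of that idea (the dispatch() helper and the methods registry are hypothetical, not part of the recipe's code):

```python
# Fall back to isinstance() checks against each registered signature
# instead of relying on an exact dictionary lookup.
def dispatch(methods, args):
    argtypes = tuple(type(arg) for arg in args)
    # Exact match first (fast path)
    if argtypes in methods:
        return methods[argtypes]
    # Otherwise accept any signature whose types cover the arguments
    for sig, meth in methods.items():
        if len(sig) == len(args) and all(
                isinstance(a, t) for a, t in zip(args, sig)):
            return meth
    raise TypeError('No matching method for types {}'.format(argtypes))

class A: pass
class B(A): pass

# Hypothetical registry mapping a type signature to a handler
methods = {(A,): lambda x: 'matched A'}
b = B()
meth = dispatch(methods, (b,))
print(meth(b))   # matched A
```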
As an alternative to using metaclasses and annotations, it is possible to implement a similar recipe using decorators. For example:
import types

class multimethod:
    def __init__(self, func):
        self._methods = {}
        self.__name__ = func.__name__
        self._default = func

    def match(self, *types):
        def register(func):
            ndefaults = len(func.__defaults__) if func.__defaults__ else 0
            for n in range(ndefaults + 1):
                self._methods[types[:len(types) - n]] = func
            return self
        return register

    def __call__(self, *args):
        types = tuple(type(arg) for arg in args[1:])
        meth = self._methods.get(types, None)
        if meth:
            return meth(*args)
        else:
            return self._default(*args)

    def __get__(self, instance, cls):
        if instance is not None:
            return types.MethodType(self, instance)
        else:
            return self
To use the decorator version, you would write code like this:
class Spam:
    @multimethod
    def bar(self, *args):
        # Default method called if no match
        raise TypeError('No matching method for bar')

    @bar.match(int, int)
    def bar(self, x, y):
        print('Bar 1:', x, y)

    @bar.match(str, int)
    def bar(self, s, n=0):
        print('Bar 2:', s, n)
The decorator solution also suffers the same limitations as the previous implementation (namely, no support for keyword arguments and broken inheritance).
All things being equal, it’s probably best to stay away from multiple dispatch in general-purpose code. There are special situations where it might make sense, such as in programs that are dispatching methods based on some kind of pattern matching. For example, perhaps the visitor pattern described in Recipe 8.21 could be recast into a class that used multiple dispatch in some way. Beyond that, however, it’s usually best to stick with a simpler approach (simply use methods with different names).
Ideas concerning different ways to implement multiple dispatch have floated around the Python community for years. As a decent starting point for that discussion, see Guido van Rossum’s blog post “Five-Minute Multimethods in Python”.
You are writing classes where you are repeatedly having to define property methods that perform common tasks, such as type checking. You would like to simplify the code so there is not so much code repetition.
Consider a simple class where attributes are being wrapped by property methods:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError('name must be a string')
        self._name = value

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        if not isinstance(value, int):
            raise TypeError('age must be an int')
        self._age = value
As you can see, a lot of code is being written simply to enforce some type assertions on attribute values. Whenever you see code like this, you should explore different ways of simplifying it. One possible approach is to make a function that simply defines the property for you and returns it. For example:
def typed_property(name, expected_type):
    storage_name = '_' + name

    @property
    def prop(self):
        return getattr(self, storage_name)

    @prop.setter
    def prop(self, value):
        if not isinstance(value, expected_type):
            raise TypeError('{} must be a {}'.format(name, expected_type))
        setattr(self, storage_name, value)

    return prop

# Example use
class Person:
    name = typed_property('name', str)
    age = typed_property('age', int)
    def __init__(self, name, age):
        self.name = name
        self.age = age
This recipe illustrates an important feature of inner functions and
closures—namely, their use in writing code that works a lot like a
macro. The typed_property() function in this example may look a
little weird, but it’s really just generating the property code for
you and returning the resulting property object. Thus, when it’s used
in a class, it operates exactly as if the code appearing inside
typed_property() was placed into the class definition itself. Even
though the property getter and setter methods are accessing local
variables such as name, expected_type, and storage_name, that is
fine—those values are held behind the scenes in a closure.
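A small sketch shows the closure mechanism directly (make_adder() is a standalone illustration, not part of the recipe): the returned function keeps using variables from the enclosing call even after the outer function has returned.

```python
# The inner function captures n in a closure cell that outlives
# the call to make_adder().
def make_adder(n):
    def add(x):
        return x + n
    return add

add5 = make_adder(5)
print(add5(10))                            # 15
print(add5.__closure__[0].cell_contents)   # 5
```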
This recipe can be tweaked in an interesting manner using the
functools.partial() function. For example, you can do this:
from functools import partial

String = partial(typed_property, expected_type=str)
Integer = partial(typed_property, expected_type=int)

# Example:
class Person:
    name = String('name')
    age = Integer('age')
    def __init__(self, name, age):
        self.name = name
        self.age = age
Here the code is starting to look a lot like some of the type system descriptor code shown in Recipe 8.13.
One of the most straightforward ways to write a new context manager
is to use the @contextmanager decorator in the contextlib module.
Here is an example of a context manager that times the execution of
a code block:
import time
from contextlib import contextmanager

@contextmanager
def timethis(label):
    start = time.time()
    try:
        yield
    finally:
        end = time.time()
        print('{}: {}'.format(label, end - start))

# Example use
with timethis('counting'):
    n = 10000000
    while n > 0:
        n -= 1
In the timethis() function, all of the code prior to the yield executes
as the __enter__() method of a context manager. All of the code after the yield
executes as the __exit__() method. If there was an exception, it is raised at the
yield statement.
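Because exceptions surface at the yield, a context manager can intercept them with an ordinary try/except around it. Here is a small sketch (the ignoring() function is illustrative; the standard library's contextlib.suppress does the same job):

```python
from contextlib import contextmanager

# An exception raised in the with-block reappears at the yield,
# where it can be caught and suppressed.
@contextmanager
def ignoring(exc_type):
    try:
        yield
    except exc_type:
        pass   # swallow the matching exception

with ignoring(KeyError):
    {}['missing']           # raises KeyError, suppressed here
result = 'still running'
print(result)
```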
Here is a slightly more advanced context manager that implements a kind of transaction on a list object:
@contextmanager
def list_transaction(orig_list):
    working = list(orig_list)
    yield working
    orig_list[:] = working
The idea here is that changes made to a list only take effect if an entire code block runs to completion with no exceptions. Here is an example that illustrates:
>>> items = [1, 2, 3]
>>> with list_transaction(items) as working:
...     working.append(4)
...     working.append(5)
...
>>> items
[1, 2, 3, 4, 5]
>>> with list_transaction(items) as working:
...     working.append(6)
...     working.append(7)
...     raise RuntimeError('oops')
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
RuntimeError: oops
>>> items
[1, 2, 3, 4, 5]
>>>
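The same transaction pattern works for other mutable containers. Here is a hedged sketch for dictionaries (dict_transaction() is an invented variant, not part of the recipe):

```python
from contextlib import contextmanager

# Changes to the working copy become visible in the original dict
# only if the block completes without raising an exception.
@contextmanager
def dict_transaction(orig):
    working = dict(orig)
    yield working
    orig.clear()
    orig.update(working)

settings = {'debug': False}
with dict_transaction(settings) as w:
    w['debug'] = True
print(settings)   # {'debug': True}
```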
Normally, to write a context manager, you define a class with an __enter__() and
__exit__() method, like this:
import time

class timethis:
    def __init__(self, label):
        self.label = label
    def __enter__(self):
        self.start = time.time()
    def __exit__(self, exc_ty, exc_val, exc_tb):
        end = time.time()
        print('{}: {}'.format(self.label, end - self.start))
Although this isn’t hard, it’s a lot more tedious than writing a simple
function using @contextmanager.
@contextmanager is really only used for writing self-contained context-management functions. If you have some object (e.g., a file, network connection, or
lock) that needs to support the with statement, you still need to implement the
__enter__() and __exit__() methods separately.
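For such objects, the shape of the protocol is straightforward. Here is a minimal sketch of a reusable, lock-like resource (the Resource class is invented for illustration):

```python
# A reusable object supporting the with statement directly via
# __enter__() and __exit__().
class Resource:
    def __init__(self):
        self.acquired = False
    def __enter__(self):
        self.acquired = True
        return self
    def __exit__(self, exc_ty, exc_val, exc_tb):
        self.acquired = False

r = Resource()
with r as res:
    inside = res.acquired
print(inside, r.acquired)   # True False
```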
You are using exec() to execute a fragment of code in the scope of
the caller, but after execution, none of its results seem to be
visible.
To better understand the problem, try a little experiment. First, execute a fragment of code in the global namespace:
>>> a = 13
>>> exec('b = a + 1')
>>> print(b)
14
>>>
Now, try the same experiment inside a function:
>>> def test():
...     a = 13
...     exec('b = a + 1')
...     print(b)
...
>>> test()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
NameError: global name 'b' is not defined
>>>
As you can see, it fails with a NameError almost as if the exec()
statement never actually executed. This can be a problem if you ever
want to use the result of the exec() in a later calculation.
To fix this kind of problem, you need to use the locals() function
to obtain a dictionary of the local variables prior to the call to
exec(). Immediately afterward, you can extract modified values from
the locals dictionary. For example:
>>> def test():
...     a = 13
...     loc = locals()
...     exec('b = a + 1')
...     b = loc['b']
...     print(b)
...
>>> test()
14
>>>
Correct use of exec() is actually quite tricky in practice. In fact, in
most situations where you might be considering the use of exec(), a more elegant
solution probably exists (e.g., decorators, closures, metaclasses, etc.).
However, if you still must use exec(), this recipe outlines some subtle
aspects of using it correctly. By
default, exec() executes code in the local and global scope of the
caller. However, inside functions, the local scope passed to exec()
is a dictionary that is a copy of the actual local variables.
Thus, if the code in exec() makes any kind of modification, that
modification is never reflected in the actual local variables. Here
is another example that shows this effect:
>>> def test1():
...     x = 0
...     exec('x += 1')
...     print(x)
...
>>> test1()
0
>>>
When you call locals() to obtain the local variables, as shown in the
solution, you get the copy of the locals that is passed to exec().
By inspecting the value of the dictionary after execution, you
can obtain the modified values. Here is an experiment that shows this:
>>> def test2():
...     x = 0
...     loc = locals()
...     print('before:', loc)
...     exec('x += 1')
...     print('after:', loc)
...     print('x =', x)
...
>>> test2()
before: {'x': 0}
after: {'loc': {...}, 'x': 1}
x = 0
>>>
Carefully observe the output of the last step. Unless you copy
the modified value from loc back to x, the variable remains
unchanged.
With any use of locals(), you need to be careful about the order
of operations. Each time it is invoked, locals() will take the
current value of local variables and overwrite the corresponding
entries in the dictionary. Observe the outcome of this experiment:
>>> def test3():
...     x = 0
...     loc = locals()
...     print(loc)
...     exec('x += 1')
...     print(loc)
...     locals()
...     print(loc)
...
>>> test3()
{'x': 0}
{'loc': {...}, 'x': 1}
{'loc': {...}, 'x': 0}
>>>
Notice how the last call to locals() caused x to be overwritten.
As an alternative to using locals(), you might make your own dictionary
and pass it to exec(). For example:
>>> def test4():
...     a = 13
...     loc = {'a': a}
...     glb = {}
...     exec('b = a + 1', glb, loc)
...     b = loc['b']
...     print(b)
...
>>> test4()
14
>>>
For most uses of exec(), this is probably good practice. You
just need to make sure that the global and local dictionaries are
properly initialized with names that the executed code will access.
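A quick sketch of that practice (the variable names here are illustrative): everything the executed code needs goes into the dictionaries you supply, and results come back out of them.

```python
# exec() with explicitly initialized namespaces. Note: restricting
# __builtins__ like this is NOT a real security sandbox; it just
# keeps the executed code's environment minimal and explicit.
glb = {'__builtins__': {}}
loc = {'x': 10}
exec('y = x * 2', glb, loc)
print(loc['y'])   # 20
```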
Last, but not least, before using exec(), you might ask yourself if
other alternatives are available. Many problems where you might
consider the use of exec() can be replaced by closures, decorators,
metaclasses, or other metaprogramming features.
Most programmers know that Python can evaluate or execute source code provided in the form of a string. For example:
>>> x = 42
>>> eval('2 + 3*4 + x')
56
>>> exec('for i in range(10): print(i)')
0
1
2
3
4
5
6
7
8
9
>>>
However, the ast module can be used to compile Python source code
into an abstract syntax tree (AST) that can be analyzed. For example:
>>> import ast
>>> ex = ast.parse('2 + 3*4 + x', mode='eval')
>>> ex
<_ast.Expression object at 0x1007473d0>
>>> ast.dump(ex)
"Expression(body=BinOp(left=BinOp(left=Num(n=2), op=Add(),
right=BinOp(left=Num(n=3), op=Mult(), right=Num(n=4))), op=Add(),
right=Name(id='x', ctx=Load())))"

>>> top = ast.parse('for i in range(10): print(i)', mode='exec')
>>> top
<_ast.Module object at 0x100747390>
>>> ast.dump(top)
"Module(body=[For(target=Name(id='i', ctx=Store()),
iter=Call(func=Name(id='range', ctx=Load()), args=[Num(n=10)],
keywords=[], starargs=None, kwargs=None),
body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Name(id='i', ctx=Load())], keywords=[], starargs=None,
kwargs=None))], orelse=[])])"
>>>
Analyzing the source tree requires a bit of study on your part, but it
consists of a collection of AST nodes. The easiest way to work with
these nodes is to define a visitor class that implements various
visit_NodeName() methods where NodeName() matches the node of
interest. Here is an example of such a class that records information
about which names are loaded, stored, and deleted.
import ast

class CodeAnalyzer(ast.NodeVisitor):
    def __init__(self):
        self.loaded = set()
        self.stored = set()
        self.deleted = set()

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self.loaded.add(node.id)
        elif isinstance(node.ctx, ast.Store):
            self.stored.add(node.id)
        elif isinstance(node.ctx, ast.Del):
            self.deleted.add(node.id)

# Sample usage
if __name__ == '__main__':
    # Some Python code
    code = '''
for i in range(10):
    print(i)
del i
'''
    # Parse into an AST
    top = ast.parse(code, mode='exec')

    # Feed the AST to analyze name usage
    c = CodeAnalyzer()
    c.visit(top)
    print('Loaded:', c.loaded)
    print('Stored:', c.stored)
    print('Deleted:', c.deleted)
If you run this program, you’ll get output like this:
Loaded: {'i', 'range', 'print'}
Stored: {'i'}
Deleted: {'i'}
Finally, ASTs can be compiled and executed using the compile()
function. For example:
>>> exec(compile(top, '<stdin>', 'exec'))
0
1
2
3
4
5
6
7
8
9
>>>
The fact that you can analyze source code and get information from it
could be the start of writing various code analysis, optimization, or
verification tools. For instance, instead of just blindly passing
some fragment of code into a function like exec(), you could turn it
into an AST first and look at it in some detail to see what it’s doing.
You could also write tools that look at the entire source code
for a module and perform some sort of static analysis over it.
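For quick whole-tree analysis, you don't even need a visitor class: ast.walk() flattens the tree into an iterable of nodes. A small sketch of counting function calls in a source fragment (the src string is an invented example):

```python
import ast

# ast.walk() yields every node in the tree, so simple analyses can
# be written as a comprehension instead of a NodeVisitor subclass.
src = '''
x = len([1, 2, 3])
print(x, max(4, 5))
'''
tree = ast.parse(src)
calls = [n.func.id for n in ast.walk(tree)
         if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)]
print(sorted(calls))   # ['len', 'max', 'print']
```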
It should be noted that it is also possible to rewrite the AST to represent new code if you really know what you’re doing. Here is an example of a decorator that lowers globally accessed names into the body of a function by reparsing the function body’s source code, rewriting the AST, and recreating the function’s code object:
# namelower.py
import ast
import inspect

# Node visitor that lowers globally accessed names into
# the function body as local variables.
class NameLower(ast.NodeVisitor):
    def __init__(self, lowered_names):
        self.lowered_names = lowered_names

    def visit_FunctionDef(self, node):
        # Compile some assignments to lower the constants
        code = '__globals = globals()\n'
        code += '\n'.join("{0} = __globals['{0}']".format(name)
                          for name in self.lowered_names)
        code_ast = ast.parse(code, mode='exec')

        # Inject new statements into the function body
        node.body[:0] = code_ast.body

        # Save the function object
        self.func = node

# Decorator that turns global names into locals
def lower_names(*namelist):
    def lower(func):
        srclines = inspect.getsource(func).splitlines()
        # Skip source lines prior to the @lower_names decorator
        for n, line in enumerate(srclines):
            if '@lower_names' in line:
                break

        src = '\n'.join(srclines[n+1:])
        # Hack to deal with indented code
        if src.startswith((' ', '\t')):
            src = 'if 1:\n' + src
        top = ast.parse(src, mode='exec')

        # Transform the AST
        cl = NameLower(namelist)
        cl.visit(top)

        # Execute the modified AST
        temp = {}
        exec(compile(top, '', 'exec'), temp, temp)

        # Pull out the modified code object
        func.__code__ = temp[func.__name__].__code__
        return func
    return lower
To use this code, you would write code such as the following:
INCR = 1

@lower_names('INCR')
def countdown(n):
    while n > 0:
        n -= INCR
The decorator rewrites the source code of the countdown() function to look like this:
def countdown(n):
    __globals = globals()
    INCR = __globals['INCR']
    while n > 0:
        n -= INCR
In a performance test, it makes the function run about 20% faster.
Now, should you go applying this decorator to all of your functions? Probably not. However, it’s a good illustration of some very advanced things that might be possible through AST manipulation, source code manipulation, and other techniques.
This recipe was inspired by a similar recipe at ActiveState that worked by manipulating Python’s byte code. Working with the AST is a higher-level approach that might be a bit more straightforward. See the next recipe for more information about byte code.
You want to know in detail what your code is doing under the covers by disassembling it into lower-level byte code used by the interpreter.
The dis module can be used to output a disassembly of any Python
function. For example:
>>> def countdown(n):
...     while n > 0:
...         print('T-minus', n)
...         n -= 1
...     print('Blastoff!')
...
>>> import dis
>>> dis.dis(countdown)
  2           0 SETUP_LOOP              39 (to 42)
        >>    3 LOAD_FAST                0 (n)
              6 LOAD_CONST               1 (0)
              9 COMPARE_OP               4 (>)
             12 POP_JUMP_IF_FALSE       41

  3          15 LOAD_GLOBAL              0 (print)
             18 LOAD_CONST               2 ('T-minus')
             21 LOAD_FAST                0 (n)
             24 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             27 POP_TOP

  4          28 LOAD_FAST                0 (n)
             31 LOAD_CONST               3 (1)
             34 INPLACE_SUBTRACT
             35 STORE_FAST               0 (n)
             38 JUMP_ABSOLUTE            3

  5     >>   41 POP_BLOCK
        >>   42 LOAD_GLOBAL              0 (print)
             45 LOAD_CONST               4 ('Blastoff!')
             48 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             51 POP_TOP
             52 LOAD_CONST               0 (None)
             55 RETURN_VALUE
>>>
The dis module can be useful if you ever need to study what’s happening in
your program at a very low level (e.g., if you’re trying to understand
performance characteristics).
The raw byte code interpreted by the dis() function is available on functions
as follows:
>>> countdown.__code__.co_code
b"x'\x00|\x00\x00d\x01\x00k\x04\x00r)\x00t\x00\x00d\x02\x00|\x00\x00\x83\x02\x00\x01|\x00\x00d\x03\x008}\x00\x00q\x03\x00Wt\x00\x00d\x04\x00\x83\x01\x00\x01d\x00\x00S"
>>>
If you ever want to interpret this code yourself, you would need to
use some of the constants defined in the opcode module. For example:
>>> c = countdown.__code__.co_code
>>> import opcode
>>> opcode.opname[c[0]]
'SETUP_LOOP'
>>> opcode.opname[c[3]]
'LOAD_FAST'
>>>
Ironically, there is no function in the dis module that makes it
easy for you to process the byte code in a programmatic way.
However, this generator function will take the raw byte code
sequence and turn it into opcodes and arguments.
import opcode

def generate_opcodes(codebytes):
    extended_arg = 0
    i = 0
    n = len(codebytes)
    while i < n:
        op = codebytes[i]
        i += 1
        if op >= opcode.HAVE_ARGUMENT:
            oparg = codebytes[i] + codebytes[i+1]*256 + extended_arg
            extended_arg = 0
            i += 2
            if op == opcode.EXTENDED_ARG:
                extended_arg = oparg * 65536
                continue
        else:
            oparg = None
        yield (op, oparg)
To use this function, you would use code like this:
>>> for op, oparg in generate_opcodes(countdown.__code__.co_code):
...     print(op, opcode.opname[op], oparg)
...
120 SETUP_LOOP 39
124 LOAD_FAST 0
100 LOAD_CONST 1
107 COMPARE_OP 4
114 POP_JUMP_IF_FALSE 41
116 LOAD_GLOBAL 0
100 LOAD_CONST 2
124 LOAD_FAST 0
131 CALL_FUNCTION 2
1 POP_TOP None
124 LOAD_FAST 0
100 LOAD_CONST 3
56 INPLACE_SUBTRACT None
125 STORE_FAST 0
113 JUMP_ABSOLUTE 3
87 POP_BLOCK None
116 LOAD_GLOBAL 0
100 LOAD_CONST 4
131 CALL_FUNCTION 1
1 POP_TOP None
100 LOAD_CONST 0
83 RETURN_VALUE None
>>>
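It's worth noting that versions of Python newer than the one this recipe targets (Python 3.4 and later) do provide such a facility: dis.get_instructions() yields Instruction objects directly. A quick sketch (the function f() is an invented example):

```python
import dis

# dis.get_instructions() performs the programmatic traversal that
# generate_opcodes() implements by hand, and also decodes arguments.
def f(x):
    return x + 1

opnames = [instr.opname for instr in dis.get_instructions(f)]
print(opnames)
```

Note that the exact instruction names vary between interpreter versions (e.g., BINARY_ADD became BINARY_OP in Python 3.11), so code that inspects them should not assume a fixed byte code format.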
It’s a little-known fact, but you can replace the raw byte code of any function that you want. It takes a bit of work to do it, but here’s an example of what’s involved:
>>> def add(x, y):
...     return x + y
...
>>> c = add.__code__
>>> c
<code object add at 0x1007beed0, file "<stdin>", line 1>
>>> c.co_code
b'|\x00\x00|\x01\x00\x17S'
>>>
>>> # Make a completely new code object with bogus byte code
>>> import types
>>> newbytecode = b'xxxxxxx'
>>> nc = types.CodeType(c.co_argcount, c.co_kwonlyargcount,
...     c.co_nlocals, c.co_stacksize, c.co_flags, newbytecode, c.co_consts,
...     c.co_names, c.co_varnames, c.co_filename, c.co_name,
...     c.co_firstlineno, c.co_lnotab)
>>> nc
<code object add at 0x10069fe40, file "<stdin>", line 1>
>>> add.__code__ = nc
>>> add(2, 3)
Segmentation fault
Having the interpreter crash is a pretty likely outcome of pulling a crazy stunt like this. However, developers working on advanced optimization and metaprogramming tools might be inclined to rewrite byte code for real. This last part illustrates how to do it. See this code on ActiveState for another example of such code in action.