7. Classes and Object-Oriented Programming

Classes are used to create new kinds of objects. This chapter covers the details of classes—but is not intended to be an in-depth reference on object-oriented programming and design. Some programming patterns common in Python are discussed, as well as ways that you can customize classes to behave in interesting ways. The overall structure of this chapter is top-down. High-level concepts and techniques for using classes are described first. In later parts of the chapter, the material gets more technical and focused on internal implementation.

7.1 Objects

Almost all code in Python involves creating and performing actions on objects. For example, you might make a string object and manipulate it as follows:

>>> s = "Hello World"
>>> s.upper()
'HELLO WORLD'
>>> s.replace('Hello', 'Hello Cruel')
'Hello Cruel World'
>>> s.split()
['Hello', 'World']
>>>

or a list object:

>>> names = ['Paula', 'Thomas']
>>> names.append('Lewis')
>>> names
['Paula', 'Thomas', 'Lewis']
>>> names[1] = 'Tom'
>>>

An essential feature of each object is that it usually has some kind of state—the characters of a string, the elements of a list, and so on, as well as methods that operate on that state. The methods are invoked through the object itself—as if they are functions attached to the object via the dot (.) operator.

Objects always have an associated type. You can view it using type():

>>> type(names)
<class 'list'>
>>>

An object is said to be an instance of its type. For example, names is an instance of list.
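The relationship between an object and its type can also be tested directly. The built-in isinstance() function checks whether an object is an instance of a given type, or of any type in a tuple of types:

```python
names = ['Paula', 'Thomas']

# isinstance() tests whether an object is an instance of a type
print(isinstance(names, list))             # True
print(isinstance(names, str))              # False

# A tuple of types tests membership in any of them
print(isinstance(names, (list, tuple)))    # True
```

As a later section shows, isinstance() also respects inheritance: an instance of a subclass counts as an instance of its parent.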

7.2 The class Statement

New kinds of objects are defined using the class statement. A class typically consists of a collection of functions that make up the methods. Here’s an example:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        return f'Account({self.owner!r}, {self.balance!r})'

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def inquiry(self):
        return self.balance

It’s important to note that a class statement by itself doesn’t create any instances of the class. For example, no accounts are actually created in the preceding example. Rather, a class merely holds the methods that will be available on the instances created later. You might think of it as a blueprint.

The functions defined inside a class are known as methods. An instance method is a function that operates on an instance of the class, which is passed as the first argument. By convention, this argument is called self. In the preceding example, deposit(), withdraw(), and inquiry() are examples of instance methods.

The __init__() and __repr__() methods of the class are examples of so-called special or magic methods. These methods have special meaning to the interpreter runtime. The __init__() method is used to initialize state when a new instance is created. The __repr__() method returns a string for viewing an object. Defining this method is optional, but doing so simplifies debugging and makes it easier to view objects from the interactive prompt.
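To see the difference __repr__() makes, here is a small sketch using a hypothetical Point class (not part of the running Account example). Without __repr__(), you get the uninformative default inherited from object:

```python
# Hypothetical class without a __repr__() method
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(2, 3)
print(repr(p))     # Default output, something like <__main__.Point object at 0x10af...>

# The same class with __repr__() defined
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'Point({self.x!r}, {self.y!r})'

p = Point(2, 3)
print(repr(p))     # Point(2, 3)
```

By convention, __repr__() returns a string that, where feasible, looks like the expression used to recreate the object.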

A class definition may optionally include a documentation string and type hints. For example:

class Account:
    '''
    A simple bank account
    '''
    owner: str
    balance: float

    def __init__(self, owner:str, balance:float):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        return f'Account({self.owner!r}, {self.balance!r})'

    def deposit(self, amount:float):
        self.balance += amount

    def withdraw(self, amount:float):
        self.balance -= amount

    def inquiry(self) -> float:
        return self.balance

Type hints do not change any aspect of how a class works—that is, they do not introduce any extra checking or validation. They are purely metadata that might be useful for third-party tools or IDEs, or used by certain advanced programming techniques. They are not used in most examples that follow.
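Although not enforced, the hints are recorded. Class-level annotations are collected into the __annotations__ dictionary, which is what tools typically inspect. A sketch:

```python
class Account:
    owner: str
    balance: float

    def __init__(self, owner: str, balance: float):
        self.owner = owner
        self.balance = balance

# Class-level hints are stored as metadata
print(Account.__annotations__)   # {'owner': <class 'str'>, 'balance': <class 'float'>}

# No runtime validation occurs: "wrong" types are silently accepted
a = Account(123, 'not a float')
print(a.owner)                   # 123
```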

7.3 Instances

Instances of a class are created by calling a class object as a function. This creates a new instance that is then passed to the __init__() method. The arguments to __init__() consist of the newly created instance self along with the arguments supplied in calling the class object. For example:

# Create a few accounts

a = Account('Guido', 1000.0)
# Calls Account.__init__(a, 'Guido', 1000.0)

b = Account('Eva', 10.0)
# Calls Account.__init__(b, 'Eva', 10.0)

Inside __init__(), attributes are saved on the instance by assigning to self. For example, self.owner = owner is saving an attribute on the instance. Once the newly created instance has been returned, these attributes, as well as methods of the class, are accessed using the dot (.) operator:

a.deposit(100.0)        # Calls Account.deposit(a, 100.0)
b.withdraw(50.00)       # Calls Account.withdraw(b, 50.0)
owner = a.owner         # Get account owner

It’s important to emphasize that each instance has its own state. You can view instance variables using the vars() function. For example:

>>> a = Account('Guido', 1000.0)
>>> b = Account('Eva', 10.0)
>>> vars(a)
{'owner': 'Guido', 'balance': 1000.0}
>>> vars(b)
{'owner': 'Eva', 'balance': 10.0}
>>>

Notice that the methods do not appear here. The methods are found on the class instead. Every instance keeps a link to its class via its associated type. For example:

>>> type(a)
<class 'Account'>
>>> type(b)
<class 'Account'>
>>> type(a).deposit
<function Account.deposit at 0x10a032158>
>>> type(a).inquiry
<function Account.inquiry at 0x10a032268>
>>>

A later section discusses the implementation details of attribute binding and the relationship between instances and classes.

7.4 Attribute Access

There are only three basic operations that can be performed on an instance: getting, setting, and deleting an attribute. For example:

>>> a = Account('Guido', 1000.0)
>>> a.owner               # get
'Guido'
>>> a.balance = 750.0    # set
>>> del a.balance        # delete
>>> a.balance
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Account' object has no attribute 'balance'
>>>

Attribute access in Python is a dynamic process with very few restrictions. If you want to add a new attribute to an object after it’s been created, you’re free to do that. For example:

>>> a = Account('Guido', 1000.0)
>>> a.creation_date = '2019-02-14'
>>> a.nickname = 'Former BDFL'
>>> a.creation_date
'2019-02-14'
>>>

Instead of using dot (.) to perform these operations, you can supply an attribute name as a string to the getattr(), setattr(), and delattr() functions. The hasattr() function tests for the existence of an attribute. For example:

>>> a = Account('Guido', 1000.0)
>>> getattr(a, 'owner')
'Guido'
>>> setattr(a, 'balance', 750.0)
>>> delattr(a, 'balance')
>>> hasattr(a, 'balance')
False
>>> setattr(a, 'balance', 750.0)   # Restore the deleted attribute
>>> getattr(a, 'withdraw')(100)    # Method call
>>> a
Account('Guido', 650.0)
>>>

a.attr and getattr(a, 'attr') are interchangeable, so getattr(a, 'withdraw')(100) is the same as a.withdraw(100). It doesn’t matter that withdraw() is a method.

The getattr() function is notable for taking an optional default value. If you wanted to look up an attribute that may or may not exist, you could do this:

>>> a = Account('Guido', 1000.0)
>>> getattr(a, 'balance', 'unknown')
1000.0
>>> getattr(a, 'creation_date', 'unknown')
'unknown'
>>>

When you access a method as an attribute, you get an object known as a bound method. For example:

>>> a = Account('Guido', 1000.0)
>>> w = a.withdraw
>>> w
<bound method Account.withdraw of Account('Guido', 1000.0)>
>>> w(100)
>>> a
Account('Guido', 900.0)
>>>

A bound method is an object that contains both an instance (the self) and the function that implements the method. When you call a bound method by adding parentheses and arguments, it executes the method, passing the attached instance as the first argument. For example, calling w(100) above turns into a call to Account.withdraw(a, 100).
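A bound method exposes these two parts as attributes: __self__ holds the instance and __func__ holds the underlying function. Here is a sketch using a pared-down Account:

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount

a = Account('Guido', 1000.0)
w = a.withdraw

# The two pieces of the bound method
print(w.__self__ is a)                   # True
print(w.__func__ is Account.withdraw)    # True

# Calling the pieces directly is equivalent to calling w(100)
w.__func__(w.__self__, 100)
print(a.balance)                         # 900.0
```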

7.5 Scoping Rules

Although classes define an isolated namespace for the methods, that namespace does not serve as a scope for resolving names used inside methods. Therefore, when you’re implementing a class, references to attributes and methods must be fully qualified. For example, in methods you always reference attributes of the instance through self. Thus, you use self.balance, not balance. This also applies if you want to call a method from another method. For example, suppose you want to implement withdraw() in terms of depositing a negative amount:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        return f'Account({self.owner!r}, {self.balance!r})'

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.deposit(-amount)    # Must use self.deposit()

    def inquiry(self):
        return self.balance

The lack of a class-level scope is one area where Python differs from C++ or Java. If you have used those languages, the self parameter in Python is the same as the so-called “this” pointer, except that in Python you always have to use it explicitly.
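As a sketch of what goes wrong, here is a method that forgets to qualify an attribute with self. The bare name is resolved as a global variable instead, producing a NameError at call time:

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def inquiry(self):
        return balance       # Bug: should be self.balance

a = Account('Guido', 1000.0)
try:
    a.inquiry()
except NameError as e:
    print(e)                 # name 'balance' is not defined
```

Note that the error occurs only when inquiry() is actually called, not when the class is defined.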

7.6 Operator Overloading and Protocols

In Chapter 4, we discussed Python’s data model. Particular attention was given to the so-called special methods that implement Python operators and protocols. For example, the len(obj) function calls obj.__len__() and obj[n] calls obj.__getitem__(n).

When defining new classes, it is common to define some of these methods. The __repr__() method in the Account class was one such method for improving debugging output. You might define more of these methods if you’re creating something more complicated, such as a custom container. For example, suppose you wanted to make a portfolio of accounts:

class AccountPortfolio:
    def __init__(self):
        self.accounts = []

    def add_account(self, account):
        self.accounts.append(account)

    def total_funds(self):
        return sum(account.inquiry() for account in self.accounts)

    def __len__(self):
        return len(self.accounts)

    def __getitem__(self, index):
        return self.accounts[index]

    def __iter__(self):
        return iter(self.accounts)

# Example
port = AccountPortfolio()
port.add_account(Account('Guido', 1000.0))
port.add_account(Account('Eva', 50.0))

print(port.total_funds())    # -> 1050.0
len(port)                    # -> 2

# Print the accounts
for account in port:
    print(account)

# Access an individual account by index
port[1].inquiry()            # -> 50.0

The special methods that appear at the end, such as __len__(), __getitem__(), and __iter__(), make an AccountPortfolio work with Python operators such as indexing and iteration.

Sometimes you’ll hear the word “Pythonic,” as in “this code is Pythonic.” The term is informal, but it usually refers to whether or not an object plays nicely with the rest of the Python environment. This implies supporting—to the extent that it makes sense—core Python features such as iteration, indexing, and other operations. You almost always do this by having your class implement predefined special methods as described in Chapter 4.

7.7 Inheritance

Inheritance is a mechanism for creating a new class that specializes or modifies the behavior of an existing class. The original class is called a base class, superclass, or parent class. The new class is called a derived class, child class, subclass, or subtype. When a class is created via inheritance, it inherits the attributes defined by its base classes. However, a derived class may redefine any of these attributes and add new attributes of its own.

Inheritance is specified with a comma-separated list of base-class names in the class statement. If there is no specified base class, a class implicitly inherits from object. object is a class that is the root of all Python objects; it provides the default implementation of some common methods such as __str__() and __repr__().

One use of inheritance is to extend an existing class with new methods. For example, suppose you want to add a panic() method to Account that would withdraw all funds. Here’s how:

class MyAccount(Account):
    def panic(self):
        self.withdraw(self.balance)

# Example
a = MyAccount('Guido', 1000.0)
a.withdraw(23.0)         # a.balance = 977.0
a.panic()                # a.balance = 0

Inheritance can also be used to redefine the already existing methods. For example, here’s a specialized version of Account that redefines the inquiry() method to periodically overstate the balance with the hope that someone not paying close attention will overdraw their account and incur a big penalty when making a payment on their subprime mortgage:

import random
class EvilAccount(Account):
    def inquiry(self):
        if random.randint(0, 4) == 1:
            return self.balance * 1.10
        else:
            return self.balance

a = EvilAccount('Guido', 1000.0)
a.deposit(10.0)          # Calls Account.deposit(a, 10.0)
available = a.inquiry()  # Calls EvilAccount.inquiry(a)

In this example, instances of EvilAccount are identical to instances of Account except for the redefined inquiry() method.

Occasionally, a derived class reimplements a method but also needs to call the original implementation. A method can explicitly call the original method using super():

class EvilAccount(Account):
    def inquiry(self):
        if random.randint(0, 4) == 1:
            return 1.10 * super().inquiry()
        else:
            return super().inquiry()

In this example, super() allows you to access a method as it was previously defined. The super().inquiry() call is using the original definition of inquiry() before it got redefined by EvilAccount.

It’s less common, but inheritance might also be used to add additional attributes to instances. Here’s how you could make the 1.10 factor in the above example an instance-level attribute that could be adjusted:

class EvilAccount(Account):
    def __init__(self, owner, balance, factor):
        super().__init__(owner, balance)
        self.factor = factor

    def inquiry(self):
        if random.randint(0, 4) == 1:
            return self.factor * super().inquiry()
        else:
            return super().inquiry()

A tricky issue with adding attributes is dealing with the existing __init__() method. In this example, we define a new version of __init__() that includes our additional instance variable factor. However, when __init__() is redefined, it is the responsibility of the child to initialize its parent using super().__init__() as shown. If you forget to do this, you’ll end up with a half-initialized object and everything will break. Since initialization of the parent requires additional arguments, those still must be passed to the child __init__() method.
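To illustrate the failure mode, here is a sketch with a hypothetical BrokenAccount that forgets to call super().__init__(). The instance is only half-initialized, and the breakage surfaces later as an AttributeError:

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def inquiry(self):
        return self.balance

class BrokenAccount(Account):
    def __init__(self, owner, balance, factor):
        # Bug: forgot to call super().__init__(owner, balance)
        self.factor = factor

b = BrokenAccount('Guido', 1000.0, 1.10)
try:
    b.inquiry()
except AttributeError as e:
    print(e)    # 'BrokenAccount' object has no attribute 'balance'
```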

Inheritance can break code in subtle ways. Consider the __repr__() method of the Account class:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        return f'Account({self.owner!r}, {self.balance!r})'

The purpose of this method is to assist in debugging by making nice output. However, the method is hardcoded to use the name Account. If you start using inheritance, you’ll find that the output is wrong:

>>> class EvilAccount(Account):
...     pass
...
>>> a = EvilAccount('Eva', 10.0)
>>> a
Account('Eva', 10.0)     # Notice misleading output
>>> type(a)
<class 'EvilAccount'>
>>>

To fix this, you need to modify the __repr__() method to use the proper type name. For example:

class Account:
    ...
    def __repr__(self):
        return f'{type(self).__name__}({self.owner!r}, {self.balance!r})'

Now you’ll see more accurate output. Inheritance is not used with every class, but if it’s an anticipated use case of the class you’re writing, you need to pay attention to details like this. As a general rule, avoid the hardcoding of class names.

Inheritance establishes a relationship in the type system where any child class will type-check as the parent class. For example:

>>> a = EvilAccount('Eva', 10)
>>> type(a)
<class 'EvilAccount'>
>>> isinstance(a, Account)
True
>>>

This is the so-called “is a” relationship: EvilAccount is an Account. Sometimes, the “is a” inheritance relationship is used to define object type ontologies or taxonomies. For example:

class Food:
    pass

class Sandwich(Food):
    pass

class RoastBeef(Sandwich):
    pass

class GrilledCheese(Sandwich):
    pass

class Taco(Food):
    pass

In practice, organizing objects in this manner can be quite difficult and fraught with peril. Suppose you wanted to add a HotDog class to the above hierarchy. Where does it go? Given that a hot dog has a bun, you might be inclined to make it a subclass of Sandwich. However, based on the overall curved shape of the bun with a tasty filling inside, perhaps a hot dog is really more like a Taco. Maybe you decide to make it a subclass of both:

class HotDog(Sandwich, Taco):
    pass

At this point, everyone’s head is exploding and the office is embroiled in a heated argument. This might be a good time to mention that Python supports multiple inheritance. To use it, list more than one class as a parent. The resulting child class will inherit all of the combined features of the parents. See Section 7.19 for more about multiple inheritance.
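With multiple parents, Python computes a method resolution order (MRO) that determines the order in which classes are searched for attributes. You can inspect it via the __mro__ attribute:

```python
class Food:
    pass

class Sandwich(Food):
    pass

class Taco(Food):
    pass

class HotDog(Sandwich, Taco):
    pass

# Attributes are searched in MRO order, left to right
print([cls.__name__ for cls in HotDog.__mro__])
# ['HotDog', 'Sandwich', 'Taco', 'Food', 'object']
```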

7.8 Avoiding Inheritance via Composition

One concern with inheritance is what’s known as implementation inheritance. To illustrate, suppose you wanted to make a stack data structure with push and pop operations. A quick way to do that would be to inherit from a list and add a new method to it:

class Stack(list):
    def push(self, item):
        self.append(item)

# Example
s = Stack()
s.push(1)
s.push(2)
s.push(3)
s.pop()     # -> 3
s.pop()     # -> 2

Sure enough, this data structure works like a stack, but it also has every other feature of lists—insertion, sorting, slice reassignment, and so forth. This is implementation inheritance—you used inheritance to reuse some code upon which you built something else, but you also got a lot of features that aren’t pertinent to the problem actually being solved. Users will probably find the object strange. Why does a stack have methods for sorting?

A better approach is composition. Instead of building a stack by inheriting from a list, you should build a stack as an independent class that happens to have a list contained in it. The fact that there’s a list inside is an implementation detail. For example:

class Stack:
    def __init__(self):
        self._items = list()

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)

# Example use
s = Stack()
s.push(1)
s.push(2)
s.push(3)
s.pop()    # -> 3
s.pop()    # -> 2

This object works exactly the same as before, but it’s focused exclusively on just being a stack. There are no extraneous list methods or nonstack features. There is much more clarity of purpose.

A slight extension of this implementation might accept the internal list class as an optional argument:

class Stack:
    def __init__(self, *, container=None):
        if container is None:
            container = list()
        self._items = container

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)

One benefit of this approach is that it promotes loose coupling of components. For example, you might want to make a stack that stores its elements in a typed array instead of a list. Here’s how you could do it:

import array

s = Stack(container=array.array('i'))
s.push(42)
s.push(23)
s.push('a lot')     # TypeError.

This is also an example of what’s known as dependency injection. Instead of hardwiring Stack to depend on list, you can make it depend on any container a user decides to pass in, provided it implements the required interface.
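For instance, a collections.deque also implements append() and pop(), so it works as a drop-in container without Stack knowing anything about it:

```python
from collections import deque

class Stack:
    def __init__(self, *, container=None):
        if container is None:
            container = list()
        self._items = container

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)

# A deque satisfies the same append()/pop() interface as a list
s = Stack(container=deque())
s.push(1)
s.push(2)
print(s.pop())    # 2
print(len(s))     # 1
```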

More broadly, making the internal list a hidden implementation detail is related to the problem of data abstraction. Perhaps you later decide that you don’t even want to use a list. The above design makes that easy to change. For example, if you change the implementation to use linked tuples as follows, the users of Stack won’t even notice:

class Stack:
    def __init__(self):
        self._items = None
        self._size = 0

    def push(self, item):
        self._items = (item, self._items)
        self._size += 1

    def pop(self):
        (item, self._items) = self._items
        self._size -= 1
        return item

    def __len__(self):
        return self._size

To decide whether to use inheritance or not, you should step back and ask yourself if the object you’re building is a specialized version of the parent class or if you are merely using it as a component in building something else. If it’s the latter, don’t use inheritance.

7.9 Avoiding Inheritance via Functions

Sometimes you might find yourself writing classes with just a single method that needs to be customized. For example, maybe you wrote a data parsing class like this:

class DataParser:
    def parse(self, lines):
        records = []
        for line in lines:
            row = line.split(',')
            record = self.make_record(row)
            records.append(record)
        return records

    def make_record(self, row):
        raise NotImplementedError()

class PortfolioDataParser(DataParser):
    def make_record(self, row):
        return {
           'name': row[0],
           'shares': int(row[1]),
           'price': float(row[2])
        }

parser = PortfolioDataParser()
data = parser.parse(open('portfolio.csv'))

There is too much plumbing going on here. If you’re writing a lot of single-method classes, consider using functions instead. For example:

def parse_data(lines, make_record):
    records = []
    for line in lines:
        row = line.split(',')
        record = make_record(row)
        records.append(record)
    return records

def make_dict(row):
    return {
        'name': row[0],
        'shares': int(row[1]),
        'price': float(row[2])
    }

data = parse_data(open('portfolio.csv'), make_dict)

This code is much simpler and just as flexible, plus simple functions are easier to test. If there’s a need to expand it into classes, you can always do it later. Premature abstraction is often not a good thing.

7.10 Dynamic Binding and Duck Typing

Dynamic binding is the runtime mechanism that Python uses to find the attributes of objects. This is what allows Python to work with instances without regard for their type. In Python, variable names do not have an associated type. Thus, the attribute binding process is independent of what kind of object obj is. If you make a lookup, such as obj.name, it will work on any obj whatsoever that happens to have a name attribute. This behavior is sometimes referred to as duck typing—in reference to the adage “if it looks like a duck, quacks like a duck, and walks like a duck, then it’s a duck.”

Python programmers often write programs that rely on this behavior. For example, if you want to make a customized version of an existing object, you can either inherit from it, or you can create a completely new object that looks and acts like it but is otherwise unrelated. This latter approach is often used to maintain loose coupling of program components. For example, code may be written to work with any kind of object whatsoever as long as it has a certain set of methods. One of the most common examples is with various iterable objects defined in the standard library. There are all sorts of objects that work with the for loop to produce values—lists, files, generators, strings, and so on. However, none of these inherit from any kind of special Iterable base class. They merely implement the methods required to perform iteration—and it all works.
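As a sketch, here is a hypothetical Countdown class. It implements __iter__() and nothing else, yet it works with the for loop exactly like the built-in iterables do:

```python
# Hypothetical iterable: no special base class is inherited
class Countdown:
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

# Implementing the iteration protocol is all that's required
for x in Countdown(3):
    print(x)                 # 3, 2, 1

print(list(Countdown(3)))    # [3, 2, 1]
```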

7.11 The Danger of Inheriting from Built-in Types

Python allows inheritance from built-in types. However, doing so invites danger. As an example, if you decide to subclass dict in order to force all keys to be uppercase, you might redefine the __setitem__() method like this:

class udict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

Indeed, it initially even seems to work:

>>> u = udict()
>>> u['name'] = 'Guido'
>>> u['number'] = 37
>>> u
{'NAME': 'Guido', 'NUMBER': 37}
>>>

Further use, however, reveals that it only seems to work. In fact, it doesn’t really seem to work at all:

>>> u = udict(name='Guido', number=37)
>>> u
{'name': 'Guido', 'number': 37}
>>> u.update(color='blue')
>>> u
{'name': 'Guido', 'number': 37, 'color': 'blue'}
>>>

At issue here is the fact that Python’s built-in types aren’t implemented like normal Python classes—they’re implemented in C. Most of the methods operate in the world of C. For example, the dict.update() method directly manipulates the dictionary data without ever routing through the redefined __setitem__() method in your custom udict class above.

The collections module has special classes UserDict, UserList, and UserString that can be used to make safe subclasses of dict, list, and str types. For example, you’ll find that this solution works a lot better:

from collections import UserDict

class udict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

Here’s an example of this new version in action:

>>> u = udict(name='Guido', num=37)
>>> u.update(color='Blue')
>>> u
{'NAME': 'Guido', 'NUM': 37, 'COLOR': 'Blue'}
>>> v = udict(u)
>>> v['title'] = 'BDFL'
>>> v
{'NAME': 'Guido', 'NUM': 37, 'COLOR': 'Blue', 'TITLE': 'BDFL'}
>>>

Most of the time, subclassing a built-in type can be avoided. For example, when building new containers, it is probably better to make a new class, as was shown for the Stack class in Section 7.8. If you really do need to subclass a built-in, it might be a lot more work than you think.

7.12 Class Variables and Methods

In a class definition, all functions are assumed to operate on an instance, which is always passed as the first parameter self. However, the class itself is also an object that can carry state and be manipulated as well. As an example, you could track how many instances have been created using a class variable num_accounts:

class Account:
    num_accounts = 0

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance
        Account.num_accounts += 1

    def __repr__(self):
        return f'{type(self).__name__}({self.owner!r}, {self.balance!r})'

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.deposit(-amount)    # Must use self.deposit()

    def inquiry(self):
        return self.balance

Class variables are defined outside the normal __init__() method. To modify them, use the class, not self. For example:

>>> a = Account('Guido', 1000.0)
>>> b = Account('Eva', 10.0)
>>> Account.num_accounts
2
>>>

It’s a bit unusual, but class variables can also be accessed via instances. For example:

>>> a.num_accounts
2
>>> c = Account('Ben', 50.0)
>>> Account.num_accounts
3
>>> a.num_accounts
3
>>>

This works because attribute lookup on instances checks the associated class if there’s no matching attribute on the instance itself. This is the same mechanism by which Python normally finds methods.
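One consequence of this lookup order is that assigning through an instance does not modify the class variable. Instead, it creates an instance attribute that shadows it. A sketch:

```python
class Account:
    num_accounts = 0

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance
        Account.num_accounts += 1

a = Account('Guido', 1000.0)
print(vars(a))                 # {'owner': 'Guido', 'balance': 1000.0} (no num_accounts)
print(a.num_accounts)          # 1 (found on the class)

# Assignment through the instance creates a shadowing instance attribute
a.num_accounts = 42
print(a.num_accounts)          # 42 (instance attribute)
print(Account.num_accounts)    # 1  (class variable unchanged)
```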

It is also possible to define what’s known as a class method. A class method is a method applied to the class itself, not to instances. A common use of class methods is to define alternate instance constructors. For example, suppose there was a requirement to create Account instances from a legacy enterprise-grade input format:

data = '''
<account>
    <owner>Guido</owner>
    <amount>1000.0</amount>
</account>
'''

To do that, you can write a @classmethod like this:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    @classmethod
    def from_xml(cls, data):
        from xml.etree.ElementTree import XML
        doc = XML(data)
        return cls(doc.findtext('owner'),
                   float(doc.findtext('amount')))

# Example use

data = '''
<account>
    <owner>Guido</owner>
    <amount>1000.0</amount>
</account>
'''
a = Account.from_xml(data)

The first argument of a class method is always the class itself. By convention, this argument is often named cls. In this example, cls is set to Account. If the purpose of a class method is to create a new instance, explicit steps must be taken to do so. In the final line of the example, the call cls(..., ...) is the same as calling Account(..., ...) on the two arguments.

The fact that the class is passed as argument solves an important problem related to inheritance. Suppose you define a subclass of Account and now want to create an instance of that class. You’ll find that it still works:

class EvilAccount(Account):
    pass

e = EvilAccount.from_xml(data)    # Creates an 'EvilAccount'

The reason this code works is that EvilAccount is now passed as cls. Thus, the last statement of the from_xml() class method now creates an EvilAccount instance.

Class variables and class methods are sometimes used together to configure and control how instances operate. As another example, consider the following Date class:

import time

class Date:
    datefmt = '{year}-{month:02d}-{day:02d}'
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    def __str__(self):
        return self.datefmt.format(year=self.year,
                                   month=self.month,
                                   day=self.day)

    @classmethod
    def from_timestamp(cls, ts):
        tm = time.localtime(ts)
        return cls(tm.tm_year, tm.tm_mon, tm.tm_mday)

    @classmethod
    def today(cls):
        return cls.from_timestamp(time.time())

This class features a class variable datefmt that adjusts output from the __str__() method. This is something that can be customized via inheritance:

class MDYDate(Date):
    datefmt = '{month}/{day}/{year}'

class DMYDate(Date):
    datefmt = '{day}/{month}/{year}'

# Example
a = Date(1967, 4, 9)
print(a)       # 1967-04-09

b = MDYDate(1967, 4, 9)
print(b)       # 4/9/1967

c = DMYDate(1967, 4, 9)
print(c)      # 9/4/1967

Configuration via class variables and inheritance like this is a common tool for adjusting the behavior of instances. The use of class methods is critical to making it work since they ensure that the proper kind of object gets created. For example:

a = MDYDate.today()
b = DMYDate.today()
print(a)      # 2/13/2019
print(b)      # 13/2/2019

Alternate construction of instances is, by far, the most common use of class methods. A common naming convention for such methods is to include the word from_ as a prefix, such as from_timestamp(). You will see this naming convention used in class methods throughout the standard library and in third-party packages. For example, dictionaries have a class method for creating a preinitialized dictionary from a set of keys:

>>> dict.fromkeys(['a','b','c'], 0)
{'a': 0, 'b': 0, 'c': 0}
>>>

One caution about class methods is that Python does not manage them in a namespace separate from the instance methods. As a result, they can still be invoked on an instance. For example:

d = Date(1967,4,9)
b = d.today()          # Calls Date.today(Date)

This is potentially quite confusing because a call to d.today() doesn’t really have anything to do with the instance d. Yet, you might see today() listed as a valid method on Date instances in your IDE and in documentation.

7.13 Static Methods

Sometimes a class is merely used as a namespace for functions declared as static methods using @staticmethod. Unlike a normal method or class method, a static method does not take an extra self or cls argument. A static method is just an ordinary function that happens to be defined inside a class. For example:

class Ops:
    @staticmethod
    def add(x, y):
        return x + y

    @staticmethod
    def sub(x, y):
        return x - y

You don’t normally create instances of such a class. Instead, call the functions directly through the class:

a = Ops.add(2, 3)        # a = 5
b = Ops.sub(4, 5)        # b = -1

Sometimes other classes will use a collection of static methods like this to implement “swappable” or “configurable” behavior, or as something that loosely mimics the behavior of a module. Consider the use of inheritance in the earlier Account example:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __repr__(self):
        return f'{type(self).__name__}({self.owner!r}, {self.balance!r})'

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def inquiry(self):
        return self.balance

import random

# A special "Evil" account
class EvilAccount(Account):
    def deposit(self, amount):
        self.balance += 0.95 * amount

    def inquiry(self):
        if random.randint(0,4) == 1:
           return 1.10 * self.balance
        else:
           return self.balance

The use of inheritance here is a little strange. It introduces two different kinds of objects, Account and EvilAccount. There is also no obvious way to change an existing Account instance into an EvilAccount or back, because this involves changing the instance type. Perhaps it’s better to have evil manifest as a kind of account policy instead. Here is an alternate formulation of Account that does that with static methods:

import random

class StandardPolicy:
    @staticmethod
    def deposit(account, amount):
        account.balance += amount

    @staticmethod
    def withdraw(account, amount):
        account.balance -= amount

    @staticmethod
    def inquiry(account):
        return account.balance

class EvilPolicy(StandardPolicy):
    @staticmethod
    def deposit(account, amount):
        account.balance += 0.95*amount

    @staticmethod
    def inquiry(account):
        if random.randint(0,4) == 1:
           return 1.10 * account.balance
        else:
           return account.balance

class Account:
    def __init__(self, owner, balance, *, policy=StandardPolicy):
        self.owner = owner
        self.balance = balance
        self.policy = policy

    def __repr__(self):
        return f'Account({self.policy}, {self.owner!r}, {self.balance!r})'

    def deposit(self, amount):
        self.policy.deposit(self, amount)

    def withdraw(self, amount):
        self.policy.withdraw(self, amount)

    def inquiry(self):
        return self.policy.inquiry(self)

In this reformulation, there is only one type of instance that gets created, Account. However, it has a special policy attribute that provides the implementation of various methods. If needed, the policy can be dynamically changed on an existing Account instance:

>>> a = Account('Guido', 1000.0)
>>> a.policy
<class 'StandardPolicy'>
>>> a.deposit(500)
>>> a.inquiry()
1500.0
>>> a.policy = EvilPolicy
>>> a.deposit(500)
>>> a.inquiry()      # Could randomly be 1.10x more
1975.0
>>>

One reason why @staticmethod makes sense here is that there is no need to create instances of StandardPolicy or EvilPolicy. The main purpose of these classes is to organize a bundle of methods, not to store additional instance data that’s related to Account. Still, the loosely coupled nature of Python could certainly allow a policy to be upgraded to hold its own data. Change the static methods to normal instance methods like this:

class EvilPolicy(StandardPolicy):
    def __init__(self, deposit_factor, inquiry_factor):
        self.deposit_factor = deposit_factor
        self.inquiry_factor = inquiry_factor

    def deposit(self, account, amount):
        account.balance += self.deposit_factor * amount

    def inquiry(self, account):
        if random.randint(0,4) == 1:
           return self.inquiry_factor * account.balance
        else:
           return account.balance

# Example use
a = Account('Guido', 1000.0, policy=EvilPolicy(0.95, 1.10))

This approach of delegating methods to supporting classes is a common implementation strategy for state machines and similar objects. Each operational state can be encapsulated into its own class of methods (often static). A mutable instance variable, such as the policy attribute in this example, can then be used to hold implementation-specific details related to the current operational state.
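As a minimal sketch of that state-machine idea (the Connection, OpenState, and ClosedState names here are hypothetical, not part of the Account example), each operational state can be a class of static methods, with a mutable _state attribute selecting the active one:

```python
class ClosedState:
    @staticmethod
    def send(conn, data):
        raise RuntimeError('Connection not open')

class OpenState:
    @staticmethod
    def send(conn, data):
        conn.sent.append(data)

class Connection:
    def __init__(self):
        self.sent = []
        self._state = ClosedState    # Current operational state

    def open(self):
        self._state = OpenState

    def close(self):
        self._state = ClosedState

    def send(self, data):
        # Delegate to whatever state class is currently installed
        self._state.send(self, data)
```

Reassigning self._state swaps the behavior of send() on an existing instance, just as reassigning the policy attribute did for Account.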

7.14 A Word about Design Patterns

In writing object-oriented programs, programmers sometimes get fixated on implementing named design patterns—such as the “strategy pattern,” the “flyweight pattern,” the “singleton pattern,” and so forth. Many of these originate from the famous Design Patterns book by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

If you are familiar with such patterns, the general design principles used in other languages can certainly be applied to Python. However, many of these documented patterns are aimed at working around specific issues that arise from the strict static type system of C++ or Java. The dynamic nature of Python renders many of these patterns obsolete, overkill, or simply unnecessary.

That said, there are a few overarching principles of writing good software—such as striving to write code that is debuggable, testable, and extensible. Basic strategies such as writing classes with useful __repr__() methods, preferring composition over inheritance, and allowing dependency injection can go a long way towards these goals. Python programmers also like to work with code that can be said to be “Pythonic.” Usually, that means that objects obey various built-in protocols, such as iteration, containers, or context management. For example, instead of trying to implement some exotic data traversal pattern from a Java programming book, a Python programmer would probably implement it with a generator function feeding a for loop, or just replace the entire pattern with a few dictionary lookups.

7.15 Data Encapsulation and Private Attributes

In Python, all attributes and methods of a class are public, that is, accessible without any restrictions. This is often undesirable in object-oriented applications that have reasons to hide or encapsulate internal implementation details.

To address this problem, Python relies on naming conventions as a means of signaling intended usage. One such convention is that names starting with a single leading underscore (_) indicate internal implementation. For example, here’s a version of the Account class where the balance has been turned into a “private” attribute:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance

    def __repr__(self):
        return f'Account({self.owner!r}, {self._balance!r})'

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        self._balance -= amount

    def inquiry(self):
        return self._balance

In this code, the _balance attribute is meant to be an internal detail. There’s nothing that prevents a user from accessing it directly, but the leading underscore is a strong indicator that a user should look for a more public-facing interface—such as the Account.inquiry() method.

A grey area is whether or not internal attributes are available for use in a subclass. For example, is the previous inheritance example allowed to directly access the _balance attribute of its parent?

class EvilAccount(Account):
    def inquiry(self):
        if random.randint(0,4) == 1:
           return 1.10 * self._balance
        else:
           return self._balance

As a general rule, this is considered acceptable in Python. IDEs and other tools are likely to expose such attributes. If you come from C++, Java, or another similar object-oriented language, consider _balance similar to a “protected” attribute.

If you wish to have an even more private attribute, prefix the name with two leading underscores ( __ ). All names such as __name are automatically renamed into a new name of the form _Classname__name. This ensures that private names used in a superclass won’t be overwritten by identical names in a child class. Here’s an example that illustrates this behavior:

class A:
    def __init__(self):
        self.__x = 3        # Mangled to self._A__x

    def __spam(self):       # Mangled to _A__spam()
        print('A.__spam', self.__x)

    def bar(self):
        self.__spam()       # Only calls A.__spam()

class B(A):
    def __init__(self):
        A.__init__(self)
        self.__x = 37       # Mangled to self._B__x

    def __spam(self):       # Mangled to _B__spam()
        print('B.__spam', self.__x)

    def grok(self):
        self.__spam()       # Calls B.__spam()

In this example, there are two different assignments to an __x attribute. In addition, it appears that class B is trying to override the __spam() method via inheritance. Yet this is not the case. Name mangling causes unique names to be used for each definition. Try the following example:

>>> b = B()
>>> b.bar()
A.__spam 3
>>> b.grok()
B.__spam 37
>>>

You can see the mangled names more directly if you look at the underlying instance variables:

>>> vars(b)
{ '_A__x': 3, '_B__x': 37 }
>>> b._A__spam()
A.__spam 3
>>> b._B__spam()
B.__spam 37
>>>

Although this scheme provides the illusion of data hiding, there’s no mechanism in place to actually prevent access to the “private” attributes of a class. In particular, if the names of the class and the corresponding private attribute are known, they can still be accessed using the mangled name. If such access to private attributes is still a concern, you might consider a more painful code review process.

At first glance, name mangling might look like an extra processing step. However, the mangling process actually only occurs once when the class is defined. It does not occur during execution of the methods, nor does it add extra overhead to program execution. Be aware that name mangling does not occur in functions such as getattr(), hasattr(), setattr(), or delattr() where the attribute name is specified as a string. For these functions, you would need to explicitly use the mangled name such as '_Classname__name' to access the attribute.
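For instance, using a class like the A defined above, string-based attribute access only succeeds if the fully mangled name is spelled out:

```python
class A:
    def __init__(self):
        self.__x = 3        # Stored on the instance as _A__x

a = A()
assert not hasattr(a, '__x')       # No mangling for string-based names
assert getattr(a, '_A__x') == 3    # The mangled name must be given explicitly
```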

In practice, it’s probably best to not overthink the privacy of names. The single-underscore names are quite common; double-underscore names less so. Although you can take further steps to try and truly hide attributes, the extra effort and added complexity is hardly worth the benefits gained. Perhaps the most useful thing is to remember that if you see leading underscores on a name, it’s almost certainly some kind of internal detail best left alone.

7.16 Type Hinting

Attributes of user-defined classes have no constraints on their type or value. In fact, you can set an attribute to anything that you want. For example:

>>> a = Account('Guido', 1000.0)
>>> a.owner
'Guido'
>>> a.owner = 37
>>> a.owner
37
>>> b = Account('Eva', 'a lot')
>>> b.deposit(' more')
>>> b.inquiry()
'a lot more'
>>>

If this is a practical concern, there are a few possible solutions. One is easy—don’t do that! Another is to rely upon external tooling such as linters and type checkers. For this, classes allow optional type hints to be specified for selected attributes. For example:

class Account:
    owner: str            # Type hint
    _balance: float       # Type hint

    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance
    ...

The inclusion of type hints changes nothing about the actual runtime behavior of a class—that is, no extra checking takes place and nothing prevents a user from setting bad values in their code. However, the hints might give users more useful information in their editor, thus preventing careless usage errors before they happen.

In practice, accurate type hinting can be difficult. For example, does the Account class allow someone to use an int instead of a float? Or what about a Decimal? You’ll find that all of these work even though the hint might seem to suggest otherwise.

from decimal import Decimal

a = Account('Guido', Decimal('1000.0'))
a.withdraw(Decimal('50.0'))
print(a.inquiry())      # -> 950.0

Knowing how to properly organize types in such situations is beyond the scope of this book. When in doubt, it’s probably better not to guess unless you are actively using tools that type-check your code.

7.17 Properties

As noted in the previous section, Python places no runtime restrictions on attribute values or types. However, such enforcement is possible if you put an attribute under the management of a so-called property. A property is a special kind of attribute that intercepts attribute access and handles it via user-defined methods. These methods have complete freedom to manage the attribute as they see fit. Here is an example:

import string
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance

    @property
    def owner(self):
        return self._owner

    @owner.setter
    def owner(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected str')
        if not all(c in string.ascii_uppercase for c in value):
            raise ValueError('Must be uppercase ASCII')
        if len(value) > 10:
            raise ValueError('Must be 10 characters or less')
        self._owner = value

Here, the owner attribute is being constrained to a very enterprise-grade 10-character uppercase ASCII string. Here is how it works when you try to use the class:

>>> a = Account('GUIDO', 1000.0)
>>> a.owner = 'EVA'
>>> a.owner = 42
Traceback (most recent call last):
...
TypeError: Expected str
>>> a.owner = 'Carol'
Traceback (most recent call last):
...
ValueError: Must be uppercase ASCII
>>> a.owner = 'RENÉE'
Traceback (most recent call last):
...
ValueError: Must be uppercase ASCII
>>> a.owner = 'RAMAKRISHNAN'
Traceback (most recent call last):
...
ValueError: Must be 10 characters or less
>>>

The @property decorator is used to establish an attribute as a property. In this example, it’s being applied to the owner attribute. This decorator is always first applied to a method that gets the attribute value. In this case, the method is returning the actual value which is being stored in the private attribute _owner. The @owner.setter decorator that follows is used to optionally implement a method for setting the attribute value. This method performs the various type and value checks before storing the value in the private _owner attribute.

A critical feature of properties is that the associated name, such as owner in the example, becomes “magical.” That is, any use of that attribute automatically routes through the getter/setter methods that you implemented. You don’t have to change any preexisting code to make this work. For example, no changes need to be made to the Account.__init__() method. This may surprise you because __init__() makes the assignment self.owner = owner instead of using the private attribute self._owner. This is by design—the whole point of the owner property was to validate attribute values. You’d definitely want to do that when instances are created. You’ll find that it works exactly as intended:

>>> a = Account('Guido', 1000.0)
Traceback (most recent call last):
  File "account.py", line 5, in __init__
    self.owner = owner
  File "account.py", line 15, in owner
    raise ValueError('Must be uppercase ASCII')
ValueError: Must be uppercase ASCII
>>>

Since each access to a property attribute automatically invokes a method, the actual value needs to be stored under a different name. This is why _owner is used inside the getter and setter methods. You can’t use owner as the storage location because doing so would cause infinite recursion.
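To illustrate, here is a deliberately broken variant (a hypothetical Broken class, not part of the Account example) where the property stores into its own name:

```python
class Broken:
    def __init__(self, owner):
        self.owner = owner       # Invokes the setter below

    @property
    def owner(self):
        return self.owner        # Bug: reads the property again, not _owner

    @owner.setter
    def owner(self, value):
        self.owner = value       # Bug: writes the property again

# Broken('Guido') fails with RecursionError
```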

In general, properties allow for the interception of any specific attribute name. You can implement methods for getting, setting, or deleting the attribute value. For example:

class SomeClass:
    @property
    def attr(self):
        print('Getting')

    @attr.setter
    def attr(self, value):
        print('Setting', value)

    @attr.deleter
    def attr(self):
        print('Deleting')

# Example
s = SomeClass()
s.attr         # Getting
s.attr = 13    # Setting
del s.attr     # Deleting

It is not necessary to implement all parts of a property. In fact, it’s common to use properties for implementing read-only computed data attributes. For example:

class Box(object):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        return self.width * self.height

    @property
    def perimeter(self):
        return 2*self.width + 2*self.height

# Example use
b = Box(4, 5)
print(b.area)       # -> 20
print(b.perimeter)  # -> 18
b.area = 5          # Error: can't set attribute

One thing to think about when defining a class is making the programming interface to it as uniform as possible. Without properties, some values would be accessed as simple attributes such as b.width or b.height whereas other values would be accessed as methods such as b.area() and b.perimeter(). Keeping track of when to add the extra () creates unnecessary confusion. A property can help fix this.

Python programmers don’t often realize that methods themselves are implicitly handled as a kind of property. Consider this class:

class SomeClass:
    def yow(self):
        print('Yow!')

When a user creates an instance such as s = SomeClass() and then accesses s.yow, the original function object yow is not returned. Instead, you get a bound method like this:

>>> s = SomeClass()
>>> s.yow
<bound method SomeClass.yow of <__main__.SomeClass object at 0x10e2572b0>>
>>>

How did this happen? It turns out that functions behave a lot like properties when they’re placed in a class. Specifically, functions magically intercept attribute access and create the bound method behind the scenes. When you define static and class methods using @staticmethod and @classmethod, you are actually altering this process. @staticmethod returns the method function back “as is” without any special wrapping or processing. More information about this process is covered later in Section 7.28.
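You can observe this difference directly. Accessing yow through the class yields the plain function; accessing it through an instance yields a bound method that already carries self:

```python
class SomeClass:
    def yow(self):
        return 'Yow!'

s = SomeClass()
f = SomeClass.yow       # Plain function; self must be supplied explicitly
m = s.yow               # Bound method; self is already attached

assert f(s) == 'Yow!'
assert m() == 'Yow!'
assert m.__func__ is f  # The bound method wraps the original function
assert m.__self__ is s
```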

7.18 Types, Interfaces, and Abstract Base Classes

When you create an instance of a class, the type of that instance is the class itself. To test for membership in a class, use the built-in function isinstance(obj, cls). This function returns True if an object, obj, belongs to the class cls or any class derived from cls. Here’s an example:

class A:
    pass

class B(A):
    pass

class C:
    pass

a = A()           # Instance of 'A'
b = B()           # Instance of 'B'
c = C()           # Instance of 'C'

type(a)           # Returns the class object A
isinstance(a, A)  # Returns True
isinstance(b, A)  # Returns True, B derives from A
isinstance(b, C)  # Returns False, B not derived from C

Similarly, the built-in function issubclass(A, B) returns True if the class A is a subclass of class B. Here’s an example:

issubclass(B, A)   # Returns True
issubclass(C, A)   # Returns False

A common use of class typing relations is the specification of programming interfaces. As an example, a top-level base class might be implemented to specify the requirements of a programming interface. That base class might then be used for type hinting or for defensive type enforcement via isinstance():

class Stream:
    def receive(self):
        raise NotImplementedError()

    def send(self, msg):
        raise NotImplementedError()

    def close(self):
        raise NotImplementedError()

# Example.
def send_request(stream, request):
    if not isinstance(stream, Stream):
        raise TypeError('Expected a Stream')
    stream.send(request)
    return stream.receive()

The expectation with such code is not that Stream be used directly. Instead, different classes would inherit from Stream and implement the required functionality. A user would instantiate one of those classes instead. For example:

class SocketStream(Stream):
    def receive(self):
        ...

    def send(self, msg):
        ...

    def close(self):
        ...

class PipeStream(Stream):
    def receive(self):
        ...

    def send(self, msg):
        ...

    def close(self):
        ...

# Example
s = SocketStream()
send_request(s, request)

Worth discussing in this example is the runtime enforcement of the interface in send_request(). Should one use a type hint instead?

# Specifying an interface as a type hint
def send_request(stream: Stream, request):
    stream.send(request)
    return stream.receive()

Given that type hints aren’t enforced, the decision of how to validate an argument against an interface really depends on when you want it to happen—at runtime, as a code checking step, or not at all.

This use of interface classes is more common in the organization of large frameworks and applications. However, with this approach you need to make sure that subclasses actually implement the required interface. For example, if a subclass chose not to implement one of the required methods or had a simple misspelling, the effects might go unnoticed at first as the code might still work in the common case. However, later on, the program will crash if the unimplemented method is invoked. Naturally, this would only take place at 3:30 AM in production.

To prevent this problem, it is common for interfaces to be defined as abstract base classes using the abc module. This module defines a base class (ABC) and a decorator (@abstractmethod) that are used together to describe an interface. Here is an example:

from abc import ABC, abstractmethod

class Stream(ABC):
    @abstractmethod
    def receive(self):
        pass

    @abstractmethod
    def send(self, msg):
        pass

    @abstractmethod
    def close(self):
        pass

An abstract class is not meant to be instantiated directly. In fact, if you try to create a Stream instance, you’ll get an error:

>>> s = Stream()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Stream with abstract methods close, receive, send
>>>

The error message tells you exactly what methods need to be implemented by a Stream. This serves as a guide for writing subclasses. Suppose you write a subclass but make a mistake:

class SocketStream(Stream):
    def read(self):         # Misnamed
        ...

    def send(self, msg):
        ...

    def close(self):
        ...

An abstract base class will catch the mistake upon instantiation. This is useful because errors are caught early.

>>> s = SocketStream()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class SocketStream with abstract methods receive
>>>

Although an abstract class cannot be instantiated, it can define methods and properties for use in subclasses. Moreover, an abstract method in the base can still be called from a subclass. For example, calling super().receive() from a subclass is allowed.
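As a simplified sketch of that last point (assuming, for brevity, a Stream interface with only one abstract method), an abstract method can carry a default implementation that subclasses reach via super():

```python
from abc import ABC, abstractmethod

class Stream(ABC):
    @abstractmethod
    def receive(self):
        return 'default'         # Abstract methods may still contain code

class SocketStream(Stream):
    def receive(self):
        # A subclass may delegate to the abstract base implementation
        return super().receive() + ' data'

s = SocketStream()
assert s.receive() == 'default data'
```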

7.19 Multiple Inheritance, Interfaces, and Mixins

Python supports multiple inheritance. If a child class lists more than one parent, the child inherits all of the features of the parents. For example:

class Duck:
    def walk(self):
        print('Waddle')

class Trombonist:
    def noise(self):
        print('Blat!')

class DuckBonist(Duck, Trombonist):
    pass

d = DuckBonist()
d.walk()        # -> Waddle
d.noise()       # -> Blat!

Conceptually, it’s a neat idea, but then practical realities start to set in. For example, what happens if Duck and Trombonist each define an __init__() method? Or if they both define a noise() method? Suddenly, you start to realize that multiple inheritance is fraught with peril.

To better understand the actual usage of multiple inheritance, step back and view it as a highly specialized tool for organization and code reuse—as opposed to a general-purpose programming technique. Specifically, taking a collection of arbitrary unrelated classes and combining them together with multiple inheritance to create weird mutant duck-musicians isn’t standard practice. Don’t ever do that.

A more common use of multiple inheritance is organizing type and interface relations. For example, the last section introduced the concept of an abstract base class. The purpose of an abstract base is to specify a programming interface. For example, you might have various abstract classes like this:

from abc import ABC, abstractmethod

class Stream(ABC):
    @abstractmethod
    def receive(self):
        pass

    @abstractmethod
    def send(self, msg):
        pass

    @abstractmethod
    def close(self):
        pass

class Iterable(ABC):
    @abstractmethod
    def __iter__(self):
        pass

With these classes, multiple inheritance might be used to specify which interfaces have been implemented by a child class:

class MessageStream(Stream, Iterable):
    def receive(self):
        ...
    def send(self, msg):
        ...
    def close(self):
        ...
    def __iter__(self):
        ...

Again, this use of multiple inheritance is not about implementation, but type relations. For example, none of the inherited methods even do anything in this example. There is no code reuse. Mainly, the inheritance relationship allows you to perform type checks like this:

m = MessageStream()

isinstance(m, Stream)     # -> True
isinstance(m, Iterable)   # -> True

The other use of multiple inheritance is to define mixin classes. A mixin class is a class that modifies or extends the functionality of other classes. Consider the following class definitions:

class Duck:
    def noise(self):
        return 'Quack'

    def waddle(self):
        return 'Waddle'

class Trombonist:
    def noise(self):
        return 'Blat!'

    def march(self):
        return 'Clomp'

class Cyclist:
    def noise(self):
        return 'On your left!'

    def pedal(self):
        return 'Pedaling'

These classes are completely unrelated to each other. There is no inheritance relationship and they implement different methods. However, there is a shared commonality in that they each define a noise() method. Using that as a guide, you could define the following classes:

class LoudMixin:
    def noise(self):
        return super().noise().upper()

class AnnoyingMixin:
    def noise(self):
        return 3*super().noise()

At first glance, these classes look wrong. There’s just a single isolated method and it uses super() to delegate to a nonexistent parent class. The classes don’t even work:

>>> a = AnnoyingMixin()
>>> a.noise()
Traceback (most recent call last):
...
AttributeError: 'super' object has no attribute 'noise'
>>>

These are mixin classes. The only way that they work is in combination with other classes that implement the missing functionality. For example:

class LoudDuck(LoudMixin, Duck):
    pass

class AnnoyingTrombonist(AnnoyingMixin, Trombonist):
    pass

class AnnoyingLoudCyclist(AnnoyingMixin, LoudMixin, Cyclist):
    pass

d = LoudDuck()
d.noise() # -> 'QUACK'

t = AnnoyingTrombonist()
t.noise() # -> 'Blat!Blat!Blat!'

c = AnnoyingLoudCyclist()
c.noise() # -> 'ON YOUR LEFT!ON YOUR LEFT!ON YOUR LEFT!'

Since mixin classes are defined in the same way as normal classes, it is best to include the word “Mixin” as part of the class name. This naming convention provides greater clarity of purpose.

To fully understand mixins, you need to know a bit more about how inheritance and the super() function work.

First, whenever you use inheritance, Python builds a linear chain of classes known as the Method Resolution Order, or MRO for short. This is available as the __mro__ attribute on a class. Here are some examples for single inheritance:

class Base:
    pass

class A(Base):
    pass

class B(A):
    pass

Base.__mro__ # -> (<class 'Base'>, <class 'object'>)
A.__mro__    # -> (<class 'A'>, <class 'Base'>, <class 'object'>)
B.__mro__    # -> (<class 'B'>, <class 'A'>, <class 'Base'>, <class 'object'>)

The MRO specifies the search order for attribute lookup. Specifically, whenever you search for an attribute on an instance or class, each class on the MRO is checked in the order listed. The search stops when the first match is made. The object class is listed in the MRO because all classes inherit from object, whether or not it’s listed as a parent.

To support multiple inheritance, Python implements what’s known as “cooperative multiple inheritance.” With cooperative inheritance, all of the classes are placed on the MRO list according to two primary ordering rules. The first rule states that a child class must always be checked before any of its parents. The second rule states that if a class has multiple parents, those parents must be checked in the same order as they’re written in the inheritance list of the child. For the most part, these rules produce an MRO that makes sense. However, the precise algorithm that orders the classes is actually quite complex and not based on any simple approach such as depth-first or breadth-first search. Instead, the ordering is determined according to the C3 linearization algorithm, which is described in the paper “A Monotonic Superclass Linearization for Dylan” (K. Barrett, et al., presented at OOPSLA’96). A subtle aspect of this algorithm is that certain class hierarchies will be rejected by Python with a TypeError. Here’s an example:

class X: pass
class Y(X): pass
class Z(X,Y): pass  # TypeError.
                    # Can't create consistent MRO

In this case, the method resolution algorithm rejects class Z because it can’t determine an ordering of the base classes that makes sense. Here, the class X appears before class Y in the inheritance list, so it must be checked first. However, class Y inherits from X, so if X is checked first, it violates the rule about children being checked first. In practice, these issues rarely arise—and if they do, it usually indicates a more serious design problem.

As an example of an MRO in practice, here’s the MRO for the AnnoyingLoudCyclist class shown earlier:

class AnnoyingLoudCyclist(AnnoyingMixin, LoudMixin, Cyclist):
    pass

AnnoyingLoudCyclist.__mro__
# (<class 'AnnoyingLoudCyclist'>, <class 'AnnoyingMixin'>,
# <class 'LoudMixin'>, <class 'Cyclist'>, <class 'object'>)

In this MRO, you see how both rules are satisfied. Specifically, any child class is always listed before its parents. The object class is listed last because it is the parent of all other classes. The multiple parents are listed in the order they appeared in the code.

The behavior of the super() function is tied to the underlying MRO. Specifically, its role is to delegate attributes to the next class on the MRO. This is based upon the class where super() is used. For example, when the AnnoyingMixin class uses super(), it looks at the MRO of the instance to find its own position. From there, it delegates attribute lookup to the next class. In this example, using super().noise() in the AnnoyingMixin class invokes LoudMixin.noise(). This is because LoudMixin is the next class listed on the MRO for AnnoyingLoudCyclist. The super().noise() operation in the LoudMixin class then delegates to the Cyclist class. For any use of super(), the choice of the next class varies according to the type of the instance. For example, if you make an instance of AnnoyingTrombonist, then super().noise() will invoke Trombonist.noise() instead.

Designing for cooperative multiple inheritance and mixins is a challenge. Here are some design guidelines. First, child classes are always checked before any base class in the MRO. Thus, it is common for mixins to share a common parent, and for that parent to provide an empty implementation of methods. If multiple mixin classes are used at the same time, they’ll line up after each other. The common parent will appear last where it can provide a default implementation or an error check. For example:

class NoiseMixin:
    def noise(self):
        raise NotImplementedError('noise() not implemented')

class LoudMixin(NoiseMixin):
    def noise(self):
        return super().noise().upper()

class AnnoyingMixin(NoiseMixin):
    def noise(self):
        return 3 * super().noise()

The second guideline is that all implementations of a mixin method should have an identical function signature. One issue with mixins is that they are optional and often mixed together in unpredictable order. For this to work, you must guarantee that operations involving super() succeed regardless of what class comes next. To do that, all methods in the call chain need to have a compatible calling signature.

Finally, you need to make sure that you use super() everywhere. Sometimes you’ll encounter a class that makes a direct call to its parent:

class Base:
    def yow(self):
        print('Base.yow')

class A(Base):
    def yow(self):
        print('A.yow')
        Base.yow(self)       # Direct call to parent

class B(Base):
    def yow(self):
        print('B.yow')
        super().yow()

class C(A, B):
    pass

c = C()
c.yow()
# Outputs:
#    A.yow
#    Base.yow

Such classes are not safe to use with multiple inheritance. Doing so breaks the proper chain of method calls and causes confusion. For instance, in the above example, no output ever appears from B.yow() even though it’s part of the inheritance hierarchy. If you’re doing anything with multiple inheritance, you should be using super() instead of making direct calls to methods in superclasses.

7.20 Type-Based Dispatch

Sometimes you need to write code that dispatches based on a specific type. For example:

if isinstance(obj, Duck):
   handle_duck(obj)
elif isinstance(obj, Trombonist):
   handle_trombonist(obj)
elif isinstance(obj, Cyclist):
   handle_cyclist(obj)
else:
   raise RuntimeError('Unknown object')

Such a large if-elif-else block is inelegant and fragile. A common solution is to dispatch through a dictionary:

handlers = {
    Duck: handle_duck,
    Trombonist: handle_trombonist,
    Cyclist: handle_cyclist
}

# Dispatch
def dispatch(obj):
    func = handlers.get(type(obj))
    if func:
         return func(obj)
    else:
         raise RuntimeError(f'No handler for {obj}')

This solution assumes an exact type match. If inheritance is also to be supported in such dispatch, you would need to walk the MRO:

def dispatch(obj):
    for ty in type(obj).__mro__:
        func = handlers.get(ty)
        if func:
            return func(obj)
    raise RuntimeError(f'No handler for {obj}')
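
Here is a short sketch of the MRO-walking dispatcher in action. The Duck classes and handler are hypothetical placeholders, but they show why walking the MRO matters: a handler registered for a base class also matches instances of its subclasses.

```python
class Duck:
    pass

class RubberDuck(Duck):          # hypothetical subclass for illustration
    pass

handlers = {
    Duck: lambda obj: 'handled duck'
}

def dispatch(obj):
    # Walk the MRO so a handler registered for a parent class
    # also applies to instances of its children
    for ty in type(obj).__mro__:
        func = handlers.get(ty)
        if func:
            return func(obj)
    raise RuntimeError(f'No handler for {obj}')

print(dispatch(RubberDuck()))    # -> handled duck (matched via Duck)
```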

Sometimes dispatching is implemented through a class-based interface using getattr() like this:

class Dispatcher:
    def handle(self, obj):
        for ty in type(obj).__mro__:
            meth = getattr(self, f'handle_{ty.__name__}', None)
            if meth:
                return meth(obj)
        raise RuntimeError(f'No handler for {obj}')

    def handle_Duck(self, obj):
        ...

    def handle_Trombonist(self, obj):
        ...

    def handle_Cyclist(self, obj):
        ...

# Example
dispatcher = Dispatcher()
dispatcher.handle(Duck())     # -> handle_Duck()
dispatcher.handle(Cyclist())  # -> handle_Cyclist()

This last example of using getattr() to dispatch onto methods of a class is a fairly common programming pattern.

7.21 Class Decorators

Sometimes you want to perform extra processing steps after a class has been defined—such as adding the class to a registry or generating extra support code. One approach is to use a class decorator. A class decorator is a function that takes a class as input and returns a class as output. For example, here’s how you can maintain a registry:

_registry = { }
def register_decoder(cls):
    for mt in cls.mimetypes:
        _registry[mt] = cls
    return cls

# Factory function that uses the registry
def create_decoder(mimetype):
    return _registry[mimetype]()

In this example, the register_decoder() function looks inside a class for a mimetypes attribute. If found, it’s used to add the class to a dictionary mapping MIME types to class objects. To use this function, you apply it as a decorator right before the class definition:

@register_decoder
class TextDecoder:
    mimetypes = [ 'text/plain' ]
    def decode(self, data):
        ...

@register_decoder
class HTMLDecoder:
    mimetypes = [ 'text/html' ]
    def decode(self, data):
        ...

@register_decoder
class ImageDecoder:
    mimetypes = [ 'image/png', 'image/jpg', 'image/gif' ]
    def decode(self, data):
        ...

# Example usage
decoder = create_decoder('image/jpg')

A class decorator is free to modify the contents of the class it’s given. For example, it might even rewrite existing methods. This is a common alternative to mixin classes or multiple inheritance. For example, consider these decorators:

def loud(cls):
    orig_noise = cls.noise
    def noise(self):
        return orig_noise(self).upper()
    cls.noise = noise
    return cls

def annoying(cls):
    orig_noise = cls.noise
    def noise(self):
        return 3 * orig_noise(self)
    cls.noise = noise
    return cls

@annoying
@loud
class Cyclist(object):
    def noise(self):
        return 'On your left!'

    def pedal(self):
        return 'Pedaling'

This example produces the same result as the mixin example from the previous section. However, there is no multiple inheritance and no use of super(). Within each decorator, the lookup of cls.noise performs the same action as super(). But, since this only happens once when the decorator is applied (at definition time), the resulting calls to noise() will run a bit faster.
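
As a quick check of that equivalence, here is the same example as a self-contained snippet (the decorator definitions are repeated from above so that it runs on its own):

```python
def loud(cls):
    orig_noise = cls.noise
    def noise(self):
        return orig_noise(self).upper()
    cls.noise = noise
    return cls

def annoying(cls):
    orig_noise = cls.noise
    def noise(self):
        return 3 * orig_noise(self)
    cls.noise = noise
    return cls

@annoying
@loud
class Cyclist:
    def noise(self):
        return 'On your left!'

print(Cyclist().noise())   # -> 'ON YOUR LEFT!ON YOUR LEFT!ON YOUR LEFT!'
```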

Class decorators can also be used to create entirely new code. For example, a common task when writing a class is to write a useful __repr__() method for improved debugging:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f'{type(self).__name__}({self.x!r}, {self.y!r})'

Writing such methods is often annoying. Perhaps a class decorator could create the method for you?

import inspect
def with_repr(cls):
    args = list(inspect.signature(cls).parameters)
    argvals = ', '.join('{self.%s!r}' % arg for arg in args)
    code = 'def __repr__(self):\n'
    code += f' return f"{cls.__name__}({argvals})"\n'
    locs = { }
    exec(code, locs)
    cls.__repr__ = locs['__repr__']
    return cls

# Example
@with_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

In this example, a __repr__() method is generated from the calling signature of the __init__() method. The method is created as a text string and passed to exec() to create a function. That function is attached to the class.
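
Putting the pieces together, here is a self-contained version of the decorator (the indentation of the generated body is cosmetic) along with a quick demonstration that it behaves like the hand-written __repr__():

```python
import inspect

def with_repr(cls):
    # Generate a __repr__() from the __init__() calling signature
    args = list(inspect.signature(cls).parameters)
    argvals = ', '.join('{self.%s!r}' % arg for arg in args)
    code = 'def __repr__(self):\n'
    code += f'    return f"{cls.__name__}({argvals})"\n'
    locs = { }
    exec(code, locs)
    cls.__repr__ = locs['__repr__']
    return cls

@with_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

print(repr(Point(2, 3)))   # -> Point(2, 3)
```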

Similar code generation techniques are used in parts of the standard library. For example, a convenient way to define data structures is to use a dataclass:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

A dataclass automatically creates methods such as __init__() and __repr__() from class type hints. The methods are created using exec(), similarly to the prior example. Here’s how the resulting Point class works:

>>> p = Point(2, 3)
>>> p
Point(x=2, y=3)
>>>

One downside of such an approach is poor startup performance. Dynamically creating code with exec() bypasses the compilation optimizations that Python normally applies to modules. Defining a large number of classes in this way may therefore significantly slow down the importing of your code.

The examples shown in this section illustrate common uses of class decorators— registration, code rewriting, code generation, validation, and so on. One issue with class decorators is that they must be explicitly applied to each class where they are used. This is not always desired. The next section describes a feature that allows for implicit manipulation of classes.

7.22 Supervised Inheritance

As you saw in the previous section, sometimes you want to define a class and perform additional actions. A class decorator is one mechanism for doing this. However, a parent class can also perform extra actions on behalf of its subclasses. This is accomplished by implementing an __init_subclass__(cls) class method. For example:

class Base:
    @classmethod
    def __init_subclass__(cls):
        print('Initializing', cls)

# Example (should see 'Initializing' message for each class)
class A(Base):
    pass

class B(A):
    pass

If an __init_subclass__() method is present, it is triggered automatically upon the definition of any child class. This happens even if the child is buried deeply in an inheritance hierarchy.

Many of the tasks commonly performed with class decorators can be performed with __init_subclass__() instead. For example, class registration:

class DecoderBase:
    _registry = { }
    @classmethod
    def __init_subclass__(cls):
        for mt in cls.mimetypes:
            DecoderBase._registry[mt] = cls

# Factory function that uses the registry
def create_decoder(mimetype):
    return DecoderBase._registry[mimetype]()

class TextDecoder(DecoderBase):
    mimetypes = [ 'text/plain' ]
    def decode(self, data):
        ...

class HTMLDecoder(DecoderBase):
    mimetypes = [ 'text/html' ]
    def decode(self, data):
        ...

class ImageDecoder(DecoderBase):
    mimetypes = [ 'image/png', 'image/jpg', 'image/gif' ]
    def decode(self, data):
        ...

# Example usage
decoder = create_decoder('image/jpg')

Here is an example of a class that automatically creates a __repr__() method from the signature of the class __init__() method:

import inspect
class Base:
    @classmethod
    def __init_subclass__(cls):
        # Create a __repr__ method
        args = list(inspect.signature(cls).parameters)
        argvals = ', '.join('{self.%s!r}' % arg for arg in args)
        code = 'def __repr__(self):\n'
        code += f' return f"{cls.__name__}({argvals})"\n'
        locs = { }
        exec(code, locs)
        cls.__repr__ = locs['__repr__']

class Point(Base):
    def __init__(self, x, y):
        self.x = x
        self.y = y

If multiple inheritance is being used, you should use super() to make sure all classes that implement __init_subclass__() get called. For example:

class A:
    @classmethod
    def __init_subclass__(cls):
        print('A.init_subclass')
        super().__init_subclass__()

class B:
    @classmethod
    def __init_subclass__(cls):
        print('B.init_subclass')
        super().__init_subclass__()

# Should see output from both classes here
class C(A, B):
    pass

Supervising inheritance with __init_subclass__() is one of Python’s most powerful customization features. Much of its power comes from its implicit nature. A top-level base class can use this to quietly supervise an entire hierarchy of child classes. Such supervision can register classes, rewrite methods, perform validation, and more.
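
As an example of the validation use case, a base class can reject ill-formed subclasses at the moment they are defined. This sketch uses hypothetical class names; it requires every subclass to define a run() method:

```python
class Handler:
    @classmethod
    def __init_subclass__(cls):
        super().__init_subclass__()
        # Validation: every subclass must define a run() method
        if not callable(getattr(cls, 'run', None)):
            raise TypeError(f'{cls.__name__} must define run()')

class GoodHandler(Handler):
    def run(self):
        return 'ok'

try:
    class BadHandler(Handler):    # Missing run(): rejected at definition time
        pass
except TypeError as e:
    print(e)                      # -> BadHandler must define run()
```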

7.23 The Object Life Cycle and Memory Management

When a class is defined, the resulting class is a factory for creating new instances. For example:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

# Create some Account instances
a = Account('Guido', 1000.0)
b = Account('Eva', 25.0)

The creation of an instance is carried out in two steps using the special method __new__() that creates a new instance and __init__() that initializes it. For example, the operation a = Account('Guido', 1000.0) performs these steps:

a = Account.__new__(Account, 'Guido', 1000.0)
if isinstance(a, Account):
    Account.__init__(a, 'Guido', 1000.0)

Except for the first argument which is the class instead of an instance, __new__() normally receives the same arguments as __init__(). However, the default implementation of __new__() just ignores them. You’ll sometimes see __new__() invoked with just a single argument. For example, this code also works:

a = Account.__new__(Account)
Account.__init__(a, 'Guido', 1000.0)

Direct use of the __new__() method is uncommon, but sometimes it’s used to create instances while bypassing the invocation of the __init__() method. One such use is in class methods. For example:

import time

class Date:
    def __init__(self, year, month, day):
        self.year = year
        self.month = month
        self.day = day

    @classmethod
    def today(cls):
        t = time.localtime()
        self = cls.__new__(cls)   # Make instance
        self.year = t.tm_year
        self.month = t.tm_mon
        self.day = t.tm_mday
        return self

Modules that perform object serialization such as pickle also utilize __new__() to recreate instances when objects are deserialized. This is done without ever invoking __init__().
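
A small demonstration of that technique: calling __new__() alone produces a completely uninitialized instance, whose state can then be restored by writing to __dict__ directly. This is roughly what a deserializer does; the restored values here are illustrative:

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

# Create an instance without ever calling __init__()
a = Account.__new__(Account)
print(hasattr(a, 'owner'))    # -> False. __init__() never ran

# Restore saved state directly, as a deserializer would
a.__dict__.update({'owner': 'Guido', 'balance': 1000.0})
print(a.owner)                # -> Guido
```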

Sometimes a class will define __new__() if it wants to alter some aspect of instance creation. Typical applications include instance caching, singletons, and immutability. As an example, you might want Date class to perform date interning—that is, caching and reusing Date instances that have an identical year, month, and day. Here is one way that might be implemented:

class Date:
    _cache = { }

    @staticmethod
    def __new__(cls, year, month, day):
        self = Date._cache.get((year,month,day))
        if not self:
            self = super().__new__(cls)
            self.year = year
            self.month = month
            self.day = day
            Date._cache[year,month,day] = self
        return self

    def __init__(self, year, month, day):
        pass

# Example
d = Date(2012, 12, 21)
e = Date(2012, 12, 21)
assert d is e              # Same object

In this example, the class keeps an internal dictionary of previously created Date instances. When creating a new Date, the cache is consulted first. If a match is found, that instance is returned. Otherwise, a new instance is created and initialized.

A subtle detail of this solution is the empty __init__() method. Even though instances are cached, every call to Date() still invokes __init__(). To avoid duplicated effort, the method simply does nothing; the real initialization takes place in __new__() the first time an instance is created.

There are ways to avoid the extra call to __init__() but it requires sneaky tricks. One way to avoid it is to have __new__() return an entirely different type instance—for example, one belonging to a different class. Another solution, described later, is to use a metaclass.

Once created, instances are managed by reference counting. If the reference count reaches zero, the instance is immediately destroyed. When the instance is about to be destroyed, the interpreter first looks for a __del__() method associated with the object and calls it. For example:

class Account(object):
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __del__(self):
        print('Deleting Account')

>>> a = Account('Guido', 1000.0)
>>> del a
Deleting Account
>>>

Occasionally, a program will use the del statement to delete a reference to an object as shown. If this causes the reference count of the object to reach zero, the __del__() method is called. However, in general, the del statement doesn’t directly call __del__() because there may be other object references living elsewhere. There are many other ways that an object might be deleted—for example, reassignment of a variable name or a variable going out of scope in a function:

>>> a = Account('Guido', 1000.0)
>>> a = 42
Deleting Account
>>> def func():
...     a = Account('Guido', 1000.0)
...
>>> func()
Deleting Account
>>>

In practice, it’s rarely necessary for a class to define a __del__() method. The only exception is when the destruction of an object requires an extra cleanup action—such as closing a file, shutting down a network connection, or releasing other system resources. Even in these cases, it’s dangerous to rely on __del__() for a proper shutdown because there’s no guarantee that this method will be called when you think it would. For clean shutdown of resources, you should give the object an explicit close() method. You should also make your class support the context manager protocol so it can be used with the with statement. Here is an example that covers all of the cases:

class SomeClass:
    def __init__(self):
        self.resource = open_resource()

    def __del__(self):
        self.close()

    def close(self):
        self.resource.close()

    def __enter__(self):
        return self

    def __exit__(self, ty, val, tb):
        self.close()

# Closed via __del__()
s = SomeClass()
del s

# Explicit close
s = SomeClass()
s.close()

# Closed at the end of a context block
with SomeClass() as s:
    ...

Again, it should be emphasized that writing a __del__() in a class is almost never necessary. Python already has garbage collection and there is simply no need to do it unless there is some extra action that needs to take place upon object destruction. Even then, you still might not need __del__() as it’s possible that the object is already programmed to clean itself up properly even if you do nothing.

As if there weren’t enough dangers with reference counting and object destruction, there are certain kinds of programming patterns—especially those involving parent-child relationships, graphs, or caching—where objects can create a so-called reference cycle. Here is an example:

class SomeClass:
    def __del__(self):
        print('Deleting')

parent = SomeClass()
child = SomeClass()

# Create a child-parent reference cycle
parent.child = child
child.parent = parent

# Try deletion (no output from __del__ appears)
del parent
del child

In this example, the variable names are destroyed but you never see execution of the __del__() method. The two objects each hold internal references to each other, so there’s no way for the reference count to ever drop to 0. To handle this, a special cycle-detecting garbage collector runs every so often. Eventually the objects will be reclaimed, but it’s hard to predict when this might happen. If you want to force garbage collection, you can call gc.collect(). The gc module has a variety of other functions related to the cyclic garbage collector and monitoring memory.
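
The cycle can be observed directly with the gc module. The exact object count reported by gc.collect() varies by implementation, so this sketch only checks that the two cyclic objects were found:

```python
import gc

class Node:
    pass

a = Node()
b = Node()
a.other = b          # Create a cycle: each object refers to the other
b.other = a

del a, b             # Reference counts never reach zero

# Force the cycle detector to run right now
found = gc.collect()
print(found >= 2)    # At least the two Node objects were unreachable
```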

Because of the unpredictable timing of garbage collection, the __del__() method has a few restrictions placed on it. First, any exception that propagates out of __del__() is printed to sys.stderr, but otherwise ignored. Second, the __del__() method should avoid operations such as acquiring locks or other resources. Doing so could result in a deadlock when __del__() is unexpectedly fired in the middle of executing an unrelated function within the seventh inner callback circle of signal handling and threads. If you must define __del__(), keep it simple.

7.24 Weak References

Sometimes objects are kept alive when you’d much rather see them die. In an earlier example, a Date class was shown with internal caching of instances. One problem with this implementation is that there is no way for an instance to ever be removed from the cache. As such, the cache will grow larger and larger over time.

One way to fix this problem is to create a weak reference using the weakref module. A weak reference is a way of creating a reference to an object without increasing its reference count. To work with a weak reference, you have to add an extra bit of code to check if the object being referred to still exists. Here’s an example of how you create a weakref:

>>> a = Account('Guido', 1000.0)
>>> import weakref
>>> a_ref = weakref.ref(a)
>>> a_ref
<weakref at 0x104617188; to 'Account' at 0x1046105c0>
>>>

Unlike a normal reference, a weak reference allows the original object to die. For example:

>>> del a
>>> a_ref
<weakref at 0x104617188; dead>
>>>

A weak reference contains an optional reference to an object. To get the actual object, you need to call the weak reference as a function with no arguments. This will either return the object being pointed at or None. For example:

acct = a_ref()
if acct is not None:
    acct.withdraw(10)

# Alternative
if acct := a_ref():
   acct.withdraw(10)

Weak references are commonly used in conjunction with caching and other advanced memory management. Here is a modified version of the Date class that automatically removes objects from the cache when no more references exist:

import weakref

class Date:
    _cache = { }

    @staticmethod
    def __new__(cls, year, month, day):
        selfref = Date._cache.get((year,month,day))
        if not selfref:
            self = super().__new__(cls)
            self.year = year
            self.month = month
            self.day = day
            Date._cache[year,month,day] = weakref.ref(self)
        else:
            self = selfref()
        return self

    def __init__(self, year, month, day):
        pass

    def __del__(self):
        del Date._cache[self.year,self.month,self.day]

This might require a bit of study, but here is an interactive session that shows how it works. Notice how an entry is removed from the cache once no more references to it exist:

>>> Date._cache
{}
>>> a = Date(2012, 12, 21)
>>> Date._cache
{(2012, 12, 21): <weakref at 0x10c7ee2c8; to 'Date' at 0x10c805518>}
>>> b = Date(2012, 12, 21)
>>> a is b
True
>>> del a
>>> Date._cache
{(2012, 12, 21): <weakref at 0x10c7ee2c8; to 'Date' at 0x10c805518>}
>>> del b
>>> Date._cache
{}
>>>

As previously noted, the __del__() method of a class is only invoked when the reference count of an object reaches zero. In this example, the first del a statement decreases the reference count. However, since b still refers to the same object, the object remains in Date._cache. When del b removes the last reference, __del__() is invoked and the cache entry is removed.

Support for weak references requires instances to have a mutable __weakref__ attribute. Instances of user-defined classes normally have such an attribute by default. However, built-in types and certain kinds of special data structures—named tuples, classes with slots—do not. If you want to construct weak references to these types, you can do it by defining variants with a __weakref__ attribute added:

class wdict(dict):
    __slots__ = ('__weakref__',)

w = wdict()
w_ref = weakref.ref(w)      # Now works

The use of slots here is to avoid unnecessary memory overhead, as explained shortly.

7.25 Internal Object Representation and Attribute Binding

The state associated with an instance is stored in a dictionary that’s accessible as the instance’s __dict__ attribute. This dictionary contains the data that’s unique to each instance. Here’s an example:

>>> a = Account('Guido', 1100.0)
>>> a.__dict__
{'owner': 'Guido', 'balance': 1100.0}

New attributes can be added to an instance at any time:

a.number = 123456 # Add attribute 'number' to a.__dict__
a.__dict__['number'] = 654321

Modifications to an instance are always reflected in the local __dict__ attribute unless the attribute is being managed by a property. Likewise, if you make modifications to __dict__ directly, those modifications are reflected in the attributes.

Instances are linked back to their class by a special attribute __class__. The class itself is also just a thin layer over a dictionary that can be found in its own __dict__ attribute. The class dictionary is where you find the methods. For example:

>>> a.__class__
<class '__main__.Account'>
>>> Account.__dict__.keys()
dict_keys(['__module__', '__init__', '__repr__', 'deposit', 'withdraw',
'inquiry', '__dict__', '__weakref__', '__doc__'])
>>> Account.__dict__['withdraw']
<function Account.withdraw at 0x108204158>
>>>

Classes are linked to their base classes by a special attribute __bases__, which is a tuple of the base classes. The __bases__ attribute is only informational. The actual runtime implementation of inheritance uses the __mro__ attribute which is a tuple of all parent classes listed in search order. This underlying structure is the basis for all operations that get, set, or delete the attributes of instances.

Whenever an attribute is set using obj.name = value, the special method obj.__setattr__('name', value) is invoked. If an attribute is deleted using del obj.name, the special method obj.__delattr__('name') is invoked. The default behavior of these methods is to modify or remove values from the local __dict__ of obj unless the requested attribute happens to correspond to a property or descriptor. In that case, the set and delete operations will be carried out by the set and delete functions associated with the property.

For attribute lookup such as obj.name, the special method obj.__getattribute__('name') is invoked. This method carries out the search for the attribute, which normally includes checking the properties, looking in the local __dict__, checking the class dictionary, and searching the MRO. If this search fails, a final attempt to find the attribute is made by invoking the obj.__getattr__('name') method of the class (if defined). If this fails, an AttributeError exception is raised.
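
The lookup order can be observed with a small class (the attribute names here are arbitrary). __getattr__() fires only after the normal search through the instance __dict__ and the class dictionaries fails:

```python
class Config:
    default_timeout = 30       # Class attribute, found via the class dict

    def __init__(self):
        self.retries = 3       # Instance attribute, found in __dict__

    def __getattr__(self, name):
        # Called only when the normal attribute search fails
        return f'<missing:{name}>'

c = Config()
print(c.retries)           # -> 3, from the instance __dict__
print(c.default_timeout)   # -> 30, from the class dictionary
print(c.anything_else)     # -> <missing:anything_else>, via __getattr__()
```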

User-defined classes can implement their own versions of the attribute access functions, if desired. For example, here’s a class that restricts the attribute names that can be set:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __setattr__(self, name, value):
        if name not in {'owner', 'balance'}:
            raise AttributeError(f'No attribute {name}')
        super().__setattr__(name, value)

# Example
a = Account('Guido', 1000.0)
a.balance = 940.25          # Ok
a.amount = 540.2            # AttributeError. No attribute amount

A class that reimplements these methods should rely upon the default implementation provided by super() to carry out the actual work of manipulating an attribute. This is because the default implementation takes care of the more advanced features of classes such as descriptors and properties. If you don’t use super(), you will have to take care of these details yourself.

7.26 Proxies, Wrappers, and Delegation

Sometimes classes implement a wrapper layer around another object to create a kind of proxy object. A proxy is an object that exposes the same interface as another object but, for some reason, isn't related to the original object via inheritance. This is different from composition, where an entirely new object is created from other objects but with its own unique set of methods and attributes.

There are many real-world scenarios where this might arise. For example, in distributed computing, the actual implementation of an object might live on a remote server in the cloud. Clients that interact with that server might use a proxy that looks like the object on the server but, behind the scenes, delegates all of its method calls via network messages.

A common implementation technique for proxies involves the __getattr__() method. Here is a simple example:

class A:
    def spam(self):
        print('A.spam')

    def grok(self):
        print('A.grok')

    def yow(self):
        print('A.yow')

class LoggedA:
    def __init__(self):
        self._a = A()

    def __getattr__(self, name):
        print("Accessing", name)
        # Delegate to internal A instance
        return getattr(self._a, name)

# Example use
a = LoggedA()
a.spam()       # prints "Accessing spam" and "A.spam"
a.yow()        # prints "Accessing yow" and "A.yow"

Delegation is sometimes used as an alternative to inheritance. Here is an example:

class A:
    def spam(self):
        print('A.spam')

    def grok(self):
        print('A.grok')

    def yow(self):
        print('A.yow')

class B:
    def __init__(self):
        self._a = A()

    def grok(self):
        print('B.grok')

    def __getattr__(self, name):
        return getattr(self._a, name)

# Example use
b = B()
b.spam()      # -> A.spam
b.grok()      # -> B.grok   (redefined method)
b.yow()       # -> A.yow

In this example, it appears as if class B might be inheriting from class A and redefining a single method. That is the observed behavior, but inheritance is not being used. Instead, B holds an internal reference to an instance of A. Certain methods of A can be redefined. However, all of the other methods are delegated via the __getattr__() method.

The technique of forwarding attribute lookup via __getattr__() is a common technique. However, be aware that it does not apply to operations mapped to special methods. For example, consider this class:

class ListLike:
    def __init__(self):
        self._items = list()

    def __getattr__(self, name):
        return getattr(self._items, name)

# Example
a = ListLike()
a.append(1)         # Works
a.insert(0, 2)      # Works
a.sort()            # Works

len(a)              # Fails. No __len__() method
a[0]                # Fails. No __getitem__() method

Here, the class successfully forwards all of the standard list methods (list.sort(), list.append(), and so on) to an inner list. However, none of Python’s standard operators work. To make those work, you would have to explicitly implement the required special methods. For example:

class ListLike:
    def __init__(self):
        self._items = list()

    def __getattr__(self, name):
        return getattr(self._items, name)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

7.27 Reducing Memory Use with __slots__

As we’ve seen, an instance stores its data in a dictionary. If you are creating a large number of instances, this can introduce a lot of memory overhead. If you know that the attribute names are fixed, you can specify the names in a special class variable called __slots__. Here’s an example:

class Account(object):
    __slots__ = ('owner', 'balance')
    ...

__slots__ is a definition hint that allows Python to make performance optimizations for both memory use and execution speed. Instances of a class with __slots__ no longer use a dictionary for storing instance data. Instead, a much more compact data structure based on an array is used. In programs that create a large number of objects, using __slots__ can result in a substantial reduction in memory use and a modest improvement in execution time.

The only entries in __slots__ are instance attributes. You do not list methods, properties, class variables, or any other class-level attributes. Basically, it’s the same names that would ordinarily appear as dictionary keys in the instance’s __dict__.

Be aware that __slots__ has a tricky interaction with inheritance. If a class inherits from a base class that uses __slots__, it also needs to define __slots__ for storing its own attributes (even if it doesn’t add any) to take advantage of the benefits __slots__ provides. If you forget this, the derived class will run slower—and use even more memory than if __slots__ had not been used on any of the classes!
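
For example, a subclass that adds its own attribute must supply its own __slots__ listing only the new names. With both classes doing so, no instance __dict__ is ever created (the SavingsAccount class here is a hypothetical extension of the earlier example):

```python
class Account:
    __slots__ = ('owner', 'balance')
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

class SavingsAccount(Account):
    __slots__ = ('rate',)     # Must be defined, listing only the new names
    def __init__(self, owner, balance, rate):
        super().__init__(owner, balance)
        self.rate = rate

s = SavingsAccount('Guido', 1000.0, 0.02)
# Neither class allows a __dict__, so the memory savings are preserved
print(hasattr(s, '__dict__'))   # -> False
```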

__slots__ is incompatible with multiple inheritance. If multiple base classes are specified, each with nonempty slots, you will get a TypeError.

The use of __slots__ can also break code that expects instances to have an underlying __dict__ attribute. Although this often does not apply to user code, utility libraries and other tools for supporting objects may be programmed to look at __dict__ for debugging, serializing objects, and other operations.

The presence of __slots__ has no effect on the invocation of methods such as __getattribute__(), __getattr__(), and __setattr__() should they be redefined in a class. However, if you’re implementing such methods, be aware that there is no longer any instance __dict__ attribute. Your implementation will need to take that into account.
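For instance, a __setattr__() implementation on a slotted class cannot write to self.__dict__; it has to delegate to super() instead. A minimal sketch (the Audited class is illustrative):

```python
class Audited:
    __slots__ = ('value',)

    def __setattr__(self, name, val):
        print(f'Setting {name} = {val!r}')
        # No self.__dict__ exists on a slotted instance; writing to it
        # would fail.  Delegate to the default machinery instead.
        super().__setattr__(name, val)

a = Audited()
a.value = 42
print(a.value)    # 42
```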

7.28 Descriptors

Normally, attribute access corresponds to dictionary operations. If more control is needed, attribute access can be routed through user-defined get, set, and delete functions. The use of properties was already described. However, a property is actually implemented using a lower-level construct known as a descriptor. A descriptor is a class-level object that manages access to an attribute. By implementing one or more of the special methods __get__(), __set__(), and __delete__(), you can hook directly into the attribute access mechanism and customize those operations. Here is an example:

class Typed:
    expected_type = object

    def __set_name__(self, cls, name):
        self.key = name

    def __get__(self, instance, cls):
        if instance:
            return instance.__dict__[self.key]
        else:
            return self

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(f'Expected {self.expected_type}')
        instance.__dict__[self.key] = value

    def __delete__(self, instance):
        raise AttributeError("Can't delete attribute")

class Integer(Typed):
    expected_type = int

class Float(Typed):
    expected_type = float

class String(Typed):
    expected_type = str

# Example use:
class Account:
    owner = String()
    balance = Float()

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

In this example, the class Typed defines a descriptor where type checking is performed when an attribute is assigned and an error is produced if an attempt is made to delete the attribute. The Integer, Float, and String subclasses specialize Typed to match a specific type. Using these classes in another class (such as Account) causes the corresponding attributes to automatically invoke the appropriate __get__(), __set__(), or __delete__() method on access. For example:

a = Account('Guido', 1000.0)
b = a.owner           # Calls Account.owner.__get__(a, Account)
a.owner = 'Eva'       # Calls Account.owner.__set__(a, 'Eva')
del a.owner           # Calls Account.owner.__delete__(a)

Descriptors can only be instantiated at the class level. It is not legal to create descriptors on a per-instance basis by creating descriptor objects inside __init__() and other methods. The __set_name__() method of a descriptor is invoked after a class has been defined, but before any instances have been created, to inform a descriptor about the name that has been used within the class. For instance, the balance = Float() definition calls Float.__set_name__(Account, 'balance') to inform the descriptor of the class and name being used.

Descriptors with a __set__() method always take precedence over items in the instance dictionary. For example, if a descriptor happens to have the same name as a key in the instance dictionary, the descriptor takes priority. In the above Account example, you’ll see the descriptor applying type checking even though the instance dictionary has a matching entry:

>>> a = Account('Guido', 1000.0)
>>> a.__dict__
{'owner': 'Guido', 'balance': 1000.0}
>>> a.balance = 'a lot'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "descrip.py", line 63, in __set__
    raise TypeError(f'Expected {self.expected_type}')
TypeError: Expected <class 'float'>
>>>

The __get__(instance, cls) method of a descriptor takes arguments for both the instance and the class. It’s possible that __get__() is invoked at the class level, in which case the instance argument is None. In most cases, __get__() returns the descriptor itself if no instance is provided. For example:

>>> Account.balance
<__main__.Float object at 0x110606710>
>>>

A descriptor that only implements __get__() is known as a method descriptor. It has a weaker binding than a descriptor with both get/set capabilities. Specifically, the __get__() method of a method descriptor only gets invoked if there is no matching entry in the instance dictionary. The reason it’s called a method descriptor is that this kind of descriptor is mainly used to implement Python’s various types of methods—including instance methods, class methods, and static methods.
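You can observe this weaker binding with a small experiment (the NonData and Spam classes here are illustrative). Once a matching key is placed in the instance dictionary, the descriptor is no longer consulted:

```python
class NonData:
    # Only __get__() is defined: a "method descriptor" with weak binding
    def __get__(self, instance, cls):
        return 'from descriptor'

class Spam:
    attr = NonData()

s = Spam()
print(s.attr)                        # 'from descriptor'
s.__dict__['attr'] = 'from instance'
print(s.attr)                        # 'from instance' (instance dict wins)
```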

For example, here’s a skeleton implementation that shows how @classmethod and @staticmethod could be implemented from scratch (the real implementation is more efficient):

import types
class classmethod:
    def __init__(self, func):
        self.__func__ = func

    # Return a bound method with cls as first argument
    def __get__(self, instance, cls):
        return types.MethodType(self.__func__, cls)

class staticmethod:
    def __init__(self, func):
        self.__func__ = func

    # Return the bare function
    def __get__(self, instance, cls):
        return self.__func__
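Here is how these skeletons could be applied. They are renamed in this sketch to avoid shadowing the built-in decorators:

```python
import types

class myclassmethod:
    def __init__(self, func):
        self.__func__ = func
    def __get__(self, instance, cls):
        # Bind the function to the class, not the instance
        return types.MethodType(self.__func__, cls)

class mystaticmethod:
    def __init__(self, func):
        self.__func__ = func
    def __get__(self, instance, cls):
        return self.__func__

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    @myclassmethod
    def from_nothing(cls):
        return cls('Nobody', 0.0)

    @mystaticmethod
    def note():
        return 'A simple account'

a = Account.from_nothing()
print(a.owner, a.balance)    # Nobody 0.0
print(Account.note())        # A simple account
```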

Since method descriptors only act if there is no matching entry in the instance dictionary, they can also be used to implement various forms of lazy evaluation of attributes. For example:

class Lazy:
    def __init__(self, func):
        self.func = func

    def __set_name__(self, cls, name):
        self.key = name

    def __get__(self, instance, cls):
        if instance:
            value = self.func(instance)
            instance.__dict__[self.key] = value
            return value
        else:
            return self

class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    area = Lazy(lambda self: self.width * self.height)
    perimeter = Lazy(lambda self: 2*self.width + 2*self.height)

In this example, area and perimeter are attributes that are computed on demand and stored in the instance dictionary. Once computed, values are just returned directly from the instance dictionary.

>>> r = Rectangle(3, 4)
>>> r.__dict__
{'width': 3, 'height': 4}
>>> r.area
12
>>> r.perimeter
14
>>> r.__dict__
{'width': 3, 'height': 4, 'area': 12, 'perimeter': 14}
>>>

7.29 Class Definition Process

The definition of a class is a dynamic process. When you define a class using the class statement, a new dictionary is created that serves as the local class namespace. The body of the class then executes as a script within this namespace. Eventually, the namespace becomes the __dict__ attribute of the resulting class object.

Any legal Python statement is allowed in the body of a class. Normally, you just define functions and variables, but control flow, imports, nested classes, and everything else is allowed. For example, here is a class that conditionally defines methods:

debug = True

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    if debug:
        import logging
        log = logging.getLogger(f'{__module__}.{__qualname__}')
        def deposit(self, amount):
            Account.log.debug('Depositing %f', amount)
            self.balance += amount

        def withdraw(self, amount):
            Account.log.debug('Withdrawing %f', amount)
            self.balance -= amount
    else:
        def deposit(self, amount):
            self.balance += amount

        def withdraw(self, amount):
            self.balance -= amount

In this example, a global variable debug is being used to conditionally define methods. The __qualname__ and __module__ variables are predefined strings that hold information about the class name and enclosing module. These can be used by statements in the class body. In this example, they’re being used to configure the logging system. There are probably cleaner ways of organizing the above code, but the key point is that you can put anything you want in a class.

One critical point about class definition is that the namespace used to hold the contents of the class body is not a scope of variables. Any name that gets used within a method (such as Account.log in the above example) needs to be fully qualified.

If a function like locals() is used in a class body (but not inside a method), it returns the dictionary being used for the class namespace.
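For example, this snapshot trick records which names have been defined so far in the class body (a sketch; the Example class is illustrative):

```python
class Example:
    x = 1
    y = 2
    # locals() here is the class namespace under construction
    defined = sorted(n for n in locals() if not n.startswith('__'))

print(Example.defined)    # ['x', 'y']
```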

7.30 Dynamic Class Creation

Normally, classes are created using the class statement, but this is not a requirement. As noted in the previous section, classes are defined by executing the body of a class to populate a namespace. If you’re able to populate a dictionary with your own definitions, you can make a class without ever using the class statement. To do that, use types.new_class():

import types

# Some methods (not in a class)
def __init__(self, owner, balance):
    self.owner = owner
    self.balance = balance

def deposit(self, amount):
    self.balance += amount

def withdraw(self, amount):
    self.balance -= amount

methods = {
   '__init__': __init__,
   'deposit': deposit,
   'withdraw': withdraw,
}

Account = types.new_class('Account', (),
               exec_body=lambda ns: ns.update(methods))

# You now have a class
a = Account('Guido', 1000.0)
a.deposit(50)
a.withdraw(25)

The new_class() function requires a class name, a tuple of base classes, and a callback function responsible for populating the class namespace. This callback receives the class namespace dictionary as an argument. It should update this dictionary in place. The return value of the callback is ignored.

Dynamic class creation may be useful if you want to create classes from data structures. For example, in the section on descriptors, the following classes were defined:

class Integer(Typed):
    expected_type = int

class Float(Typed):
    expected_type = float

class String(Typed):
    expected_type = str

This code is highly repetitive. Perhaps a data-driven approach would be better:

typed_classes = [
   ('Integer', int),
   ('Float', float),
   ('String', str),
   ('Bool', bool),
   ('Tuple', tuple),
]

globals().update(
   (name, types.new_class(name, (Typed,),
          exec_body=lambda ns: ns.update(expected_type=ty)))
   for name, ty in typed_classes)

In this example, the global module namespace is being updated with dynamically created classes using types.new_class(). If you want to make more classes, put an appropriate entry in the typed_classes list.

Sometimes you will see type() being used to dynamically create a class instead. For example:

Account = type('Account', (), methods)

This works, but it doesn’t take into account some of the more advanced class machinery such as metaclasses (to be discussed shortly). In modern code, try to use types.new_class() instead.

7.31 Metaclasses

When you define a class in Python, the class definition itself becomes an object. Here’s an example:

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

isinstance(Account, object)       # -> True

If you think about this long enough, you will realize that if Account is an object, then something had to create it. This creation of the class object is controlled by a special kind of class called a metaclass. Simply put, a metaclass is a class that creates instances of classes.

In the preceding example, the metaclass that created Account is a built-in class called type. In fact, if you check the type of Account, you will see that it is an instance of type:

>>> Account.__class__
<class 'type'>
>>>

It’s a bit brain-bending, but it’s similar to integers. For example, if you write x = 42 and then look at x.__class__, you’ll get int, which is the class that creates integers. Similarly, type makes instances of types or classes.

When a new class is defined with the class statement, a number of things happen. First, a new namespace for the class is created. Next, the body of the class is executed in this namespace. Finally, the class name, base classes, and populated namespace are used to create the class instance. The following code illustrates the low-level steps that take place:

# Step 1: Create the class namespace
namespace = type.__prepare__('Account', ())

# Step 2: Execute the class body
exec('''
def __init__(self, owner, balance):
    self.owner = owner
    self.balance = balance

def deposit(self, amount):
    self.balance += amount

def withdraw(self, amount):
    self.balance -= amount
''', globals(), namespace)

# Step 3: Create the final class object
Account = type('Account', (), namespace)

In the definition process, there is interaction with the type class to create the class namespace and to create the final class object. The choice of using type can be customized—a class can choose to be processed by a different type class by specifying a different metaclass. This is done by using the metaclass keyword argument in inheritance:

class Account(metaclass=type):
    ...

If no metaclass is given, the class statement examines the types of the base classes (if any) and uses the most derived of their metaclasses. Therefore, if you write class Account(object), the resulting Account class will have the same type as object (which is type). Note that classes that don’t specify any parent at all always inherit from object, so this still applies.
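A quick check of this inference rule (Meta here is an illustrative do-nothing metaclass):

```python
class Meta(type):
    pass

class Base(metaclass=Meta):
    pass

# No metaclass is given, so it is inferred from the type of Base
class Child(Base):
    pass

print(type(Child) is Meta)     # True
print(type(object) is type)    # True
```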

To create a new metaclass, define a class that inherits from type. Within this class, you can redefine one or more methods that are used during the class creation process. Typically, this includes the __prepare__() method used to create the class namespace, the __new__() method used to create the class instance, the __init__() method called after a class has already been created, and the __call__() method used to create new instances. The following example implements a metaclass that merely prints the input arguments to each method so you can experiment:

class mytype(type):

    # Creates the class namespace
    @classmethod
    def __prepare__(meta, clsname, bases):
        print("Preparing:", clsname, bases)
        return super().__prepare__(clsname, bases)

    # Creates the class instance after body has executed
    @staticmethod
    def __new__(meta, clsname, bases, namespace):
        print("Creating:", clsname, bases, namespace)
        return super().__new__(meta, clsname, bases, namespace)

    # Initializes the class instance
    def __init__(cls, clsname, bases, namespace):
        print("Initializing:", clsname, bases, namespace)
        super().__init__(clsname, bases, namespace)

    # Creates new instances of the class
    def __call__(cls, *args, **kwargs):
        print("Creating instance:", args, kwargs)
        return super().__call__(*args, **kwargs)

# Example
class Base(metaclass=mytype):
    pass

# Definition of the Base produces the following output
# Preparing: Base ()
# Creating: Base () {'__module__': '__main__', '__qualname__': 'Base'}
# Initializing: Base () {'__module__': '__main__', '__qualname__': 'Base'}

b = Base()
# Creating instance: () {}

One tricky facet of working with metaclasses is the naming of variables and keeping track of the various entities involved. In the above code, the meta name refers to the metaclass itself. The cls name refers to a class instance created by the metaclass. Although not used here, the self name refers to a normal instance created by a class.

Metaclasses propagate via inheritance. So, if you’ve defined a base class to use a different metaclass, all child classes will also use that metaclass. Try this example to see your custom metaclass at work:

class Account(Base):
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

print(type(Account))   # -> <class 'mytype'>

The primary use of metaclasses is in situations where you want to exert extreme low-level control over the class definition environment and creation process. Before proceeding, however, remember that Python already provides a lot of functionality for monitoring and altering class definitions (such as the __init_subclass__() method, class decorators, descriptors, mixins, and so on). Most of the time, you probably don’t need a metaclass. That said, the next few examples show situations where a metaclass provides the only sensible solution.

One use of a metaclass is in rewriting the contents of the class namespace prior to the creation of the class object. Certain features of classes are established at definition time and can’t be modified later. One such feature is __slots__. As noted earlier, __slots__ is a performance optimization related to the memory layout of instances. Here’s a metaclass that automatically sets the __slots__ attribute based on the calling signature of the __init__() method.

import inspect

class SlotMeta(type):
    @staticmethod
    def __new__(meta, clsname, bases, methods):
        if '__init__' in methods:
            sig = inspect.signature(methods['__init__'])
            __slots__ = tuple(sig.parameters)[1:]
        else:
            __slots__ = ()
        methods['__slots__'] = __slots__
        return super().__new__(meta, clsname, bases, methods)

class Base(metaclass=SlotMeta):
    pass

# Example
class Point(Base):
    def __init__(self, x, y):
        self.x = x
        self.y = y

In this example, the Point class is automatically created with __slots__ of ('x', 'y'). The resulting instances of Point now get memory savings without knowing that slots are being used. __slots__ doesn’t have to be specified directly. This kind of trick is not possible with class decorators or with __init_subclass__() because those features only operate on a class after it’s been created. By then, it’s too late to apply the __slots__ optimization.
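You can verify the effect directly by re-running the example above and inspecting the resulting class (the definitions are repeated so this snippet is self-contained):

```python
import inspect

class SlotMeta(type):
    @staticmethod
    def __new__(meta, clsname, bases, methods):
        if '__init__' in methods:
            sig = inspect.signature(methods['__init__'])
            __slots__ = tuple(sig.parameters)[1:]
        else:
            __slots__ = ()
        methods['__slots__'] = __slots__
        return super().__new__(meta, clsname, bases, methods)

class Base(metaclass=SlotMeta):
    pass

class Point(Base):
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(2, 3)
print(Point.__slots__)           # ('x', 'y')
print(hasattr(p, '__dict__'))    # False
```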

Another use of metaclasses is for altering the class definition environment. For example, duplicate definitions of a name during class definition normally result in a silent error—the second definition overwrites the first. Suppose you wanted to catch that. Here’s a metaclass that does that by defining a different kind of dictionary for the class namespace:

class NoDupeDict(dict):
    def __setitem__(self, key, value):
        if key in self:
            raise AttributeError(f'{key} already defined')
        super().__setitem__(key, value)

class NoDupeMeta(type):
    @classmethod
    def __prepare__(meta, clsname, bases):
        return NoDupeDict()

class Base(metaclass=NoDupeMeta):
    pass

# Example
class SomeClass(Base):
    def yow(self):
        print('Yow!')

    def yow(self, x):             # Fails. Already defined
        print('Different Yow!')

This is only a small sample of what’s possible. For framework builders, metaclasses offer an opportunity to tightly control what happens during class definition—allowing classes to serve as a kind of domain-specific language.

Historically, metaclasses have been used to accomplish a variety of tasks that are now possible through other means. The __init_subclass__() method, in particular, can be used to address a wide variety of use cases where metaclasses were once applied. This includes registration of classes with a central registry, automatic decoration of methods, and code generation.
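For example, here is a sketch of the class-registration pattern using __init_subclass__() instead of a metaclass (the Handler classes and registry name are illustrative):

```python
class Handler:
    _registry = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Record every subclass in a central registry
        Handler._registry[cls.__name__.lower()] = cls

class JSONHandler(Handler):
    pass

class CSVHandler(Handler):
    pass

print(sorted(Handler._registry))    # ['csvhandler', 'jsonhandler']
```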

7.32 Built-in Objects for Instances and Classes

This section gives some details about the low-level objects used to represent types and instances. This information may be useful in low-level metaprogramming and code that needs to directly manipulate types.

Table 7.1 shows commonly used attributes of a type object cls.

Table 7.1 Attributes of Types

Attribute

Description

cls.__name__

Class name

cls.__module__

Module name in which the class is defined

cls.__qualname__

Fully qualified class name

cls.__bases__

Tuple of base classes

cls.__mro__

Method Resolution Order tuple

cls.__dict__

Dictionary holding class methods and variables

cls.__doc__

Documentation string

cls.__annotations__

Dictionary of class type hints

cls.__abstractmethods__

Set of abstract method names (may be undefined if there aren’t any).

The cls.__name__ attribute contains a short class name. The cls.__qualname__ attribute contains a fully qualified name with additional information about the surrounding context (this may be useful if a class is defined inside a function or if you create a nested class definition). The cls.__annotations__ dictionary contains class-level type hints (if any).
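For example, a few of these attributes in action on a small class:

```python
class Account:
    'A simple account'
    owner: str
    balance: float

print(Account.__name__)           # 'Account'
print(Account.__bases__)          # (<class 'object'>,)
print(Account.__doc__)            # 'A simple account'
print(Account.__annotations__)    # {'owner': <class 'str'>, 'balance': <class 'float'>}
```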

Table 7.2 shows special attributes of an instance i.

Table 7.2 Instance Attributes

Attribute

Description

i.__class__

Class to which the instance belongs

i.__dict__

Dictionary holding instance data (if defined)

The __dict__ attribute is normally where all of the data associated with an instance is stored. However, if a user-defined class uses __slots__, a more efficient internal representation is used and instances will not have a __dict__ attribute.

7.33 Final Words: Keep It Simple

This chapter has presented a lot of information about classes and the ways to customize and control them. However, when writing classes, keeping it simple is often the best strategy. Yes, you could use abstract base classes, metaclasses, descriptors, class decorators, properties, multiple inheritance, mixins, patterns, and type hints. However, you could also just write a plain class. There’s a pretty good chance that this class is good enough and that everyone will understand what it’s doing.

In the big picture, it’s useful to step back and consider a few generally desirable code qualities. First and foremost, readability counts for a lot—and it often suffers if you pile on too many layers of abstraction. Second, you should try to make code that is easy to observe and debug, and don’t forget about using the REPL. Finally, making code testable is often a good driver for good design. If your code can’t be tested or testing is too awkward, there may be a better way to organize your solution.