In this chapter, we go beyond the basics of using functions. I’ll assume you can write functions with default argument values:
>>>deffoo(a,b,x=3,y=2):...return(a+b)/(x+y)...>>>foo(5,0)1.0>>>foo(10,2,y=3)2.0>>>foo(b=4,x=8,a=1)0.5
Notice the way foo() is called the last time, with arguments out
of order, and everything specified by key-value pairs. Not everyone
knows that you can call most Python functions this way. So long as the
value of each argument is unambiguously specified, Python doesn’t care
how you call the function (and this case, we specify b, x, and a
out of order, letting y be its default value). We will leverage this
flexibility later.
This chapter’s topics are useful and valuable on their own. And they are important building blocks for some extremely powerful patterns, which you will learn in later chapters. Let’s get started!
The foo() function above can be called with two, three, or four arguments.
Sometimes you want to define a function that can take any number of
arguments—zero or more, in other words. In Python, it looks like
this:
# Note the asterisk. That's the magic partdeftakes_any_args(*args):("Type of args: "+str(type(args)))("Value of args: "+str(args))
Look carefully at the syntax here. takes_any_args() is just like a
regular function, except you put an asterisk right before the argument
args. Within the function, args is a tuple:
>>>takes_any_args("x","y","z")Type of args: <class 'tuple'>Value of args: ('x', 'y', 'z')>>>takes_any_args(1)Type of args: <class 'tuple'>Value of args: (1,)>>>takes_any_args()Type of args: <class 'tuple'>Value of args: ()>>>takes_any_args(5,4,3,2,1)Type of args: <class 'tuple'>Value of args: (5, 4, 3, 2, 1)>>>takes_any_args(["first","list"],["another","list"])Type of args: <class 'tuple'>Value of args: (['first', 'list'], ['another', 'list'])
args is a tuple, composed of those arguments passed, in order. Which
means if you call the function with no arguments, args will be an
empty tuple.
This is different from declaring a function that takes a single argument, which happens to be of type list or tuple:
>>>deftakes_a_list(items):...("Type of items: "+str(type(items)))...("Value of items: "+str(items))...>>>takes_a_list(["x","y","z"])Type of items: <class 'list'>Value of items: ['x', 'y', 'z']>>>takes_any_args(["x","y","z"])Type of args: <class 'tuple'>Value of args: (['x', 'y', 'z'],)
In these calls to takes_a_list() and takes_any_args(), the argument
items is a list of strings. We’re calling both functions the exact
same way, but what happens inside each function is different. Within
takes_any_args(), the tuple named args has one element—and that
element is the list ["x", "y", "z"]. But in takes_a_list(), items
is the list itself.
This *args idiom gives you some very helpful programming
patterns. You can work with arguments as an abstract sequence, while
providing a potentially more natural interface for whomever calls the
function.
Above, I always name the argument args in the function signature.
Writing *args is a well-followed convention, but you can choose a
different name—the asterisk is what makes it a variable
argument. For instance, this takes paths of several files as
arguments:
defread_files(*paths):data=""forpathinpaths:withopen(path)ashandle:data+=handle.read()returndata
Most Python programmers use *args unless there is a reason to name
that variable something else.1 That reason is usually readability;
read_files() is a good example. If naming it something other than
args makes the code more understandable, do it.
The star modifier works in the other direction too. Intriguingly, you can use it with any function. For example, suppose a library provides this function:
deforder_book(title,author,isbn):"""Place an order for a book."""(f"Ordering '{title}' by{author}({isbn})")# ...
Notice there’s no asterisk. Suppose in another, completely different library, you fetch the book info from this function:
defget_required_textbook(class_id):"""Returns a tuple (title, author, ISBN)"""# ...
Again, no asterisk. Now, one way you can bridge these two functions is
to store the tuple result from get_required_textbook(), then
unpack it element by element:
>>>book_info=get_required_textbook(4242)>>>order_book(book_info[0],book_info[1],book_info[2])Ordering 'Writing Great Code' by Randall Hyde (1593270038)
Writing code this way is tedious and error-prone. Fortunately, Python provides a better way. Let’s look at a different function:
defnormal_function(a,b,c):(f"a:{a}b:{b}c:{c}")
No trick here—it really is a normal, boring function, taking three arguments. If we have those three arguments as a list or tuple, Python can automatically “unpack” them for us. We just need to pass in that collection, prefixed with an asterisk:
>>>numbers=(7,5,3)>>>normal_function(*numbers)a: 7 b: 5 c: 3
Again, normal_function is just a regular function. We did not use
an asterisk on the def line. But when we call it, we take a tuple
called numbers, and pass it in with the asterisk in front. This is
then unpacked within the function to the arguments a, b, and
c.
There is a duality here. We can use the asterisk syntax both in defining a function, and in calling a function. The syntax looks very similar. But realize they are doing two different things. One is packing arguments into a tuple automatically—called variable arguments; the other is un-packing them—called argument unpacking. Be clear on the distinction between the two in your mind.
Armed with this complete understanding, we can bridge the two book functions in a much better way:
>>>book_info=get_required_textbook(4242)>>>order_book(*book_info)Ordering 'Writing Great Code' by Randall Hyde (1593270038)
This is more concise (less tedious to type), and more maintainable. As you get used to the concept, you’ll find it increasingly natural and easy to use in the code you write.
So far we have just looked at functions with positional
arguments—the kind where you declare a function like def foo(a,
b):, and then invoke it like foo(7, 2). You know that a equal 7 and b equals 2
within the function, because of the order of the arguments. Of course,
Python also has keyword arguments:
>>>defget_rental_cars(size,doors=4,...transmission='automatic'):...template="Looking for a{}-door{}car with{}transmission...."...(template.format(doors,size,transmission))...>>>get_rental_cars("economy",transmission='manual')Looking for a 4-door economy car with manual transmission....
And remember, Python lets you call functions like this just using keyword arguments:
>>>defbar(x,y,z):...returnx+y*z...>>>bar(z=2,y=3,x=4)10
These keyword arguments won’t be captured by the *args
idiom. Instead, Python provides a different syntax—using two
asterisks instead of one:
defprint_kwargs(**kwargs):forkey,valueinkwargs.items():(f"{key}->{value}")
The variable kwargs is a dictionary. (In contrast to args;
remember, that was a tuple.) It’s just a regular dict, so we can
iterate through its key-value pairs with .items():
>>>print_kwargs(hero="Homer",antihero="Bart",genius="Lisa")hero -> Homerantihero -> Bartgenius -> Lisa
The arguments to print_kwargs() are key-value pairs. This is regular
Python syntax for calling functions; what’s interesting is happening
inside the function. There, a variable called kwargs is
defined. It’s a Python dictionary, consisting of the key-value pairs
passed in when the function was called.
Here’s another example, which has a regular positional argument, followed by arbitrary key-value pairs:
defset_config_defaults(config,**kwargs):forkey,valueinkwargs.items():# Do not overwrite existing values.ifkeynotinconfig:config[key]=value
This is perfectly valid. You can define a function that takes some normal arguments, followed by zero or more key-value pairs:
>>>config={"verbosity":3,"theme":"Blue Steel"}>>>set_config_defaults(config,bass=11,verbosity=2)>>>config{'verbosity': 3, 'theme': 'Blue Steel', 'bass': 11}
Like with *args, naming this variable kwargs is just a strong
convention; you can choose a different name if that improves
readability.
Just like *args, double-star works the other way too. We can
take a regular function, and pass it a dictionary using two asterisks:
>>>defnormal_function(a,b,c):...(f"a:{a}b:{b}c:{c}")...>>>numbers={"a":7,"b":5,"c":3}>>>normal_function(**numbers)a: 7 b: 5 c: 3
The keys of the dictionary must match up with how the function was declared. Otherwise you get an error:
>>>bad_numbers={"a":7,"b":5,"z":3}>>>normal_function(**bad_numbers)Traceback (most recent call last):File"<stdin>", line1, in<module>TypeError:normal_function() got an unexpected keyword argument 'z'
This is called keyword argument unpacking. It works regardless of whether that function has default values for some of its arguments or not. So long as the value of each argument is specified one way or another, you have valid code:
>>>defanother_function(x,y,z=2):...(f"x:{x}y:{y}z:{z}")...>>>all_numbers={"x":2,"y":7,"z":10}>>>some_numbers={"x":2,"y":7}>>>missing_numbers={"x":2}>>>another_function(**all_numbers)x: 2 y: 7 z: 10>>>another_function(**some_numbers)x: 2 y: 7 z: 2>>>another_function(**missing_numbers)Traceback (most recent call last):File"<stdin>", line1, in<module>TypeError:another_function() missing 1 required positional argument: 'y'
You can combine the syntax to use both
positional and keyword arguments. In a function signature, just
separate *args and **kwargs by a comma:
defgeneral_function(*args,**kwargs):forarginargs:(arg)forkey,valueinkwargs.items():(f"{key}->{value}")
>>>general_function("foo","bar",x=7,y=33)foobarx -> 7y -> 33
This usage—declaring a function like def general_function(*args,
**kwargs)—is the most general way to define a function in Python. A
function so declared can be called in any way, with any valid
combination of keyword and nonkeyword arguments—including no
arguments.
Similarly, you can call a function using both—and both will be unpacked:
>>>defaddup(a,b,c=1,d=2,e=3):...returna+b+c+d+e...>>>nums=(3,4)>>>extras={"d":5,"e":2}>>>addup(*nums,**extras)15
There’s one last point to understand, on argument ordering. When you
def the function, you specify the arguments in this order:
Positional arguments (nonkeyword) arguments
The *args nonkeyword variable arguments
The **kwargs keyword variable arguments
You can omit any of these when defining a function. But any that are present must be in this order.
# All these are valid function definitions.defcombined1(a,b,*args):passdefcombined2(x,y,z,**kwargs):passdefcombined3(*args,**kwargs):passdefcombined4(x,*args):passdefcombined5(u,v,w,*args,**kwargs):pass
Violating this order will cause errors:
>>>defbad_combo(**kwargs,*args):passFile"<stdin>", line1defbad_combo(**kwargs,*args):pass^SyntaxError:invalid syntax
In Python, functions are ordinary objects—just like integers, lists, or instances of a class you create. The implications are profound, letting you do certain very useful things with functions. Leveraging this is one of those secrets separating average Python developers from great ones, because of the extremely powerful abstractions which follow.
Once you get this, it can change the way you write software forever. In fact, these advanced patterns for using functions in Python largely transfer to other languages you will use in the future.
To explain, let’s start by laying out a problem and solution. Imagine you have a list of strings representing numbers:
nums=["12","7","30","14","3"]
Suppose we want to find the biggest integer in this list. The max()
built-in does not help:
>>>max(nums)'7'
This isn’t a bug, of course; since the objects in nums are strings,
max() compares each element lexicographically.2
By that criteria, “7” is greater than “30”, for the same reason “g”
comes after “ca” alphabetically. Essentially, max() is evaluating each
element by a different criterion than what we want.
Since max()’s algorithm is simple, let’s roll our own that compares
based on the integer value of the string:
defmax_by_int_value(items):# For simplicity, assume len(items) > 0biggest=items[0]foriteminitems[1:]:ifint(item)>int(biggest):biggest=itemreturnbiggest
>>>max_by_int_value(nums)'30'
This gives us what we want: it returns the element in the original list which is maximal, as evaluated by our criteria.
Now imagine working with different data where you have different criteria, for example, a list of actual integers:
integers=[3,-2,7,-1,-20]
Suppose we want to find the number with the greatest absolute value—i.e., distance from zero. That would be −20 here, but standard max()
won’t do that:
>>>max(integers)7
Again, let’s roll our own, using the built-in abs function:
defmax_by_abs(items):biggest=items[0]foriteminitems[1:]:ifabs(item)>abs(biggest):biggest=itemreturnbiggest
>>>max_by_abs(integers)-20
One more example—a list of dictionary objects:
student_joe={'gpa':3.7,'major':'physics','name':'Joe Smith'}student_jane={'gpa':3.8,'major':'chemistry','name':'Jane Jones'}student_zoe={'gpa':3.4,'major':'literature','name':'Zoe Fox'}students=[student_joe,student_jane,student_zoe]
Now, what if we want the record of the student with the highest GPA? Here’s a suitable max function:
defmax_by_gpa(items):biggest=items[0]foriteminitems[1:]:ifitem["gpa"]>biggest["gpa"]:biggest=itemreturnbiggest
>>>max_by_gpa(students){'name': 'Jane Jones', 'gpa': 3.8, 'major': 'chemistry'}
Just one line of code is different between max_by_int_value(),
max_by_abs(), and max_by_gpa(): the comparison
line. max_by_int_value() compares int(item) to int(biggest);
max_by_abs() compares abs(item) to abs(biggest); and max_by_gpa()
compares item["gpa"] to biggest["gpa"]. Other than that, these
max() functions are identical.
I don’t know about you, but having nearly identical functions like
this drives me nuts. The way out is to realize the comparison is
based on a value derived from the element—not the value of the
element itself. In other words: in each cycle through the for loop, the
two elements are not themselves compared. What is compared is some
derived, calculated value: int(item), or abs(item), or
item["gpa"].
It turns out we can abstract out that calculation, using what we’ll
call a key function. A key function is a function that takes exactly
one argument—an element in the list. It returns the derived value
used in the comparison. In fact, int works like a function, even
though it’s technically a type, because int("42") returns
42.3 So types and other callables work, as long
as we can invoke it like a one-argument function.
This lets us define a very generic max function:
defmax_by_key(items,key):biggest=items[0]foriteminitems[1:]:ifkey(item)>key(biggest):biggest=itemreturnbiggest
>>># Old way:...max_by_int_value(nums)'30'>>># New way:...max_by_key(nums,int)'30'>>># Old way:...max_by_abs(integers)-20>>># New way:...max_by_key(integers,abs)-20
Pay attention: you are passing the function object itself—int and
abs. You are not invoking the key function in any direct way. In
other words, you write int, not int(). This function object is
then called as needed by max_by_key(), to calculate the derived value:
# key is actually int, abs, etc.ifkey(item)>key(biggest):
To sort the students by GPA, we need a function extracting the
“gpa” key from each student dictionary. There is no built-in function
that does this, but we can define our own and pass it in:
>>># Old way:...max_by_gpa(students){'gpa': 3.8, 'name': 'Jane Jones', 'major': 'chemistry'}>>># New way:...defget_gpa(who):...returnwho["gpa"]...>>>max_by_key(students,get_gpa){'gpa': 3.8, 'name': 'Jane Jones', 'major': 'chemistry'}
Again, notice get_gpa is a function object, and we are passing that
function itself to max_by_key. We never invoke get_gpa directly;
max_by_key does that automatically.
You may be realizing now just how powerful this can be. In Python, functions are simply objects—just as much as an integer, or a string, or an instance of a class is an object. You can store functions in variables, pass them as arguments to other functions, and even return them from other function and method calls. This all provides new ways for you to encapsulate and control the behavior of your code.
The Python standard library demonstrates some excellent ways to use such functional patterns. Let’s look at a key (ha!) example.
Earlier, you learned the built-in max() function doesn’t magically do what we want
when sorting a list of numbers-as-strings:
>>>nums=["12","7","30","14","3"]>>>max(nums)'7'
Again, this isn’t a bug. max() just compares elements according to
the data type, and "7" > "12" evaluates to True. But it turns out
max() is customizable. You can pass it a key function!
>>>max(nums,key=int)'30'
The value of key is a function taking one argument—an element in
the list—and returning a value for comparison. But max() isn’t the
only built-in that accepts a key function. min() and sorted() do as well:
>>># Default behavior......min(nums)'12'>>>sorted(nums)['12', '14', '3', '30', '7']>>>>>># And with a key function:...min(nums,key=int)'3'>>>sorted(nums,key=int)['3', '7', '12', '14', '30']
Many algorithms can be cleanly expressed using min(), max(), or
sorted(), along with an appropriate key function. Sometimes a built-in
(like int or abs) will provide what you need, but often you’ll
want to create a custom function. Since this is so commonly needed,
the operator module provides some helpers. Let’s revisit the example
of a list of student records.
>>>student_joe={'gpa':3.7,'major':'physics','name': 'Joe Smith'}>>>student_jane={'gpa':3.8,'major':'chemistry','name': 'Jane Jones'}>>>student_zoe={'gpa':3.4,'major':'literature','name': 'Zoe Fox'}>>>students=[student_joe,student_jane,student_zoe]>>>>>>defget_gpa(who):...returnwho["gpa"]...>>>sorted(students,key=get_gpa)[{'gpa': 3.4, 'major': 'literature', 'name': 'Zoe Fox'},{'gpa': 3.7, 'major': 'physics', 'name': 'Joe Smith'},{'gpa': 3.8, 'major': 'chemistry', 'name': 'Jane Jones'}]
This is effective, and a fine way to solve the problem. Alternatively,
the operator module’s itemgetter() creates and
returns a key function that looks up a named dictionary field:
>>>fromoperatorimportitemgetter>>>>>># Sort by GPA......sorted(students,key=itemgetter("gpa"))[{'gpa': 3.4, 'major': 'literature', 'name': 'Zoe Fox'},{'gpa': 3.7, 'major': 'physics', 'name': 'Joe Smith'},{'gpa': 3.8, 'major': 'chemistry', 'name': 'Jane Jones'}]>>>>>># Now sort by major:...sorted(students,key=itemgetter("major"))[{'gpa': 3.8, 'major': 'chemistry', 'name': 'Jane Jones'},{'gpa': 3.4, 'major': 'literature', 'name': 'Zoe Fox'},{'gpa': 3.7, 'major': 'physics', 'name': 'Joe Smith'}]
Notice itemgetter() is a function that creates and returns a
function—itself a good example of how to work with function
objects. In other words, the following two key functions are
completely equivalent:
# What we did above:defget_gpa(who):returnwho["gpa"]# Using itemgetter instead:fromoperatorimportitemgetterget_gpa=itemgetter("gpa")
This is how you use itemgetter() when the sequence elements are
dictionaries. It also works when the elements are tuples or lists.
Just pass a number index instead:
>>># Same data, but as a list of tuples....student_rows=[...("Joe Smith","physics",3.7),...("Jane Jones","chemistry",3.8),...("Zoe Fox","literature",3.4),...]>>>>>># GPA is the 3rd item in the tuple, i.e. index 2....# Highest GPA:...max(student_rows,key=itemgetter(2))('Jane Jones', 'chemistry', 3.8)>>>>>># Sort by major:...sorted(student_rows,key=itemgetter(1))[('Jane Jones', 'chemistry', 3.8),('Zoe Fox', 'literature', 3.4),('Joe Smith', 'physics', 3.7)]
operator also provides attrgetter() for
creating key functions based on an attribute of the element, and
methodcaller() for creating key functions
based off a method’s return value—useful when the sequence elements
are instances of your own class:
classStudent:def__init__(self,name,major,gpa):self.name=nameself.major=majorself.gpa=gpadef__repr__(self):returnf"{self.name}:{self.gpa}"
>>>student_objs=[...Student("Joe Smith","physics",3.7),...Student("Jane Jones","chemistry",3.8),...Student("Zoe Fox","literature",3.4),...]>>>fromoperatorimportattrgetter>>>sorted(student_objs,key=attrgetter("gpa"))[Zoe Fox: 3.4, Joe Smith: 3.7, Jane Jones: 3.8]
While these are sometimes directly useful, they are also valuable as examples of good key functions. They demonstrate how to design key functions for real use cases, so you can see when you benefit from creating your own.
Every coder knows the common ways of using functions. But most only scratch the surface of what you can do with them. Much power is unlocked when you start thinking of functions as just another object type. In addition to the useful patterns in this chapter, it is also the foundation of more powerful metaprogramming techniques. You will learn one of the most important ones in the next chapter.
1 This seems to be deeply ingrained; once I abbreviated it *a, only to have my code reviewer demand I change it to *args. They wouldn’t approve my pull request until I changed it, so I did.
2 Meaning: alphabetically, but generalizing beyond the letters of the alphabet.
3 Python uses the word callable to describe something that can be called like a function. This can be an actual function, a type or class name, or an object defining the __call__ magic method. Key functions are frequently actual functions, but they can be any callable.