Chapter 11. Tuples

This chapter introduces one more built-in type, the tuple, and then shows how lists, dictionaries, and tuples work together. It also presents tuple assignment and a useful feature for functions with variable-length argument lists: the packing and unpacking operators.

In the exercises, we’ll use tuples, along with lists and dictionaries, to solve more word puzzles and implement efficient algorithms.

One note: there are two ways to pronounce “tuple.” Some people say “tuh-ple,” which rhymes with “supple.” But in the context of programming, most people say “too-ple,” which rhymes with “quadruple.”

Tuples Are Like Lists

A tuple is a sequence of values. The values can be any type, and they are indexed by integers, so tuples are a lot like lists. The important difference is that tuples are immutable.

To create a tuple, you can write a comma-separated list of values:

t = 'l', 'u', 'p', 'i', 'n'
type(t)
       
tuple
       

Although it is not necessary, it is common to enclose tuples in parentheses:

t = ('l', 'u', 'p', 'i', 'n')
type(t)
       
tuple
       

To create a tuple with a single element, you have to include a final comma:

t1 = 'p',
type(t1)
       
tuple
       

A single value in parentheses is not a tuple:

t2 = ('p')
type(t2)
       
str
       

Another way to create a tuple is the built-in function tuple. With no argument, it creates an empty tuple:

t = tuple()
t
       
()
       

If the argument is a sequence (string, list, or tuple), the result is a tuple with the elements of the sequence:

t = tuple('lupin')
t
       
('l', 'u', 'p', 'i', 'n')
       

Because tuple is the name of a built-in function, you should avoid using it as a variable name.

Most list operators also work with tuples. For example, the bracket operator indexes an element:

t[0]
       
'l'
       

And the slice operator selects a range of elements:

t[1:3]
       
('u', 'p')
       

The + operator concatenates tuples:

tuple('lup') + ('i', 'n')
       
('l', 'u', 'p', 'i', 'n')
       

And the * operator duplicates a tuple a given number of times:

tuple('spam') * 2 
        
('s', 'p', 'a', 'm', 's', 'p', 'a', 'm')
        

The sorted function works with tuples—but the result is a list, not a tuple:

sorted(t)
        
['i', 'l', 'n', 'p', 'u']
        

The reversed function also works with tuples:

reversed(t)
        
<reversed at 0x7f56c0072110>
        

The result is a reversed object, which we can convert to a list or tuple:

tuple(reversed(t))
        
('n', 'i', 'p', 'u', 'l')
        

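Based on the examples so far, it might seem like tuples are the same as lists. The difference, as noted at the start of this section, is that tuples are immutable, so operations that modify a list in place, such as assigning to an element or calling append, do not work with tuples.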
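
To see what immutability means in practice, here is a minimal sketch. The names t_new and error are only for illustration and are not used elsewhere in the chapter:

```python
t = tuple('lupin')

# Assigning to an element fails because tuples are immutable.
try:
    t[0] = 'L'
except TypeError as e:
    error = str(e)
    print('TypeError:', error)

# To get a changed version, build a new tuple instead.
t_new = ('L',) + t[1:]
print(t_new)   # ('L', 'u', 'p', 'i', 'n')
```

The original tuple t is unchanged; the concatenation creates a new tuple and leaves the old one alone.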

Tuple Assignment

You can put a tuple of variables on the left side of an assignment, and a tuple of values on the right:

a, b = 1, 2
        

The values are assigned to the variables from left to right—in this example, a gets the value 1, and b gets the value 2. We can display the results like this:

a, b
        
(1, 2)
        

More generally, if the left side of an assignment is a tuple, the right side can be any kind of sequence—string, list, or tuple. For example, to split an email address into a username and a domain, you could write:

email = 'monty@python.org'
username, domain = email.split('@')
        

The return value from split is a list with two elements—the first element is assigned to username, the second to domain:

username, domain
        
('monty', 'python.org')
        

The number of variables on the left and the number of values on the right have to be the same—otherwise you get a ValueError:

a, b = 1, 2, 3
        
ValueError: too many values to unpack (expected 2)
        

Tuple assignment is useful if you want to swap the values of two variables. With conventional assignments, you have to use a temporary variable, like this:

temp = a
a = b
b = temp
        

That works, but with tuple assignment we can do the same thing without a temporary variable:

a, b = b, a
        

This works because all of the expressions on the right side are evaluated before any of the assignments.
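
The same evaluation order matters whenever a new value depends on the old value of another variable. As a small sketch (the Fibonacci-style update is my own example, not one from earlier in the chapter), each pass through this loop computes both right-side expressions from the old values of a and b before assigning either one:

```python
a, b = 0, 1
for _ in range(5):
    # b and a + b are both evaluated with the old values,
    # then assigned to a and b in one step.
    a, b = b, a + b

print(a, b)   # 5 8
```

With two separate assignments, a = b followed by b = a + b, the second statement would see the new value of a and produce a different result.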

We can also use tuple assignment in a for statement. For example, to loop through the items in a dictionary, we can use the items method:

d = {'one': 1, 'two': 2}

for item in d.items():
    key, value = item
    print(key, '->', value)
        
one -> 1
two -> 2
        

Each time through the loop, item is assigned a tuple that contains a key and the corresponding value.

We can write this loop more concisely, like this:

for key, value in d.items():
    print(key, '->', value)
        
one -> 1
two -> 2
        

Each time through the loop, a key and the corresponding value are assigned directly to key and value.

Argument Packing

Functions can take a variable number of arguments. A parameter name that begins with the * operator packs arguments into a tuple. For example, the following function takes any number of arguments and computes their arithmetic mean—that is, their sum divided by the number of arguments:

def mean(*args):
    return sum(args) / len(args)
        

The parameter can have any name you like, but args is conventional. We can call the function like this:

mean(1, 2, 3)
        
2.0
        

If you have a sequence of values and you want to pass them to a function as multiple arguments, you can use the * operator to unpack the sequence. For example, divmod takes exactly two arguments, so if you pass a tuple as an argument, you get an error:

t = (7, 3)
divmod(t)
        
TypeError: divmod expected 2 arguments, got 1
        

Even though the tuple contains two elements, it counts as a single argument. But if you unpack the tuple, it is treated as two arguments:

divmod(*t)
        
(2, 1)
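
Unpacking an argument combines naturally with tuple assignment. In this short sketch, the names quotient and remainder are my own choice:

```python
t = (7, 3)

# Unpack t into two arguments, then unpack the result into two variables.
quotient, remainder = divmod(*t)
print(quotient, remainder)   # 2 1
```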
        

Packing and unpacking can be useful if you want to adapt the behavior of an existing function. For example, this function takes any number of arguments, removes the lowest and highest, and computes the mean of the rest:

def trimmed_mean(*args):
    low, high = min_max(args)
    trimmed = list(args)
    trimmed.remove(low)
    trimmed.remove(high)
    return mean(*trimmed)
        

First, it uses min_max to find the lowest and highest elements. Then it converts args to a list so it can use the remove method. Finally, it unpacks the list so the elements are passed to mean as separate arguments, rather than as a single list.

Here’s an example that shows the effect:

mean(1, 2, 3, 10)
        
4.0
        
trimmed_mean(1, 2, 3, 10)
        
2.5
        

This kind of “trimmed” mean is used in some sports with subjective judging—like diving and gymnastics—to reduce the effect of a judge whose score deviates from the others.
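
The trimmed_mean function above calls min_max, which is not defined in this section. The following sketch assumes that min_max takes a sequence and returns a tuple containing its smallest and largest elements; that definition is an assumption made so the example can run on its own:

```python
def min_max(t):
    """Return the smallest and largest elements of t as a tuple.

    Assumed helper: this section calls min_max without defining it.
    """
    return min(t), max(t)

def mean(*args):
    return sum(args) / len(args)

def trimmed_mean(*args):
    low, high = min_max(args)   # tuple assignment unpacks the result
    trimmed = list(args)        # tuples have no remove method
    trimmed.remove(low)
    trimmed.remove(high)
    return mean(*trimmed)       # unpack the list into separate arguments

print(trimmed_mean(1, 2, 3, 10))   # 2.5
```

Written this way, trimmed_mean needs at least three arguments; with fewer, nothing is left to average after the lowest and highest are removed.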

Zip

Tuples are useful for looping through the elements of two sequences and performing operations on corresponding elements. For example, suppose two teams play a series of seven games, and we record their scores in two lists, one for each team:

scores1 = [1, 2, 4, 5, 1, 5, 2]
scores2 = [5, 5, 2, 2, 5, 2, 3]
        

Let’s see how many games each team won. We’ll use zip, which is a built-in function that takes two or more sequences and returns a zip object, so-called because it pairs up the elements of the sequences like the teeth of a zipper:

zip(scores1, scores2)
        
<zip at 0x7f3e9c74f0c0>
        

We can use the zip object to loop through the values in the sequences pairwise:

for pair in zip(scores1, scores2):
    print(pair)
        
(1, 5)
(2, 5)
(4, 2)
(5, 2)
(1, 5)
(5, 2)
(2, 3)
        

Each time through the loop, pair gets assigned a tuple of scores. So we can assign the scores to variables, and count the victories for the first team, like this:

wins = 0
for team1, team2 in zip(scores1, scores2):
    if team1 > team2:
        wins += 1
                
wins
        
3
        

Sadly, the first team won only three games and lost the series.
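
To confirm the outcome of the series, we can count wins for both teams in a single loop. This is a sketch that extends the loop above; the names wins1 and wins2 are my own:

```python
scores1 = [1, 2, 4, 5, 1, 5, 2]
scores2 = [5, 5, 2, 2, 5, 2, 3]

wins1 = 0
wins2 = 0
for team1, team2 in zip(scores1, scores2):
    if team1 > team2:
        wins1 += 1
    elif team2 > team1:
        wins2 += 1
    # a tie would count for neither team

print(wins1, wins2)   # 3 4
```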

If you have two lists and you want a list of pairs, you can use zip and list:

t = list(zip(scores1, scores2))
t
        
[(1, 5), (2, 5), (4, 2), (5, 2), (1, 5), (5, 2), (2, 3)]
        

The result is a list of tuples, so we can get the result of the last game like this:

t[-1]
        
(2, 3)
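
Going the other way, from a list of pairs back to separate sequences, can be done by combining zip with the unpacking operator from the previous section. This is a sketch with a shortened list; the names pairs, first, and second are only for illustration:

```python
pairs = [(1, 5), (2, 5), (4, 2)]

# Unpacking passes each pair to zip as a separate argument,
# so zip groups the first elements together, then the second elements.
first, second = zip(*pairs)

print(first)    # (1, 2, 4)
print(second)   # (5, 5, 2)
```

Note that the results are tuples, not lists.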
        

If you have a list of keys and a list of values, you can use zip and dict to make a dictionary. For example, here’s how we can make a dictionary that maps from each letter to its position in the alphabet:

letters = 'abcdefghijklmnopqrstuvwxyz'
numbers = range(len(letters))
letter_map = dict(zip(letters, numbers))
        

Now we can look up a letter and get its index in the alphabet:

letter_map['a'], letter_map['z']
        
(0, 25)
        

In this mapping, the index of 'a' is 0, and the index of 'z' is 25.

If you need to loop through the elements of a sequence and their indices, you can use the built-in function enumerate:

enumerate('abc')
        
<enumerate at 0x7f3e9c620cc0>
        

The result is an enumerate object that loops through a sequence of pairs, where each pair contains an index (starting from 0) and an element from the given sequence:

for index, element in enumerate('abc'):
    print(index, element)
        
0 a
1 b
2 c
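
As a further sketch, enumerate offers another way to build the letter_map dictionary from earlier in this section, without creating a separate range of numbers:

```python
letters = 'abcdefghijklmnopqrstuvwxyz'

letter_map = {}
for index, letter in enumerate(letters):
    # each pair from enumerate is unpacked into index and letter
    letter_map[letter] = index

print(letter_map['a'], letter_map['z'])   # 0 25
```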
        

Comparing and Sorting

The relational operators work with tuples and other sequences. For example, if you use the < operator with tuples, it starts by comparing the first element from each sequence. If they are equal, it goes on to the next pair of elements, and so on, until it finds a pair that differs:

(0, 1, 2) < (0, 3, 4)
        
True
        

Subsequent elements are not considered—even if they are really big:

(0, 1, 2000000) < (0, 3, 4)
        
True
        

This way of comparing tuples is useful for sorting a list of tuples, or finding the minimum or maximum. As an example, let’s find the most common letter in a word. In Chapter 10, we wrote value_counts, which takes a string and returns a dictionary that maps from each letter to the number of times it appears:

def value_counts(string):
    counter = {}
    for letter in string:
        if letter not in counter:
            counter[letter] = 1
        else:
            counter[letter] += 1
    return counter
        

Here is the result for the string 'banana':

counter = value_counts('banana')
counter
        
{'b': 1, 'a': 3, 'n': 2}
        

With only three items, we can easily see that the most frequent letter is 'a', which appears three times. But if there were more items, it would be useful to sort them automatically. We can get the items from counter like this:

items = counter.items()
items
        
dict_items([('b', 1), ('a', 3), ('n', 2)])
        

The result is a dict_items object that behaves like a list of tuples, so we can sort it, like this:

sorted(items)
        
[('a', 3), ('b', 1), ('n', 2)]
        

The default behavior is to use the first element from each tuple to sort the list, and use the second element to break ties.
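
The counter for 'banana' has no repeated letters, so it does not show the tie-breaking. Here is a small sketch with made-up pairs in which the first elements repeat:

```python
pairs = [('b', 2), ('a', 3), ('b', 1), ('a', 1)]

# Tuples with the same first element are ordered by the second element.
print(sorted(pairs))
# [('a', 1), ('a', 3), ('b', 1), ('b', 2)]
```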

However, to find the items with the highest counts, we want to use the second element to sort the list. We can do that by writing a function that takes a tuple and returns the second element:

def second_element(t):
    return t[1]
        

Then we can pass that function to sorted as an optional argument called key, which indicates that this function should be used to compute the sort key for each item:

sorted_items = sorted(items, key=second_element)
sorted_items
        
[('b', 1), ('n', 2), ('a', 3)]
        

The sort key determines the order of the items in the list. The letter with the lowest count appears first, and the letter with the highest count appears last. So we can find the most common letter like this:

sorted_items[-1]
        
('a', 3)
        

If we only want the maximum, we don’t have to sort the list. We can use max, which also takes key as an optional argument:

max(items, key=second_element)
        
('a', 3)
        

To find the letter with the lowest count, we could use min the same way.
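
Here is a sketch of min and max with a key function, using letter counts in which several letters share the lowest count. The dictionary is written out by hand so the example is self-contained:

```python
def second_element(t):
    return t[1]

counter = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
items = counter.items()

lowest = min(items, key=second_element)
highest = max(items, key=second_element)

print(lowest)    # ('p', 1)
print(highest)   # ('r', 2)
```

When several items have the same sort key, min and max return the first one they encounter, which is why the result here is 'p' rather than one of the other letters that appear once.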

Inverting a Dictionary

Suppose you want to invert a dictionary so you can look up a value and get the corresponding key. For example, if you have a word counter that maps from each word to the number of times it appears, you could make a dictionary that maps from integers to the words that appear that number of times.

But there’s a problem—the keys in a dictionary have to be unique, but the values don’t. For example, in a word counter, there could be many words with the same count.

So one way to invert a dictionary is to create a new dictionary where the values are lists of keys from the original. As an example, let’s count the letters in parrot:

d = value_counts('parrot')
d
        
{'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
        

If we invert this dictionary, the result should be {1: ['p', 'a', 'o', 't'], 2: ['r']}, which indicates that the letters that appear once are 'p', 'a', 'o', and 't', and the letter that appears twice is 'r'.

The following function takes a dictionary and returns its inverse as a new dictionary:

def invert_dict(d):
    new = {}
    for key, value in d.items():
        if value not in new:
            new[value] = [key]
        else:
            new[value].append(key)
    return new
        

The for statement loops through the keys and values in d. If the value is not already in the new dictionary, it is added as a key and associated with a list that contains a single element, the original key. Otherwise the original key is appended to the existing list.

We can test it like this:

invert_dict(d)
        
{1: ['p', 'a', 'o', 't'], 2: ['r']}
        

And we get the result we expected.

This is the first example we’ve seen where the values in the dictionary are lists. We will see more!
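
As an alternative sketch, the dictionary method setdefault can replace the if statement. It returns the value for a key if the key is present; otherwise it inserts the key with the given default and returns that default. The function name invert_dict_setdefault is my own, chosen to avoid replacing the version above:

```python
def invert_dict_setdefault(d):
    new = {}
    for key, value in d.items():
        # Get the list for this value, creating an empty one if needed,
        # then append the original key to it.
        new.setdefault(value, []).append(key)
    return new

d = {'p': 1, 'a': 1, 'r': 2, 'o': 1, 't': 1}
print(invert_dict_setdefault(d))
# {1: ['p', 'a', 'o', 't'], 2: ['r']}
```

Both versions produce the same result. The explicit if statement is easier to read when you are new to dictionaries, and the setdefault version is more compact.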

Debugging

Lists, dictionaries, and tuples are data structures. In this chapter we are starting to see compound data structures, like lists of tuples, or dictionaries that contain tuples as keys and lists as values. Compound data structures are useful, but they are prone to errors caused when a data structure has the wrong type, size, or structure. For example, if a function expects a list of integers and you give it a plain old integer (not in a list), it probably won’t work.
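
As a minimal sketch of that kind of error, here is a function that expects a list of integers. The function total is a hypothetical example written for this illustration:

```python
def total(numbers):
    """Add up the integers in a list (hypothetical helper)."""
    result = 0
    for n in numbers:
        result += n
    return result

print(total([1, 2, 3]))   # 6

# Passing a plain integer instead of a list fails in the for loop.
try:
    total(5)
except TypeError as e:
    message = str(e)
    print('TypeError:', message)
```

The error message points to the for loop, not to the call where the wrong type was passed, which is part of what makes these errors hard to track down.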

To help debug these kinds of errors, I wrote a module called structshape that provides a function, also called structshape, that takes any kind of data structure as an argument and returns a string that summarizes its structure. You can download it from https://raw.githubusercontent.com/AllenDowney/ThinkPython/v3/structshape.py.

We can import it like this:

from structshape import structshape
        

Here’s an example with a simple list:

t = [1, 2, 3]
structshape(t)
        
'list of 3 int'
        

Here’s a list of lists:

t2 = [[1,2], [3,4], [5,6]]
structshape(t2)
        
'list of 3 list of 2 int'
        

If the elements of the list are not the same type, structshape groups them by type:

t3 = [1, 2, 3, 4.0, '5', '6', [7], [8], 9]
structshape(t3)
        
'list of (3 int, float, 2 str, 2 list of int, int)'
        

Here’s a list of tuples:

s = 'abc'
lt = list(zip(t, s))
structshape(lt)
        
'list of 3 tuple of (int, str)'
        

And here’s a dictionary with three items that map integers to strings:

d = dict(lt) 
structshape(d)
        
'dict of 3 int->str'
        

If you are having trouble keeping track of your data structures, structshape can help.

Glossary

tuple: An immutable object that contains a sequence of values.

pack: Collect multiple arguments into a tuple.

unpack: Treat a tuple (or other sequence) as multiple arguments.

zip object: The result of calling the built-in function zip, which can be used to loop through a sequence of tuples.

enumerate object: The result of calling the built-in function enumerate, which can be used to loop through a sequence of tuples.

sort key: A value, or function that computes a value, used to sort the elements of a collection.

data structure: A collection of values, organized to perform certain operations efficiently.

Exercises

Ask a Virtual Assistant

The exercises in this chapter might be more difficult than exercises in previous chapters, so I encourage you to get help from a virtual assistant. When you pose more difficult questions, you might find that the answers are not correct on the first attempt, so this is a chance to practice crafting good prompts and following up with good refinements.

One strategy you might consider is to break a big problem into pieces that can be solved with simple functions. Ask the virtual assistant to write the functions and test them. Then, once they are working, ask for a solution to the original problem.

For some of the following exercises, I make suggestions about which data structures and algorithms to use. You might find these suggestions useful when you work on the problems, but they are also good prompts to pass along to a virtual assistant.

Exercise

In this chapter I said that tuples can be used as keys in dictionaries because they are hashable, and they are hashable because they are immutable. But that is not always true.

If a tuple contains a mutable value, like a list or a dictionary, the tuple is no longer hashable because it contains elements that are not hashable. As an example, here’s a tuple that contains two lists of integers:

list0 = [1, 2, 3]
list1 = [4, 5]

t = (list0, list1)
t
        
([1, 2, 3], [4, 5])
        

Write a line of code that appends the value 6 to the end of the second list in t. If you display t, the result should be ([1, 2, 3], [4, 5, 6]):

t[1].append(6)
t
        
([1, 2, 3], [4, 5, 6])
        

Try to create a dictionary that maps from t to a string, and confirm that you get a TypeError.

For more on this topic, ask a virtual assistant, “Are Python tuples always hashable?”

Exercise

In this chapter we made a dictionary that maps from each letter to its index in the alphabet:

letters = 'abcdefghijklmnopqrstuvwxyz'
numbers = range(len(letters))
letter_map = dict(zip(letters, numbers))
        

For example, the index of 'a' is 0:

letter_map['a']
        
0
        

To go in the other direction, we can index the string letters. For example, the letter at index 1 is 'b':

letters[1]
        
'b'
        

We can use letter_map and letters to encode and decode words using a Caesar cipher.

A Caesar cipher is a weak form of encryption that involves shifting each letter by a fixed number of places in the alphabet, wrapping around to the beginning if necessary. For example, 'a' shifted by 2 is 'c', and 'z' shifted by 1 is 'a'.

Write a function called shift_word that takes as parameters a string and an integer, and returns a new string that contains the letters from the string shifted by the given number of places.

To test your function, confirm that “cheer” shifted by 7 is “jolly,” and “melon” shifted by 16 is “cubed.”

Hint: use the modulus operator to wrap around from 'z' back to 'a'. Loop through the letters of the word, shift each one, and append the result to a list of letters. Then use join to concatenate the letters into a string.

Exercise

Write a function called most_frequent_letters that takes a string and prints the letters in decreasing order of frequency.

To get the items in decreasing order, you can use reversed along with sorted, or you can pass reverse=True as a keyword argument to sorted.

Exercise

In a previous exercise, we tested whether two strings are anagrams by sorting the letters in both words and checking whether the sorted letters are the same. Now let’s make the problem a little more challenging.

We’ll write a program that takes a list of words and prints all the sets of words that are anagrams. Here is an example of what the output might look like:

['deltas', 'desalt', 'lasted', 'salted', 'slated', 'staled']
['retainers', 'ternaries']
['generating', 'greatening']
['resmelts', 'smelters', 'termless']

Hint: for each word in the word list, sort the letters and join them back into a string. Make a dictionary that maps from this sorted string to a list of words that are anagrams of it.

Exercise

Write a function called word_distance that takes two words with the same length and returns the number of places where the two words differ.

Hint: use zip to loop through the corresponding letters of the words.