Comprehensions and Loops: How They Differ

When do we use them?

List comprehensions are a tool. Like all tools, you need to be able to identify opportunities to use them.

You can use list comprehensions whenever you see a “for loop” that loops over an iterable, transforming each item and adding it to a list.

Take this function:

>>> def square_all(numbers):
...     squared_numbers = []
...     for n in numbers:
...         squared_numbers.append(n * n)
...     return squared_numbers
...
>>> square_all([1, 2, 3, 4])
[1, 4, 9, 16]
>>> square_all((0, 1, 2))
[0, 1, 4]

You can see there is a “for loop” making one list (or any other iterable) into a new list.

We can rewrite this as a list comprehension:

>>> def square_all(numbers):
...     return [n * n for n in numbers]
...
>>> square_all([1, 2, 3, 4])
[1, 4, 9, 16]
>>> square_all((0, 1, 2))
[0, 1, 4]

We can also use list comprehensions whenever we see a “for loop” that excludes some values, using an “if statement” to filter out values that don’t meet a condition.

Take this function:

>>> def only_truthy(things):
...     truthy_things = []
...     for x in things:
...         if x:
...             truthy_things.append(x)
...     return truthy_things
...
>>> only_truthy([1, 0, 2, False, "hello", True, ""])
[1, 2, 'hello', True]

That “if statement” becomes the condition at the end of a list comprehension.

We can rewrite this as a list comprehension like this:

>>> def only_truthy(things):
...     return [x for x in things if x]
...
>>> only_truthy([1, 0, 2, False, "hello", True, ""])
[1, 2, 'hello', True]

That looks a little strange with all those “x” variables in there. We have “x for x” because we’re not actually modifying each item in this list. We’re just filtering some of them out.

We can also modify the values at the same time:

>>> nums = [4, -1, 7, 9, 34, 0, -4, 3]
>>> new_nums = [x * 3 for x in nums if x > 0]
>>> new_nums
[12, 21, 27, 102, 9]
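For comparison, here is the same computation written as an explicit loop next to the comprehension. This is only a sketch; the name `new_nums_loop` is introduced here for illustration and is not part of the original example.

```python
nums = [4, -1, 7, 9, 34, 0, -4, 3]

# Loop form: filter with "if", then transform and append
new_nums_loop = []
for x in nums:
    if x > 0:
        new_nums_loop.append(x * 3)

# Comprehension form: the transformation comes first,
# then the loop, then the filter condition
new_nums = [x * 3 for x in nums if x > 0]

print(new_nums_loop == new_nums)  # True
```

The pieces are the same in both versions; the comprehension just moves the expression being appended to the front.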

Nested Comprehensions

Let’s say we have some code that takes a list of lists of numbers and makes a new list of lists with all the numbers negated:

negative_matrix = []
for row in matrix:
    new_row = []
    for n in row:
        new_row.append(-n)
    negative_matrix.append(new_row)

You might notice that we have an empty list that we’re appending to repeatedly in our inner loop. Let’s copy-paste that into a comprehension:

negative_matrix = []
for row in matrix:
    negative_matrix.append([-n for n in row])

We have another empty list that we’re appending to in a loop. We could make that into a comprehension too:

negative_matrix = [
    [-n for n in row]
    for row in matrix
]
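The snippets above assume a `matrix` variable already exists. As a quick check, here is a sketch that runs the nested comprehension on a small, made-up matrix (the sample values are an assumption for illustration):

```python
# Hypothetical sample data; the snippets above never define "matrix"
matrix = [[1, -2, 3], [0, 4, -5]]

# The outer comprehension builds one new row per original row;
# the inner comprehension negates each number in that row
negative_matrix = [
    [-n for n in row]
    for row in matrix
]

print(negative_matrix)  # [[-1, 2, -3], [0, -4, 5]]
```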

Take Another Look

Let’s look at an example of breaking a complicated loop into several simpler loops, then turning those loops into readable list comprehensions.

Here we’re looping over a dictionary file. The file we’re working with is simply one word per line:

aardvark
aardvarks
abaci
aback
abacus
abacuses
abaft
abalone
abalones
abandon
abandoned
abandoning
abandonment

As we read the file, we store the reverse of every word longer than five letters and check whether each word matches one of the reverses stored so far, meaning it spells another word backwards.

reversed_words = set()
reversible_words = []

with open('dictionary.txt') as dictionary_file:
    for line in dictionary_file:
        word = line.rstrip()
        if len(word) > 5:
            if word in reversed_words:
                reversible_words.append(word)
            reversed_words.add(word[::-1])

for word in reversible_words:
    print(word)

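Running this script works, but it only shows us half the words. Each word is checked only against the words read before it, so only the second word of each pair is printed: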

$ python reversible.py
lamina
reffed
reined
relive
repaid
retool
reviled
reward
seined
sorter
spacer
spools
spoons
steels
stressed
strops
warder
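To see why only half the words appear, here is a small sketch of the same single-pass logic. It assumes the loop adds each word's reverse to a set, and it uses a made-up in-memory word list in place of dictionary.txt. The length check is left out because every sample word is already longer than five letters, and the names `seen_reversed`, `single_pass`, `all_reversed`, and `two_pass` are introduced only for this illustration.

```python
# Hypothetical stand-in for dictionary.txt, in alphabetical order
words = ["abacus", "drawer", "reward", "snoops", "spoons"]

# Single pass: each word is compared only with reverses seen so far
seen_reversed = set()
single_pass = []
for word in words:
    if word in seen_reversed:
        single_pass.append(word)
    seen_reversed.add(word[::-1])

print(single_pass)  # ['reward', 'spoons']

# Two passes: build the complete set of reverses first, then check
all_reversed = set()
for word in words:
    all_reversed.add(word[::-1])

two_pass = []
for word in words:
    if word in all_reversed:
        two_pass.append(word)

print(two_pass)  # ['drawer', 'reward', 'snoops', 'spoons']
```

In the single pass, "drawer" is checked before the reverse of "reward" has been stored, so it is never reported.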

We can turn this into multiple for loops and give each one a comment to describe what’s happening.

# Store all long words
words_over_five_letters = []
with open('dictionary.txt') as dictionary_file:
    for line in dictionary_file:
        word = line.rstrip()
        if len(word) > 5:
            words_over_five_letters.append(word)

# Store the reverse of all long words
reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])

# Find all "reversible" words (words whose reverse is also a word)
reversible_words = []
for word in words_over_five_letters:
    if word in reversed_words:
        reversible_words.append(word)

# Print all words which are "reversible"
for word in reversible_words:
    print(word)

We’re using more memory here, but this won’t be a problem for the reasonably-sized dictionary file we’re working with.

Writing our code this way breaks down the problem more obviously for readers of our code.

But there’s more we can do. The sections of code that build lists can be rewritten using comprehensions. Comprehensions tend to make code look less like looping and more like data processing.

We can split the first section into two loops and then copy-paste our way into list comprehensions:

with open('dictionary.txt') as dictionary_file:
    words = [
        line.rstrip()
        for line in dictionary_file
    ]
    words_over_five_letters = [
        word
        for word in words
        if len(word) > 5
    ]

We’ll leave this set loop as is for now, but we’ll see some tools later that can make this more readable and efficient.

# Store the reverse of all long words
reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])
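One candidate for such a tool is a set comprehension, which uses curly braces in place of square brackets. The following is only a sketch with a made-up word list standing in for `words_over_five_letters`, and it may not be the tool the later material has in mind. The name `reversed_words_loop` is introduced here for comparison.

```python
# Hypothetical sample standing in for the words read from dictionary.txt
words_over_five_letters = ["drawer", "reward", "abacus"]

# Loop form, as in the block above
reversed_words_loop = set()
for word in words_over_five_letters:
    reversed_words_loop.add(word[::-1])

# Set comprehension form: same shape as a list comprehension,
# but curly braces produce a set
reversed_words = {word[::-1] for word in words_over_five_letters}

print(reversed_words == reversed_words_loop)  # True
```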

The third block can be made into a list comprehension:

reversible_words = [
    word
    for word in words_over_five_letters
    if word in reversed_words
]

The last block should stay as a for loop because we’re not building a new list; we’re just printing.

for word in reversible_words:
    print(word)

Notice that our code is fairly self-documenting when written this way. We’re simply doing assignments to variables over and over. Here it is in full:

with open('dictionary.txt') as dictionary_file:
    words = [
        line.rstrip()
        for line in dictionary_file
    ]
    words_over_five_letters = [
        word
        for word in words
        if len(word) > 5
    ]

# Store the reverse of all long words
reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])

reversible_words = [
    word
    for word in words_over_five_letters
    if word in reversed_words
]

for word in reversible_words:
    print(word)

These variable names are fairly descriptive so we don’t really need those comments in our code anymore.
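To try the finished pipeline without a dictionary file on disk, here is a sketch that feeds it a few made-up words through `io.StringIO`, which can be iterated line by line like an open file. The sample words are an assumption for illustration, and the comprehensions are written on one line each only to keep the sketch short.

```python
import io

# Hypothetical stand-in for dictionary.txt: one word per line
dictionary_file = io.StringIO("abacus\ndrawer\nlamina\nreward\nspool\nanimal\n")

words = [line.rstrip() for line in dictionary_file]
words_over_five_letters = [word for word in words if len(word) > 5]

reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])

reversible_words = [
    word
    for word in words_over_five_letters
    if word in reversed_words
]

print(reversible_words)  # ['drawer', 'lamina', 'reward', 'animal']
```

Here "spool" is dropped by the length filter and "abacus" is dropped because its reverse is not in the list, while both halves of each reversible pair are kept.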