Generator Expressions

Sum

Python has a number of built-in functions that act on iterables. A list comprehension produces a list, which is an iterable, so we can pass a list comprehension straight into one of these functions.

Let’s try out the sum function:

>>> numbers = [1, 2, 3, 4]
>>> sum(numbers)
10

Let’s sum the squares of all of the numbers in our numbers list. We can use a list comprehension:

>>> sum([n ** 2 for n in numbers])
30

Cool!

Generator Expressions

We can use sum with tuples, sets, and any other iterable:

>>> sum((8, 9, 7))
24
>>> sum({8, 9, 7})
24

Sometimes we don’t really care if a list comprehension returns a list, or some other kind of iterable. When we passed a list comprehension into sum, we only really needed to pass in an iterable, not necessarily a list.

Let’s use a generator expression instead of a list comprehension. We can make a generator expression like this:

>>> squares = (n ** 2 for n in numbers)
>>> squares
<generator object <genexpr> at 0x7f733d4f7e10>
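Unlike a list comprehension, a generator expression does no work when it is created; each value is computed only when something asks for it. Here is a minimal sketch of that behavior. The square helper and the calls list are hypothetical names added purely to make the laziness visible.

```python
calls = []  # hypothetical log of which numbers have been squared so far


def square(n):
    """Square n and record the call (for illustration only)."""
    calls.append(n)
    return n ** 2


numbers = [1, 2, 3, 4]
squares = (square(n) for n in numbers)

calls_before = list(calls)  # nothing has been computed yet
first = next(squares)       # computes only the first square
calls_after = list(calls)

print(calls_before)  # []
print(first)         # 1
print(calls_after)   # [1]
```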

We can use a generator expression in our sum call like this:

>>> sum((n ** 2 for n in numbers))
30

When a generator expression is the only argument in a function call, we can leave off the redundant parentheses:

>>> sum(n ** 2 for n in numbers)
30
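The parentheses can only be dropped when the generator expression is the sole argument. As a small sketch, here is sum called with its optional second argument (a start value), where the generator expression has to keep its own parentheses:

```python
numbers = [1, 2, 3, 4]

# Sole argument: no extra parentheses needed.
total = sum(n ** 2 for n in numbers)

# With a second argument (sum's optional start value), the generator
# expression must keep its own parentheses; omitting them is a SyntaxError.
total_from_ten = sum((n ** 2 for n in numbers), 10)

print(total)           # 30
print(total_from_ten)  # 40
```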

Refactoring for Efficiency

Let’s refactor our code for storing reversible items from a dictionary file.

In our last iteration, we had three list comprehensions and one set.

with open('dictionary.txt') as dictionary_file:
    words = [
        line.rstrip()
        for line in dictionary_file
    ]
    words_over_five_letters = [
        word
        for word in words
        if len(word) > 5
    ]

# Store the reverse of all long words
reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])

reversible_words = [
    word
    for word in words_over_five_letters
    if word in reversed_words
]

for word in reversible_words:
    print(word)

The words list is only looped over once (to build words_over_five_letters), which makes it a prime candidate for a generator expression.

with open('dictionary.txt') as dictionary_file:
    words = (
        line.rstrip()
        for line in dictionary_file
    )
    words_over_five_letters = [
        word
        for word in words
        if len(word) > 5
    ]

Great! Our code no longer builds a list of every word in the file just to loop over it once. Here is the full program:

with open('dictionary.txt') as dictionary_file:
    words = (
        line.rstrip()
        for line in dictionary_file
    )
    words_over_five_letters = [
        word
        for word in words
        if len(word) > 5
    ]

# Store the reverse of all long words
reversed_words = set()
for word in words_over_five_letters:
    reversed_words.add(word[::-1])

reversible_words = [
    word
    for word in words_over_five_letters
    if word in reversed_words
]

for word in reversible_words:
    print(word)
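One detail this refactoring relies on: the words generator reads from the file lazily, so it has to be consumed while the file is still open. The sketch below illustrates this, using io.StringIO as a stand-in for dictionary.txt (an assumption, so the example runs without a file); late_words is a hypothetical name for a generator that is consumed too late.

```python
import io

# io.StringIO stands in for dictionary.txt so the sketch runs without a file.
with io.StringIO("stressed\ndesserts\ncat\n") as dictionary_file:
    words = (line.rstrip() for line in dictionary_file)
    # Consumed inside the with block, while the file is still open.
    words_over_five_letters = [word for word in words if len(word) > 5]

with io.StringIO("stressed\ndesserts\ncat\n") as dictionary_file:
    late_words = (line.rstrip() for line in dictionary_file)

# The file is closed by now, so reading through the generator fails.
try:
    list(late_words)
    error = None
except ValueError as exc:
    error = exc

print(words_over_five_letters)  # ['stressed', 'desserts']
print(type(error).__name__)     # ValueError
```

This is why words_over_five_letters stays a list comprehension inside the with block: it pulls every line through the generator before the file closes.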

Generator Review

List comprehensions are to lists, as generator expressions are to generators.

Remember that generators don’t work like other iterables because generators are iterators.

They can’t be indexed:

>>> squares[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'generator' object is not subscriptable

And they can’t tell us their length:

>>> len(squares)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'generator' has no len()

But you can loop over generators:

>>> for s in squares:
...     print(s)
...
1
4
9
16

But only once:

>>> for s in squares:
...     print(s)
...

Because generators are single-use iterables.
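One way to see the difference between a generator and a list is to ask each for an iterator. The following sketch (variable names are illustrative) shows that a generator hands back itself, which is why its position is never reset, while a list hands back a fresh iterator each time:

```python
numbers = [1, 2, 3, 4]
squares = (n ** 2 for n in numbers)

# An iterator returns itself from iter(); a list returns a new iterator.
generator_is_own_iterator = iter(squares) is squares
list_is_own_iterator = iter(numbers) is numbers

print(generator_is_own_iterator)  # True
print(list_is_own_iterator)       # False
```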

Do you remember how to loop over generators manually?

>>> squares = (n ** 2 for n in numbers)
>>> next(squares)
1
>>> next(squares)
4
>>> next(squares)
9
>>> next(squares)
16
>>> next(squares)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
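A for loop does roughly the same thing for us: it calls next repeatedly and stops quietly when StopIteration is raised. Here is a rough sketch of that loop written by hand, along with the optional default argument to next, which is returned instead of raising the exception. The seen and leftover names are only for illustration.

```python
numbers = [1, 2, 3, 4]
squares = (n ** 2 for n in numbers)

seen = []
while True:
    try:
        s = next(squares)
    except StopIteration:
        break  # the generator is exhausted
    seen.append(s)

# next() accepts a default to return instead of raising StopIteration.
leftover = next(squares, None)

print(seen)      # [1, 4, 9, 16]
print(leftover)  # None
```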

So why are generator expressions called generator expressions? Why not generator comprehensions? I don’t know.

Calling them generator comprehensions is fine because people will know what you mean.

Iteration Tools

Let’s learn some more built-in functions for working with iterables.

If we want to make sure everything in our list conforms to a certain rule, we can use the all function for that.

>>> all(n > 1 for n in numbers)
False
>>> all(n > 0 for n in numbers)
True

If we only want to make sure that at least one item in our list conforms to a certain rule, we can use the any function.

>>> any(n > 2 for n in numbers)
True
>>> any(n < 1 for n in numbers)
False
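Both any and all stop as soon as the answer is known, and a generator expression lets them skip the remaining work. Here is a small sketch; is_big and checked are hypothetical helpers that record which numbers actually get tested.

```python
checked = []  # hypothetical log of the numbers that were actually tested


def is_big(n):
    """Record each number we test (for illustration only)."""
    checked.append(n)
    return n > 2


numbers = [1, 2, 3, 4]
found = any(is_big(n) for n in numbers)

print(found)    # True
print(checked)  # [1, 2, 3]
```

With a list comprehension inside any, all four numbers would have been tested before any even started looking.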

If we want to find the smallest or largest value in a collection, we can use min or max:

>>> min(numbers)
1
>>> max(numbers)
4
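Like sum, min and max accept any iterable, including a generator expression, and they also take an optional key function. A brief sketch follows; the words list here is made up for the example.

```python
numbers = [1, 2, 3, 4]
words = ["kayak", "desserts", "tea"]  # made-up sample data

largest_square = max(n ** 2 for n in numbers)
shortest_length = min(len(word) for word in words)
longest_word = max(words, key=len)  # compare words by their length

print(largest_square)   # 16
print(shortest_length)  # 3
print(longest_word)     # desserts
```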

Generator Expression Exercises

These exercises are all in the generators.py file in the exercises directory. Edit the file to add the functions or fix the error(s) in the existing function(s). To run the test: from the exercises folder, type python test.py <function_name>, like this:

$ python test.py is_prime

Primality

Edit the function is_prime so that it returns True if a number is prime and False otherwise.

Example:

>>> from generators import is_prime
>>> is_prime(21)
False
>>> is_prime(23)
True

Hint

You might want to use any or all for this.

All Together

Edit the function all_together so that it takes any number of iterables and strings them together. Try using a generator expression to do it.

Example:

>>> from generators import all_together
>>> list(all_together([1, 2], (3, 4), "hello"))
[1, 2, 3, 4, 'h', 'e', 'l', 'l', 'o']
>>> nums = all_together([1, 2], (3, 4))
>>> list(all_together(nums, nums))
[1, 2, 3, 4]

Interleave

Edit the interleave function so that it accepts two iterables and returns a generator object with each of the given items “interleaved” (item 0 from iterable 1, then item 0 from iterable 2, then item 1 from iterable 1, and so on).

Example:

>>> from generators import interleave
>>> list(interleave([1, 2, 3, 4], [5, 6, 7, 8]))
[1, 5, 2, 6, 3, 7, 4, 8]
>>> nums = [1, 2, 3, 4]
>>> list(interleave(nums, (n**2 for n in nums)))
[1, 1, 2, 4, 3, 9, 4, 16]

Translate

Edit the function translate so that it takes a string in one language and translates each word into another language, returning the resulting string.

Here is an (over-simplified) example translation dictionary for translating from Spanish to English:

>>> words = {'esta': 'is', 'la': 'the', 'en': 'in', 'gato': 'cat', 'casa': 'house', 'el': 'the'}

Translate a sentence using your algorithm. An example of how this function should work:

>>> from generators import translate
>>> translate("el gato esta en la casa")
'the cat is in the house'

Parse Number Ranges

Edit the parse_ranges function so that it accepts a string containing ranges of numbers and returns a generator of the actual numbers contained in the ranges. The range numbers are inclusive.

It should work like this:

>>> from generators import parse_ranges
>>> list(parse_ranges('1-2,4-4,8-10'))
[1, 2, 4, 8, 9, 10]
>>> list(parse_ranges('0-0,4-8,20-21,43-45'))
[0, 4, 5, 6, 7, 8, 20, 21, 43, 44, 45]

Primes Over

Edit the function first_prime_over so that it returns the first prime number over a given number.

Example:

>>> from generators import first_prime_over
>>> first_prime_over(1000000)
1000003

Anagrams

Edit the function is_anagram so that it accepts two strings and returns True if the two strings are anagrams of each other. The function should use generator expressions. Make sure your function works with mixed case.

It should work like this:

>>> from generators import is_anagram
>>> is_anagram("tea", "eat")
True
>>> is_anagram("tea", "treat")
False
>>> is_anagram("sinks", "skin")
False
>>> is_anagram("Listen", "silent")
True

The function should also ignore spaces and punctuation:

>>> is_anagram("coins kept", "in pockets")
True
>>> is_anagram("a diet", "I'd eat")
True