Sunday, February 12, 2012

Easy Permutation Calculation in Python

I was recently doing some Python programming, and had to compute all possible combinations of items from two separate lists. At first, I thought, I'll just do it this way:

list1 = ['a', 'b', 'c']
list2 = ['x', 'y', 'z']
for i in list1:
    for j in list2:
        print i, j
a x
a y
a z
b x
b y
b z
c x
c y
c z

However, my loop body was something that was a little more complicated than a print, and couldn't be done well inside of a loop like that. And besides - this is Python we're talking about. Surely there is a better way!

Well, there is - and don't call me Shirley. Details coming up after the break!

So, the first solution I found involved the itertools module in Python, and it looks like this:

import itertools
for i in itertools.product(list1, list2):
    print i
('a', 'x')
('a', 'y')
('a', 'z')
('b', 'x')
('b', 'y')
('b', 'z')
('c', 'x')
('c', 'y')
('c', 'z')

Note that i here is a tuple, containing the i and j from our first example as elements. If you wanted them separate, Python lets you do this:

for i, j in itertools.product(list1, list2):
    print i, j
a x
a y
a z
b x
b y
b z
c x
c y
c z

So, if that solves your problem, great! You can stop here. But if you want to know a little more about what's going on behind the scenes, keep reading...

Let's look into the details of itertools.product, to see what it's doing. The documentation says this:

Cartesian product of input iterables.

Equivalent to nested for-loops in a generator expression. For example, product(A, B) returns the same as ((x,y) for x in A for y in B).

Okay, so what do we get if we try that equivalent code?

((x,y) for x in list1 for y in list2)
<generator object <genexpr> at 0x01A454E0>

What the heck is that? Well, it's a generator object. Generator objects are basically lists that are lazily evaluated. The items in the list are not in memory, which can make them very nice if you have to iterate over some large objects that shouldn't all be in memory. Items in the list also aren't saved after they've been generated, so if you want to hang on to them, you'll have to stick them in a variable. If you want to iterate over your generator manually, use __next__():

a.__next__():
('a', 'x')
a.__next__():
 
('a', 'y')

Also, note that this is the same syntax you might use for a list comprehension, except for the square brackets:

[(x,y) for x in list1 for y in list2]
 
[('a', 'x'), ('a', 'y'), ('a', 'z'), ('b', 'x'), ('b', 'y'), ('b', 'z'), ('c', 'x'), ('c', 'y'), ('c', 'z')]
 

Pretty cool stuff. I may go into how you can make your own custom generator objects later, but for now, enjoy!

No comments:

Post a Comment