Python itertools notes - mfumi's diary

Version: Python 3.3
Reference: http://docs.python.org/3/library/itertools.html
The documentation includes Python implementations equivalent to the itertools functions, which makes it good study material.

・itertools.accumulate(iterable[, func])

In [2]: it = itertools.accumulate([1,2,3])

In [3]: for i in it:
   ...:     print(i)
   ...:
1
3   # 1+2
6   # 3+3

In [6]: it = itertools.accumulate([1,2,3],lambda x,y: x*y)

In [7]: for i in it:
   ...:     print(i)
   ...:
1
2    # 1*2
6    # 2*3

accumulate was added in Python 3.2, and apparently the func argument can be given from 3.3 onward.
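
To show that func is not limited to arithmetic, here is a small sketch that computes a running maximum and a running product. The sample data and variable names are made up for illustration.

```python
import itertools
import operator

data = [3, 1, 4, 1, 5, 9, 2, 6]

# Running maximum: each output is the largest value seen so far.
running_max = list(itertools.accumulate(data, max))

# Running product, using operator.mul instead of a lambda.
running_product = list(itertools.accumulate([1, 2, 3, 4], operator.mul))

print(running_max)      # [3, 3, 4, 4, 5, 9, 9, 9]
print(running_product)  # [1, 2, 6, 24]
```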

・itertools.chain(*iterables)
Returns an iterator that chains the given iterables together, one after another.

In [10]: it = itertools.chain([1,2,3],['a','b','c'])

In [11]: for i in it:
   ....:     print(i)
   ....:
1
2
3
a
b
c

・itertools.chain.from_iterable(iterable)
Similar to itertools.chain(*iterables), but takes a single argument: one iterable (for example a list) of iterables.

In [13]: it = itertools.chain.from_iterable([[1,2,3],['a','b','c']])

In [14]: for i in it:
   ....:     print(i)
   ....:
1
2
3
a
b
c
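
One practical difference is that chain.from_iterable accepts any iterable of iterables, including a generator, and reads it lazily. The sketch below uses a hypothetical generator named rows() to illustrate this.

```python
import itertools

def rows():
    # Hypothetical generator standing in for data that is produced lazily.
    yield [1, 2, 3]
    yield ('a', 'b')
    yield 'xy'   # a string is iterable too, so it is split into characters

# from_iterable pulls one inner iterable at a time from rows().
flat = list(itertools.chain.from_iterable(rows()))

# chain(*rows()) gives the same result, but unpacking consumes rows() up front.
same = list(itertools.chain(*rows()))

print(flat)  # [1, 2, 3, 'a', 'b', 'x', 'y']
```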

・itertools.combinations(iterable,r)
Returns the combinations of length r.

In [15]: it = itertools.combinations([1,2,3],2)

In [16]: for i in it:
   ....:     print(i)
   ....:
(1, 2)
(1, 3)
(2, 3)

・itertools.combinations_with_replacement(iterable,r)
Returns the combinations of length r, allowing the same element to be picked more than once.

In [17]: it = itertools.combinations_with_replacement([1,2,3],2)

In [18]: for i in it:
   ....:     print(i)
   ....:
(1, 1)
(1, 2)
(1, 3)
(2, 2)
(2, 3)
(3, 3)
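
As a sanity check, the number of results should match the usual counting formulas: C(n, r) without repetition and C(n + r - 1, r) with repetition. The helper n_choose_r below is a hypothetical name written just for this sketch.

```python
import itertools
from math import factorial

def n_choose_r(n, r):
    # Binomial coefficient C(n, r), computed with integer arithmetic.
    return factorial(n) // (factorial(r) * factorial(n - r))

items = [1, 2, 3]
r = 2
n = len(items)

plain = list(itertools.combinations(items, r))
with_rep = list(itertools.combinations_with_replacement(items, r))

print(len(plain), n_choose_r(n, r))             # 3 3
print(len(with_rep), n_choose_r(n + r - 1, r))  # 6 6
```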

・itertools.compress(data,selectors)
Returns an iterator over the elements of data whose corresponding element in selectors is true.
An example makes this clearer.

In [21]: it = itertools.compress(['A','B','C'],[True,False,True])

In [22]: for i in it:
   ....:     print(i)
   ....:
A
C

Added in Python 3.1.
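
The selectors do not have to be booleans; any truthy or falsy value works. The sketch below compares compress with a roughly equivalent list comprehension, using made-up sample values.

```python
import itertools

data = ['A', 'B', 'C', 'D']
selectors = [1, 0, 'yes', '']   # truthy / falsy values, not only True / False

picked = list(itertools.compress(data, selectors))

# Roughly the same thing written with zip() and a condition.
equivalent = [d for d, s in zip(data, selectors) if s]

print(picked)  # ['A', 'C']
```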

・itertools.count(start=0,step=1)
Returns an iterator that starts at start and increases by step each time.

In [27]: def generator():
   ....:     for i in range(10):
   ....:         yield i

In [30]: for i,v in zip(generator(),itertools.count()):
    print("{0}:{1}".format(i,v))
   ....:
0:0
1:1
2:2
3:3
4:4
5:5
6:6
7:7
8:8
9:9

Incidentally, zip() in Python 3 corresponds to itertools.izip() in Python 2.
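
The start and step arguments were not used above, so here is a short sketch with both. Since count() never stops on its own, it is paired with a finite list; the labels are made up.

```python
import itertools

labels = ['x', 'y', 'z']

# zip() stops when labels runs out, which ends the otherwise endless count().
numbered = list(zip(itertools.count(start=10, step=5), labels))

print(numbered)  # [(10, 'x'), (15, 'y'), (20, 'z')]
```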

・itertools.cycle(iterable)
Returns an iterator that cycles through the elements of the iterable endlessly.

In [2]: it = itertools.cycle([1,2,3])

In [3]: next(it)
Out[3]: 1

In [4]: next(it)
Out[4]: 2

In [5]: next(it)
Out[5]: 3

In [6]: next(it)
Out[6]: 1

In [7]: next(it)
Out[7]: 2

In [8]: next(it)
Out[8]: 3
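
One possible use of cycle is round-robin assignment. This is only a sketch, and the worker and task names are hypothetical.

```python
import itertools

workers = ['w1', 'w2', 'w3']        # hypothetical worker names
tasks = ['a', 'b', 'c', 'd', 'e']   # hypothetical task names

# zip() stops when tasks runs out, so the endless cycle is cut off safely.
assignment = list(zip(tasks, itertools.cycle(workers)))

print(assignment)
# [('a', 'w1'), ('b', 'w2'), ('c', 'w3'), ('d', 'w1'), ('e', 'w2')]
```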


・itertools.dropwhile(predicate,iterable)
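Scans from the start and returns an iterator that skips the elements before the first one for which predicate is False, then yields that element and everything after it.
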
In [12]: it = itertools.dropwhile(lambda x : x < 5, [1,2,4,2,5,3,6,2])

In [13]: for i in it:
   ...:     print(i)
   ...:
5
3
6
2
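
Note that once the predicate has failed, it is never checked again. The sketch below skips leading comment lines from some made-up input; a later comment line is kept.

```python
import itertools

lines = ['# header', '# more header', 'data 1', '# not dropped', 'data 2']

# Only the leading run of '#' lines is dropped.
body = list(itertools.dropwhile(lambda s: s.startswith('#'), lines))

print(body)  # ['data 1', '# not dropped', 'data 2']
```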

・itertools.filterfalse(predicate,iterable)
Returns an iterator over the elements of iterable for which predicate is False, i.e. the opposite of the built-in filter().

In [14]: it = itertools.filterfalse(lambda x : x < 5, [1,2,4,2,5,3,6,2])

In [15]: for i in it:
   ....:     print(i)
   ....:
5
6
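
Putting filter() and filterfalse() side by side makes the relationship easier to see: together they split the input into two parts. The function name is_small is made up for this sketch.

```python
import itertools

def is_small(x):
    return x < 5

values = [1, 2, 4, 2, 5, 3, 6, 2]

kept_by_filter = list(filter(is_small, values))                      # predicate is True
kept_by_filterfalse = list(itertools.filterfalse(is_small, values))  # predicate is False

print(kept_by_filter)       # [1, 2, 4, 2, 3, 2]
print(kept_by_filterfalse)  # [5, 6]
```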

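・itertools.groupby(iterable,key=None)
Compares consecutive elements by key and groups them together while the key stays the same. If key is not given, runs of identical consecutive elements are grouped.
Much like the uniq command.
Each iteration yields two values: the key, and the group of consecutive elements (which is itself an iterator).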

In [24]: for k,g in itertools.groupby('AAABDDDCCC'):
    print(k)
    print(list(g))
   ....:
A
['A', 'A', 'A']
B
['B']
D
['D', 'D', 'D']
C
['C', 'C', 'C']
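
Because only consecutive elements are grouped, the input usually has to be sorted by the same key first if one group per key is wanted. The sketch below shows the difference with a made-up word list and a hypothetical key function first_letter.

```python
import itertools

words = ['apple', 'avocado', 'banana', 'apricot', 'blueberry']

def first_letter(word):
    return word[0]

# Unsorted input: 'a' and 'b' each show up as two separate groups.
unsorted_groups = [(k, list(g))
                   for k, g in itertools.groupby(words, key=first_letter)]

# Sorting by the same key first gives one group per key.
sorted_groups = [(k, list(g))
                 for k, g in itertools.groupby(sorted(words, key=first_letter),
                                               key=first_letter)]

print(unsorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['banana']), ('a', ['apricot']), ('b', ['blueberry'])]
print(sorted_groups)
# [('a', ['apple', 'avocado', 'apricot']), ('b', ['banana', 'blueberry'])]
```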

・itertools.islice(iterable,stop), itertools.islice(iterable,start,stop[,step])
Returns an iterator over the selected range of elements of the iterable.

In [25]: it = itertools.islice('ABCDEFG',1,5,2)

In [26]: for i in it:
   ....:     print(i)
   ....:
B
D
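
islice also works on iterators that cannot be sliced with [], such as an endless count(). A minimal sketch follows; note that the underlying iterator keeps its position between calls.

```python
import itertools

evens = itertools.count(0, 2)   # 0, 2, 4, ... without end

first_five = list(itertools.islice(evens, 5))
next_three = list(itertools.islice(evens, 3))   # continues where the first slice stopped

print(first_five)  # [0, 2, 4, 6, 8]
print(next_three)  # [10, 12, 14]
```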

・itertools.permutations(iterable,r=None)
Returns the permutations of length r. If r is not given, it defaults to the length of the iterable.

In [27]: it = itertools.permutations('ABC')

In [28]: for i in it:
   ....:     print(i)
   ....:
('A', 'B', 'C')
('A', 'C', 'B')
('B', 'A', 'C')
('B', 'C', 'A')
('C', 'A', 'B')
('C', 'B', 'A')
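
For comparison, here is a sketch with r given explicitly, next to combinations with the same r. Order matters for permutations, so ('A', 'B') and ('B', 'A') both appear.

```python
import itertools

ordered = list(itertools.permutations('ABC', 2))
unordered = list(itertools.combinations('ABC', 2))

print(ordered)
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]
print(unordered)
# [('A', 'B'), ('A', 'C'), ('B', 'C')]
```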

・itertools.product(*iterables,repeat=1)
Returns the Cartesian product. product([0,1],repeat=3) means the same as product([0,1],[0,1],[0,1]).

In [34]: it = itertools.product([0,1],['x','y'])

In [35]: for i in it:
   ....:     print(i)
   ....:
(0, 'x')
(0, 'y')
(1, 'x')
(1, 'y')
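
The repeat form can be checked against an explicit nested loop. A small sketch follows; the rightmost position changes fastest in both versions.

```python
import itertools

bits = list(itertools.product([0, 1], repeat=3))

# The same thing written as three nested loops.
nested = [(a, b, c) for a in [0, 1] for b in [0, 1] for c in [0, 1]]

print(bits[:3])        # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
print(bits == nested)  # True
```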

・itertools.repeat(object[, times])
Returns an iterator that yields object times times. If times is not given, it repeats forever.
Its main use seems to be supplying a constant value to map() or zip().

In [37]: list(map(pow,range(10),itertools.repeat(2)))
Out[37]: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
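
Here is a short sketch of both forms: with times, and without times but paired with zip() so that it still ends. The values are made up.

```python
import itertools

three = list(itertools.repeat('x', 3))   # finite: exactly three items

# Without times, repeat() is endless; zip() stops with the shorter input.
tagged = list(zip(['a', 'b'], itertools.repeat('const')))

print(three)   # ['x', 'x', 'x']
print(tagged)  # [('a', 'const'), ('b', 'const')]
```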

・itertools.starmap(function,iterable)
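Returns an iterator over the results of calling function with each element of the iterable unpacked as its arguments.
It is similar to map(), but is meant for the case where the arguments are already zipped into tuples.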

In [46]: [i for i in itertools.starmap(pow,[(2,5),(3,2),(10,3)])]
Out[46]: [32, 9, 1000]

This is equivalent to:

In [50]: x = zip([2,3,10],[5,2,3])

In [51]: for i in itertools.starmap(pow,x):
    print(i)
    ....:
32
9
1000

Incidentally, with map() it would be:

In [52]: [i for i in map(pow,[2,3,10],[5,2,3])]
Out[52]: [32, 9, 1000]

・itertools.takewhile(predicate,iterable)
Scans from the start and returns an iterator over the elements up to, but not including, the first one for which predicate is False.

In [64]: it = itertools.takewhile(lambda x:x<5, [1,2,3,5,2,1])

In [65]: for i in it:
   ....:     print(i)
   ....:
1
2
3
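
takewhile and dropwhile are complementary: with the same predicate and the same list, one gives the leading part and the other gives the rest. A small sketch follows, with a hypothetical predicate named below_five.

```python
import itertools

values = [1, 2, 3, 5, 2, 1]

def below_five(x):
    return x < 5

head = list(itertools.takewhile(below_five, values))   # stops at the first failure
tail = list(itertools.dropwhile(below_five, values))   # starts at the first failure

print(head)  # [1, 2, 3]
print(tail)  # [5, 2, 1]
```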

・itertools.tee(iterable,n = 2)
Returns n independent iterators from a single iterable. The default for n is 2.

In [66]: it1,it2 = itertools.tee([1,2,3])

In [67]: for i in it1:
   ....:     print(i)
   ....:
1
2
3

In [68]: for i in it2:
   ....:     print(i)
   ....:
1
2
3

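After creating several iterators with tee, do not use the original iterable any more: if it is itself an iterator, advancing it makes the tee iterators skip values without being told. The advantage of tee over creating several iterators separately is that tee buffers the values, so the source is read only once, which can mean less I/O and faster execution. The flip side is that it can use a lot of memory. This becomes clear from the implementation of tee.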

import collections

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                try:
                    newval = next(it)   # fetch a new value and
                except StopIteration:
                    return              # source exhausted: end this generator (needed since PEP 479)
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)

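As this shows, tee creates one iterator from the iterable and prepares n deques. Whenever one of the returned iterators needs a new value and its own deque is empty, it fetches the next value from the shared iterator and appends it to every deque, including those of the other iterators.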
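
The sketch below illustrates two points from this section. First, it shows the pitfall of touching the source after calling tee. Second, it shows one typical use of tee: walking the same data at two offsets. The helper pairwise is defined here for illustration only.

```python
import itertools

# Pitfall: advancing the source after tee() makes both copies miss a value.
source = iter([1, 2, 3, 4])
a, b = itertools.tee(source)
skipped = next(source)        # consumes 1 behind tee's back
from_a = list(a)
from_b = list(b)
print(skipped, from_a, from_b)  # 1 [2, 3, 4] [2, 3, 4]

def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    first, second = itertools.tee(iterable)
    next(second, None)        # shift the second copy forward by one
    return zip(first, second)

pairs = list(pairwise([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```

In pairwise the two copies stay only one element apart, so the internal buffer stays small. Memory use grows when one copy runs far ahead of the other.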

・itertools.zip_longest(*iterables,fillvalue=None)
Similar to zip(), but when the iterables have different lengths, the positions belonging to an iterable that has already run out are filled with fillvalue.

In [69]: it = itertools.zip_longest(['A','B','C','D'],[1,2],fillvalue='-')

In [70]: for i in it:
   ....:     print(i)
   ....:
('A', 1)
('B', 2)
('C', '-')
('D', '-')
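
For comparison, here is a sketch of the same inputs with plain zip(), which stops at the shortest input, and with zip_longest() using the default fillvalue of None.

```python
import itertools

letters = ['A', 'B', 'C', 'D']
numbers = [1, 2]

short = list(zip(letters, numbers))                     # truncated to the shorter input
padded = list(itertools.zip_longest(letters, numbers))  # padded with None by default

print(short)   # [('A', 1), ('B', 2)]
print(padded)  # [('A', 1), ('B', 2), ('C', None), ('D', None)]
```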