How a generator works
A Sentence class was discussed in the previous post.
However, the versions of Sentence created so far are not Pythonic.
Let's look at another version.
import re
import reprlib

# Raw string, so that \w reaches the regex engine unchanged.
RE_WORD = re.compile(r"\w+")

class Sentence(object):
    def __init__(self, txt):
        self.words = RE_WORD.findall(txt)

    def __repr__(self):
        return "Sentence({})".format(reprlib.repr(self.words))

    def __iter__(self):
        for ele in self.words:
            yield ele
        return  # not needed
The return above is not needed: the function simply falls through and returns automatically. Either way, a generator function does not raise StopIteration itself; it just exits when it is done producing values.
The __iter__ method above is a generator function: calling it returns a generator object, which is also an iterator.
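To check that claim, here is a small sketch. It assumes a trimmed-down copy of the class above, named WordList purely for illustration so that it does not clash with Sentence. It shows that iter() on an instance gives back a generator, and that the generator satisfies the iterator protocol.

```python
import re
import types
from collections.abc import Iterator

RE_WORD = re.compile(r"\w+")


class WordList:
    """Hypothetical stand-in for Sentence, kept minimal for the demo."""

    def __init__(self, txt):
        self.words = RE_WORD.findall(txt)

    def __iter__(self):
        # The yield makes __iter__ a generator function.
        for ele in self.words:
            yield ele


s = WordList("time flies like an arrow")
it = iter(s)  # calls s.__iter__(), which builds a new generator

is_generator = isinstance(it, types.GeneratorType)
is_iterator = isinstance(it, Iterator)
returns_itself = iter(it) is it  # an iterator returns itself from __iter__
first_word = next(it)

print(is_generator, is_iterator, returns_itself, first_word)
# prints: True True True time
```

Because every call to iter(s) builds a fresh generator, the same instance can be looped over more than once.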
What is a generator function?
Any Python function that has the yield keyword in its body is a generator function: a function which, when called, returns a generator object. In other words, a generator function is a generator factory.
For example:
def gen_123():
    yield 1
    yield 2
    yield 3
>>> gen_123 # gen_123 is still a function object
<function gen_123 at 0x...>
>>> gen_123()  # calling it returns a generator object
<generator object gen_123 at 0x...>
>>> for i in gen_123():
...     print(i)
...
1
2
3
>>> g = gen_123()
>>> next(g)
1
>>> next(g)
2
>>> next(g)
3
>>> next(g)  # calling next() on an exhausted generator raises StopIteration
Traceback (most recent call last):
  ...
StopIteration
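The "generator factory" idea can also be checked directly. The sketch below is only an illustration: it uses inspect.isgeneratorfunction from the standard library, and the variable names are made up for the demo. It shows that each call to gen_123 builds a separate generator with its own position.

```python
import inspect


def gen_123():
    yield 1
    yield 2
    yield 3


# gen_123 is an ordinary function object, flagged as a generator function.
is_gen_func = inspect.isgeneratorfunction(gen_123)

# Each call builds a separate generator that keeps its own state.
g1 = gen_123()
g2 = gen_123()

first_from_g1 = next(g1)   # g1 runs up to its first yield
second_from_g1 = next(g1)  # g1 resumes and reaches its second yield
first_from_g2 = next(g2)   # g2 is unaffected and starts from the top

print(is_gen_func, first_from_g1, second_from_g1, first_from_g2)
# prints: True 1 2 1
```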
When the function body returns, the enclosing generator object raises StopIteration, in accordance with the iterator protocol.
The yield keyword suspends execution of the generator function body until the next value is requested.
The following example demonstrates this.
>>> def gen_AB():
...     print("start...")
...     yield 'A'
...     print("continue")
...     yield 'B'
...     print("end")
...
>>> for c in gen_AB():
...     print('--->', c)
...
start...
---> A
continue
---> B
end
>>>
The for loop machinery catches the StopIteration exception, so the loop terminates cleanly.
With that, a more Pythonic Sentence class can be written:
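What the for loop does behind the scenes can be approximated by hand. This is a rough sketch, not the interpreter's actual implementation; the seen list is a hypothetical helper added only to record the values.

```python
def gen_AB():
    print("start...")
    yield 'A'
    print("continue")
    yield 'B'
    print("end")


seen = []  # hypothetical helper: records the values that were yielded
it = iter(gen_AB())  # a for loop starts by calling iter() on its target
while True:
    try:
        c = next(it)  # runs the generator body up to the next yield
    except StopIteration:
        break  # the body returned, so leave the loop quietly
    print('--->', c)
    seen.append(c)
```

The printed lines come out in the same order as with the for loop: "end" appears during the third next() call, just before StopIteration is raised and swallowed.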
class Sentence(object):
    def __init__(self, txt):
        self.txt = txt

    def __repr__(self):
        return "Sentence({})".format(reprlib.repr(self.txt))

    def __iter__(self):
        # finditer yields match objects lazily, one word at a time.
        for match in RE_WORD.finditer(self.txt):
            yield match.group()
The finditer method also returns an iterator (of match objects), so the words are produced on demand and the full word list never has to be held in memory.
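The difference between the two regex methods can be seen in a short sketch. The sample text and variable names are made up for illustration; islice comes from the standard itertools module.

```python
import re
from collections.abc import Iterator
from itertools import islice

RE_WORD = re.compile(r"\w+")
text = "the quick brown fox jumps over the lazy dog"

all_words = RE_WORD.findall(text)  # builds the complete list right away
matches = RE_WORD.finditer(text)   # returns an iterator of match objects

findall_is_list = isinstance(all_words, list)
finditer_is_iterator = isinstance(matches, Iterator)

# Pull only the first two words; later matches are produced on demand.
first_two = [m.group() for m in islice(matches, 2)]
next_word = next(matches).group()  # iteration resumes where it stopped

print(findall_is_list, finditer_is_iterator, first_two, next_word)
# prints: True True ['the', 'quick'] brown
```

For a short sentence the saving is negligible, but for a large text the lazy version avoids building a list whose length matches the number of words.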