python arbitrarily incrementing an iterator inside a loop

Question or problem about Python programming:

I am probably going about this in the wrong manner, but I was wondering how to handle this in python.

First some c code:

int i;

for(i=0;i<100;i++){
  if(i == 50)
    i = i + 10;
  printf("%i\n", i);
}

Ok so we never see the 50's...

My question is, how can I do something similar in python? For instance:

for line in cdata.split('\n'):
  if exp.match(line):
    #increment the position of the iterator by 5?
    pass
  print line

With my limited experience in python, I only have one solution, introduce a counter and another if statement. break the loop until the counter reaches 5 after exp.match(line) is true.

There has got to be a better way to do this, hopefully one that does not involve importing another module.

Thanks in advance!

How to solve the problem:

Solution 1:

There is a fantastic package in Python called itertools.

But before I get into that, it'd serve well to explain how the iteration protocol is implemented in Python. When you want to provide iteration over your container, you specify the __iter__() class method that provides an iterator type. "Understanding Python's 'for' statement" is a nice article covering how the for-in statement actually works in Python and provides a nice overview on how the iterator types work.

Take a look at the following:

>>> sequence = [1, 2, 3, 4, 5]
>>> iterator = sequence.__iter__()
>>> iterator.next()
1
>>> iterator.next()
2
>>> for number in iterator:
    print number 
3
4
5

Now back to itertools. The package contains functions for various iteration purposes. If you ever need to do special sequencing, this is the first place to look into.

At the bottom you can find the Recipes section that contain recipes for creating an extended toolset using the existing itertools as building blocks.

And there's an interesting function that does exactly what you need:

def consume(iterator, n):
    '''Advance the iterator n-steps ahead. If n is none, consume entirely.'''
    collections.deque(itertools.islice(iterator, n), maxlen=0)

Here's a quick, readable, example on how it works (Python 2.5):

>>> import itertools, collections
>>> def consume(iterator, n):
    collections.deque(itertools.islice(iterator, n))
>>> iterator = range(1, 16).__iter__()
>>> for number in iterator:
    if (number == 5):
        # Disregard 6, 7, 8, 9 (5 doesn't get printed just as well)
        consume(iterator, 4)
    else:
        print number

1
2
3
4
10
11
12
13
14
15

Solution 2:

itertools.islice:

lines = iter(cdata.splitlines())
for line in lines:
    if exp.match(line):
       #increment the position of the iterator by 5
       for _ in itertools.islice(lines, 4):
           pass
       continue # skip 1+4 lines
    print line

For example, if exp, cdata are:

exp = re.compile(r"skip5")
cdata = """
before skip
skip5
1 never see it
2 ditto
3 ..
4 ..
5 after skip
6 
"""

Then the output is:

before skip
5 after skip
6

Python implementation of the C example

i = 0
while i < 100:
    if i == 50:
       i += 10
    print i
    i += 1

As @[Glenn Maynard] pointed out in the comment if you need to do a very large jumps such as i += 100000000 then you should use explicit while loop instead of just skipping steps in a for loop.

Here's the example that uses explicit while loop instead islice:

lines = cdata.splitlines()
i = 0
while i < len(lines):
    if exp.match(lines[i]):
       #increment the position of the iterator by 5
       i += 5
    else:
       print lines[i]
       i += 1

This example produces the same output as the above islice example.

Solution 3:

If you're doing it with numbers a list comprehension can work:

for i in [x for x in range(0, 99) if x < 50 and x > 59]:
    print i

Moving an iterator forward is a bit more difficult though. I'd suggest setting your list up beforehand if you don't want to do the counter approach, probably by splitting cdata, then working out the indexes of the matching line and removing that line and the following ones. Apart from that you're stuck with the counter approach which isn't nearly as unpleasant as you make it out to be to be honest.

Another option is this:

iterator = iter(cdata.split('\n'))
for line in iterator:
    if exp.match(line):
        for i in range(0, 5):
            try:
                iterator.next()
            except StopIteration:
                break
    else:
        print line

Solution 4:

Not exactly sure I follow your thought process but here is something to feed on..

for i in range(len(cdata.split('\n'))):
  if i in range(50,60): continue
  line = cdata[i]
  if exp.match(line):
    #increment the position of the iterator by 5?
    pass
  print line

Not sure what you are really after but the range(len(..)) should help you.

Solution 5:

You can drop values from an iterator

def dropvalues(iterator, vals):
    for i in xrange(vals): iterator.next()

Now just make sure you have an iterator object to work on with lines = iter(cdata.split('\n')); and loop over it.

Solution 6:

for your example, as you're working with lists (indexable sequences) and not with iterators, I would recommend the following:

lines = cdata.split("\n")
for line in lines[:50]+lines[60:]:
  print line

that's not the most efficient since it potentially constructs 3 new lists (but if the skipped part is bigger that the processed part, it could be more efficient than the other options), but it's quite clean and explicit.

If you don't mind to use the itertools module, you can convert the lists to sequences easily:

from itertools import chain, islice
for line in chain(islice(lines, None, 50), islice(lines, 60,None)):
  print line

Solution 7:

Maybe with genexps. Not pretty but...

Something like that:

>>> gx = (line for line in '1 2 x 3 4 5 6 7 x 9 10 11 12 x 1'.split('\n'))
>>> for line in gx:
...   if line == 'x':
...      for i in range(2):
...          line = gx.next()
...   print line

The only problem is to ensure that gx can be next()-ed. The above example purposely generates an exception due to the last x.

Solution 8:

I can't parse the question vary well because there's this block of confusing and irrelevant C code. Please delete it.

Focusing on just the Python code and the question about how to skip 5 lines...

lineIter= iter( cdata.splitlines() )
for line in lineIter:
  if exp.match(line):
    for count in range(5):
        line = lineIter.next()
  print line