Ben

Reputation: 16620

Python regex question

I'm having trouble figuring out a solution to this problem.

I want to read a file line by line and check whether each line contains one of two characters (1 or 0). I then need to sum up the values on the line (that is, count the 1s) and also find the index (position) of each "1" character.

So, for example:

 1001

would result in:

line 1=(count:2, pos:[0,3])

I tried a lot of variations of something like this:

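import re
import urllib

r = urllib.urlopen(remote_resource)  # remote_resource: placeholder for the URL
positions = []  # renamed from "list" to avoid shadowing the built-in
for line in r:
    for m in re.finditer(r'1', line):
        positions.append(m.start())  # every match lands in the same flat list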

I'm having two issues:

1) I thought the best solution would be to iterate through each line and then apply the regex finditer function to it. My problem is that I keep failing to write a for loop that works. Despite my best efforts, I keep getting the results back as one long list rather than a multidimensional array of dictionaries.

Is this approach the right one? If so, how do I write the correct for loop?

If not, what else should I try?

Upvotes: 1

Views: 218

Answers (3)

unutbu

Reputation: 879611

Perhaps do it without regex:

import urllib

url = 'http://stackoverflow.com/questions/5158168/python-regex-question/5158341'
# urllib.urlopen is the Python 2 API; in Python 3 it moved to urllib.request.urlopen
f = urllib.urlopen(url)
for linenum, line in enumerate(f):
    print(line)
    # Collect the index of every '1' character on this line
    locations = [pos for pos, char in enumerate(line) if char == '1']
    print('line {n}=(count:{c}, pos:{l})'.format(
        n=linenum,
        c=len(locations),
        l=locations
        ))
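
If the results need to be kept rather than printed, the same comprehension can build one dictionary per line. This is only a minimal sketch, assuming the lines are already available as text strings; the function name summarize_lines and the keys 'count' and 'pos' are illustrative and do not come from the question.

```python
def summarize_lines(lines):
    """Return one dict per line with the count and positions of '1'."""
    results = []
    for line in lines:
        # Drop the trailing newline; it is not part of the data
        text = line.rstrip('\n')
        positions = [pos for pos, char in enumerate(text) if char == '1']
        results.append({'count': len(positions), 'pos': positions})
    return results


sample = ['1001\n', '110001\n', '00000\n']
for linenum, summary in enumerate(summarize_lines(sample)):
    print('line {n}=(count:{c}, pos:{p})'.format(
        n=linenum, c=summary['count'], p=summary['pos']))
```

Because each line gets its own dictionary, the positions never run together into one long list.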

Upvotes: 4

Sumod

Reputation: 3846

unutbu's code works fine. I tested it on a sample file that also includes a line consisting entirely of 0's. Here is the complete code -


#! /usr/bin/python

# Write a program to read a text file which has 1's and 0's on each line
# For each line count the number of 1's and their position and print it

import sys

def countones(infile):
  f = open(infile,'r')
  for linenum, line in enumerate(f):
    locations = [pos for pos,char in enumerate(line) if char == '1']
    print('line {n}=(count:{c}, pos:{l})'.format(n=linenum,c=len(locations),l=locations))


def main():
  infile = './countones.txt'
  countones(infile)

# Standard boilerplate to call the main() function to begin the program
if __name__ == '__main__':
  main()


Input file -

1001
110001
111111
00001
010101
00000

Result -

line 0=(count:2, pos:[0, 3])
line 1=(count:3, pos:[0, 1, 5])
line 2=(count:6, pos:[0, 1, 2, 3, 4, 5])
line 3=(count:1, pos:[4])
line 4=(count:3, pos:[1, 3, 5])
line 5=(count:0, pos:[])

Upvotes: 0

Wooble

Reputation: 89917

Using regexes here is probably a bad idea. You can check whether a line of text contains a 1 or a 0 with the expression '0' in line or '1' in line, and you can get the number of 1s with line.count('1').

Finding all of the locations of 1s does require iterating through the string, I believe.
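
A small sketch of both points, using only built-in string methods. The helper name find_ones is made up for illustration; it walks the string with str.find instead of a regex.

```python
def find_ones(line):
    """Return the index of every '1' in line, using str.find."""
    positions = []
    pos = line.find('1')
    while pos != -1:
        positions.append(pos)
        # Resume the search just past the previous match
        pos = line.find('1', pos + 1)
    return positions


line = '1001'
print('0' in line or '1' in line)   # membership test, no regex needed
print(line.count('1'))              # number of '1' characters
print(find_ones(line))              # positions of each '1'
```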

Upvotes: 1
