Sanathana

Reputation: 284

Extract specific word and the value after it from text file

My input file looks like this:

1 sentences, 6 words, 1 OOVs
1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0
7 words+sents, rank1wSent= 0 rank5wSent= 0 rank10wSent= 0 qloss= 0.925606 absloss= 0.856944

file input.txt : 1 sentences, 6 words, 1 OOVs
1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0
7 words+sents, rank1wSent= 0 rank5wSent= 0 rank10wSent= 0 qloss= 0.925606 absloss= 0.856944

I want to extract the word ppl and the value that follows it, in this case: ppl=15911.4

I am using this code:

with open("input.txt") as openfile:
    for line in openfile:
       for part in line.split():
          if "ppl=" in part:
              print part

However, this extracts only the word ppl, not the value. I would also like to print the file name alongside it.

Expected Output:

input.txt, ppl=15911.4

How can I fix this?

Upvotes: 1

Views: 8424

Answers (3)

shahram kalantari

Reputation: 863

You can fix it by using a simple counter:

found = False
with open("input.txt") as openfile:
     for line in openfile:
         if not found:
             counter = 0
             for part in line.split():
                  counter = counter + 1
                  if "ppl=" in part:
                      print part
                      print line.split()[counter]
                      found = True

Upvotes: 2

Avinash Raj

Reputation: 174706

You can use the enumerate function:

with open("input.txt") as openfile:
    for line in openfile:
       s = line.split()
       for i,j in enumerate(s):
          if j == "ppl=":
              print s[i],s[i+1]

Example:

>>> fil = '''1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0'''.splitlines()
>>> for line in fil:
        s = line.split()
        for i,j in enumerate(s):
            if j == "ppl=":
                print s[i],s[i+1]


ppl= 15911.4
>>> 

To print only the first value:

>>> for line in fil:
        s = line.split()
        for i,j in enumerate(s):
            if j == "ppl=":
                print s[i],s[i+1]
        break

ppl= 15911.4
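
The same enumerate idea can be wrapped in a function that stops at the first match, whichever line it is on, and formats the result the way the question asks. This is only a sketch, assuming Python 3; the names first_ppl, lines and label are made up for illustration and are not part of the answer above.

```python
def first_ppl(lines, label):
    """Return 'label, ppl=VALUE' for the first 'ppl=' token, or None."""
    for line in lines:
        tokens = line.split()
        for i, token in enumerate(tokens):
            # Exact comparison, so the 'ppl1=' token is never picked up.
            # The bounds check guards against 'ppl=' being the last token.
            if token == "ppl=" and i + 1 < len(tokens):
                return "{}, ppl={}".format(label, tokens[i + 1])
    return None


# Sample lines copied from the question; a real run would pass an open file.
sample = [
    "file input.txt : 1 sentences, 6 words, 1 OOVs",
    "1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704",
    "6 words, rank1= 0 rank5= 0 rank10= 0",
]
print(first_ppl(sample, "input.txt"))  # input.txt, ppl=15911.4
```

With a real file, something like first_ppl(open("input.txt"), "input.txt") should behave the same way, since a file object yields its lines one at a time.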

Upvotes: 6

dkhamrick

Reputation: 402

You could assign the list returned by line.split() to a variable, then use a while loop with i as a counter to iterate through it. When you hit 'ppl=', print 'ppl=' together with the element at the next index:

with open("input.txt") as openfile:
    for line in openfile:
        phrases = line.split()
        i = 0
        while i < len(phrases):
            # 'ppl=' is its own token; the value is the token after it
            if 'ppl=' in phrases[i]:
                print "ppl= " + str(phrases[i + 1])
            i += 1
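
If indexing into the split list feels fragile, a regular expression is a different technique that might also do the job. The sketch below assumes Python 3 and the line format shown in the question; PPL_PATTERN and find_ppl_values are hypothetical names chosen for the example.

```python
import re

# 'ppl' must start at a word boundary and be followed directly by '=',
# which rules out 'ppl1='. The value is the next run of non-space characters.
PPL_PATTERN = re.compile(r"\bppl=\s*(\S+)")


def find_ppl_values(text):
    """Return every value that follows a 'ppl=' token in text."""
    return PPL_PATTERN.findall(text)


line = "1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704"
print(find_ppl_values(line))  # ['15911.4']
```

Because findall returns all matches, running it over the whole file contents would give one value per ppl= occurrence; take the first element if only the first is wanted.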

Upvotes: 0
