Reputation: 49
Here is an example. I am trying to print the lines of a file that contain the integer -9999:
19940325 78 -28 -9999
19940326 50 17 102
19940327 100 -11 -9999
19940328 56 -33 0
19940329 61 -39 -9999
19940330 61 -56 0
19940331 139 -61 -9999
19940401 211 6 0
Here is my code, which uses a regex to scan the text file for the integer -9999
and print only the line or lines that contain it.
import re
file = open("USC00110072.txt", "r")
for line in file.readlines():
    if re.search('^-9999$', line, re.I):
        print line
My code runs without error but doesn't show anything in the output. Please let me know what mistake I have made.
Upvotes: 0
Views: 120
Reputation: 103844
You can use filter:
with open(fn) as f:
    print filter(lambda line: '-9999' in line.split()[-1], f)
This will check if '-9999' is in the final column of the line.
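Note that the snippet above is Python 2. In Python 3, filter returns a lazy iterator, so printing it directly shows a filter object rather than the lines. A minimal Python 3 sketch, assuming an in-memory stand-in for the file (the io.StringIO object and the name matches are only for illustration) and using an exact comparison on the last column instead of a substring test:

```python
import io

# In-memory stand-in for the file, using two sample lines from the question.
f = io.StringIO("19940325 78 -28 -9999\n19940326 50 17 102\n")

# filter() is lazy in Python 3, so wrap it in list() to collect the lines.
# Assumes no blank lines: line.split()[-1] raises IndexError on an empty line.
matches = list(filter(lambda line: line.split()[-1] == '-9999', f))

for line in matches:
    print(line.strip())
```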
If you want to use a regex:
import re

with open(fn) as f:
    for line in f:
        if re.search(r'-9999$', line): # remove $ if the -9999 can be anywhere in the line
            print line.strip()
The ^ you have will never match except for a line that contains only -9999 and nothing else. The ^ indicates the start of the line.
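To make the anchor behavior concrete, here is a small sketch in Python 3 syntax. The two sample lines are copied from the question and inlined rather than read from the file, and the variable names are only for illustration:

```python
import re

# Sample lines copied from the question's data (inlined instead of reading the file).
lines = [
    "19940325 78 -28 -9999\n",
    "19940326 50 17 102\n",
]

# '^-9999$' requires the whole line to be exactly -9999, so nothing matches here.
anchored_both = [ln.strip() for ln in lines if re.search(r'^-9999$', ln)]

# '-9999$' only requires the line to end with -9999.
# ($ also matches just before a trailing newline.)
anchored_end = [ln.strip() for ln in lines if re.search(r'-9999$', ln)]

print(anchored_both)
print(anchored_end)
```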
Or, just use in to test for the presence of the string:
with open(fn) as f:
    for line in f:
        if '-9999' in line:
            print line.strip()
Upvotes: 1
Reputation: 51807
Alternatively, since your file is CSV-like (tab-delimited in this example), you could use the csv module:
import csv
import io
file = io.StringIO(u'''
19940325\t78\t-28\t-9999
19940326\t50\t17\t102
19940327\t100\t-11\t-9999
19940328\t56\t-33\t0
19940329\t61\t-39\t-9999
19940330\t61\t-56\t0
19940331\t139\t-61\t-9999
19940401\t211\t6\t0
'''.strip())
reader = csv.reader(file, delimiter='\t')
for row in reader:
    if row[-1] == '-9999': # or, for regex, `re.match(r'^-9999$', row[-1])`
        print('\t'.join(row))
Upvotes: 1
Reputation: 117876
Regex is likely overkill for this; a simple substring check using the in operator seems sufficient:
with open("USC00110072.txt") as f:
    for line in f:
        if '-9999' in line:
            print(line)
Or, if you're concerned about matching it only as a "whole word", you can do a little more to split up the values:
with open("USC00110072.txt") as f:
    for line in f:
        if '-9999' in line.strip().split('\t'):
            print(line)
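As a rough illustration of how the two checks differ, here is a sketch in Python 3. The value -99990 is made up for this example and does not appear in the question's data. split() with no argument is used here as an assumption, so the check works whether the columns are separated by spaces or tabs:

```python
# Hypothetical lines: the second contains -99990, which is NOT in the original data.
lines = [
    "19940325 78 -28 -9999",
    "19940326 50 17 -99990",
]

# Substring check: also matches -99990, because it contains '-9999'.
substring_hits = [ln for ln in lines if '-9999' in ln]

# Token check: split on any whitespace and compare whole fields.
token_hits = [ln for ln in lines if '-9999' in ln.split()]

print(substring_hits)
print(token_hits)
```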
Upvotes: 3