Reputation: 9
I have the following input text file:
17,21.01.2019,0,0,0,0,E,75,meter tamper alarm
132,22.01.2019,64,296,225,996,A,,
150,23.01.2019,63,353,351,805,A,,
213,24.01.2019,64,245,244,970,A,,
201,25.01.2019,86,297,364,943,A,,
56,26.01.2019,73,678,678,1437,A,,
201,27.01.2019,83,654,517,1212,A,,
117,28.01.2019,58,390,202,816,A,,
69,29.01.2019,89,354,282,961,C,,
123,30.01.2019,53,267,206,852,A,,
I need to write a Python program that parses this file, finds all the lines not containing A or C, and writes those lines to a new file. I'm completely stuck after trying several regexes :( Can you help me?
Upvotes: 0
Views: 61
Reputation: 10960
Try
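with open('filename') as f:
    for line in f.readlines():
        # Keep the line only if it contains neither 'A' nor 'C' (needs `and`, not `or`)
        if 'A' not in line and 'C' not in line:
            # The line already ends with a newline, so don't add another
            print(line, end='')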
Or better: since your file content looks like CSV (comma-separated values), use pandas, which makes this kind of filtering easier.
Read the file
import pandas as pd
df = pd.read_csv('filename', header=None, sep=',')
     0           1   2    3    4     5  6     7                   8
0   17  21.01.2019   0    0    0     0  E  75.0  meter tamper alarm
1  132  22.01.2019  64  296  225   996  A   NaN                 NaN
2  150  23.01.2019  63  353  351   805  A   NaN                 NaN
3  213  24.01.2019  64  245  244   970  A   NaN                 NaN
4  201  25.01.2019  86  297  364   943  A   NaN                 NaN
5   56  26.01.2019  73  678  678  1437  A   NaN                 NaN
6  201  27.01.2019  83  654  517  1212  A   NaN                 NaN
7  117  28.01.2019  58  390  202   816  A   NaN                 NaN
8   69  29.01.2019  89  354  282   961  C   NaN                 NaN
9  123  30.01.2019  53  267  206   852  A   NaN                 NaN
Output
print(df[~df[6].str.contains('A|C', regex=True)])
    0           1  2  3  4  5  6     7                   8
0  17  21.01.2019  0  0  0  0  E  75.0  meter tamper alarm
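Since the question asks for the matching lines in a new file rather than on screen, here is a minimal sketch of how the filtered frame could be written out. The function name `write_non_ac_rows` and the file paths are made up for illustration. It assumes the status letter is always in column 6, and it reads everything as strings so that `75` is not turned into `75.0` and empty fields stay empty.

```python
import pandas as pd


def write_non_ac_rows(src_path, dst_path, status_col=6):
    """Write rows whose status column is neither 'A' nor 'C' to dst_path.

    Hypothetical helper; assumes the status letter lives in column `status_col`.
    """
    # dtype=str and keep_default_na=False keep the original text of every field
    df = pd.read_csv(src_path, header=None, dtype=str, keep_default_na=False)
    # Exact match on the status column, instead of a substring regex
    mask = ~df[status_col].isin(["A", "C"])
    df[mask].to_csv(dst_path, header=False, index=False)
    return int(mask.sum())
```

Calling something like `write_non_ac_rows('filename', 'filtered.txt')` would return the number of rows written. Using `isin` rather than `str.contains('A|C')` is a design choice: it only matches when the whole field is `A` or `C`.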
Upvotes: 3
Reputation: 4215
Try:
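with open(r'file.txt', 'r') as f:
    for line in f:
        # Keep the line only if it contains neither 'A' nor 'C' (needs `and`, not `or`)
        if 'A' not in line and 'C' not in line:
            # The line already ends with a newline, so don't add another
            print(line, end='')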
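A substring test on the whole line would also drop a row whose free-text field happens to contain a capital A or C. If that matters, a sketch using the standard `csv` module can compare only the status field and write the kept rows to a new file, as the question asks. The function name `filter_rows`, the column index 6, and the file names are assumptions based on the sample data, not anything confirmed by the asker.

```python
import csv


def filter_rows(src_path, dst_path, status_col=6, excluded=("A", "C")):
    """Copy rows whose status field is not in `excluded` to dst_path.

    Hypothetical helper; assumes the status letter is the 7th comma-separated field.
    """
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst, lineterminator="\n")
        for row in csv.reader(src):
            # Skip blank or truncated lines that have no status field
            if len(row) <= status_col:
                continue
            if row[status_col] not in excluded:
                writer.writerow(row)
                kept += 1
    return kept
```

For example, `filter_rows('file.txt', 'filtered.txt')` would write the single `E` row from the sample input and return the count of rows kept.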
Upvotes: 0