Reputation: 234
I generated a list with some other Python code. Each element is a single-quoted string of comma-separated fields. I am trying to filter the elements by matching the D: field against prefixes stored in another file, which contains only the leading digits of the numbers.
data = ['A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT', 'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT', 'A:SET, B:FW.O, C:AS, D:+177232, E:+12355', 'A:SET, B:IT, C:AS, D:+368399793, E:+12355']
Printed one element per line, it looks like this:
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
I have another file that holds the numbers to filter by, which should be matched against the list above:
cat fields.txt
+36
+18
#these are country prefixes
I need to match the D: field of each list element against the starting numbers in "fields.txt" and print only the matching lines. Since the numbers in the D: field of "data" vary every time, I need to filter on their country prefix.
Expected output:
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT', ###matched as starting num +18 in D: col
'A:SET, B:IT, C:AS, D:+368399793, E:+12355' ###matched as starting num +36 in D: col
]
I have already tried writing a for loop based on various examples to match the numbers, but had no luck.
Please help me; I am new to Python programming.
Upvotes: 1
Views: 590
Reputation: 21757
You can do this with a list comprehension that includes an if condition. This has the benefit that the logic deciding which lines to include or exclude can be tucked away in a separate function (matches in the example below).
Having a separate function makes the logic easy to test, gives you a place for a docstring, and makes the code more maintainable.
data = [
"A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT",
"A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT",
"A:SET, B:FW.O, C:AS, D:+177232, E:+12355",
"A:SET, B:IT, C:AS, D:+368399793, E:+12355",
]
def load_codes():
    with open("fields.txt") as fieldfile:
        codes = fieldfile.read().splitlines()
    return codes


def matches(row, codes):
    for code in codes:
        if "D:%s" % code in row:
            return True
    return False


def main():
    codes = load_codes()
    filtered = [row for row in data if matches(row, codes)]
    for row in filtered:
        print(row)


if __name__ == "__main__":
    main()
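One caveat with reading fields.txt line by line: a blank line yields an empty code, and "D:" on its own is contained in every row, so everything would match. Below is a minimal sketch of a loader that skips blank lines and lines starting with #. The name parse_codes is my own and not part of the code above, and treating # lines as comments is an assumption based on the "#these are country prefixes" line in the question.

```python
def parse_codes(lines):
    """Return the usable prefixes from an iterable of text lines."""
    codes = []
    for line in lines:
        code = line.strip()
        # Skip blank lines and comment lines (assumed to start with "#").
        if not code or code.startswith("#"):
            continue
        codes.append(code)
    return codes


if __name__ == "__main__":
    # In-memory stand-in for the contents of fields.txt.
    sample = ["+36\n", "+18\n", "\n", "#these are country prefixes\n"]
    print(parse_codes(sample))
```

Since an open file is itself an iterable of lines, the result of open("fields.txt") could be passed to it directly inside a with block.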
Upvotes: 2
Reputation: 461
I don't think there is any need to split each item in the data list. You can simply do:
data = [
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
with open("fields.txt") as f:
    codes = f.read().splitlines()
required = []
for item in data:
    for code in codes:
        if "D:%s" % code in item:
            required.append(item)
            break  # stop at the first matching code so an item is not added twice
print(required)
You will end up with
[
'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
'A:SET, B:IT, C:AS, D:+368399793, E:+12355'
]
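The same filter can also be written as a list comprehension with any(), which stops at the first matching code, so a row cannot be added more than once. This is only a sketch: the codes are hard-coded here instead of being read from fields.txt, and the name filter_by_codes is made up for the example.

```python
def filter_by_codes(rows, codes):
    """Keep the rows whose text contains 'D:' followed by one of the codes."""
    return [row for row in rows if any("D:" + code in row for code in codes)]


if __name__ == "__main__":
    data = [
        'A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
        'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
        'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
        'A:SET, B:IT, C:AS, D:+368399793, E:+12355',
    ]
    codes = ["+36", "+18"]  # stand-in for the lines of fields.txt
    for row in filter_by_codes(data, codes):
        print(row)
```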
Upvotes: 1
Reputation: 2838
I think this solution suits your need:
with open("fields.txt") as f:
    codes = f.read().splitlines()
data = ['A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
        'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
        'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
        'A:SET, B:IT, C:AS, D:+368399793, E:+12355']
for index, item in enumerate(data):
    sub_items = item.replace(" ", "").split(",")  # remove spaces and split into individual fields
    for sub_item in sub_items:  # you can replace this loop with sub_items[3] if the position of D: is fixed
        if sub_item.startswith("D:"):
            value = sub_item.replace("D:", "")  # here you have +xxxx in the data point
            # you can apply the logic here:
            for code in codes:
                if value.startswith(code):
                    print(code, value, index, data[index])
It prints the following lines if fields.txt contains the numbers you mentioned in the question:
+18 +18700000 0 A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT
+36 +368399793 3 A:SET, B:IT, C:AS, D:+368399793, E:+12355
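If you only need the matching rows, a more compact variant is possible, because str.startswith also accepts a tuple of prefixes. The sketch below pulls out the D: value with a regular expression; the pattern assumes the value is always a + followed by digits, and the names d_value and filter_rows are invented for this example.

```python
import re

# Assumption: the D: field always looks like "D:+<digits>".
D_FIELD = re.compile(r"\bD:(\+\d+)")


def d_value(row):
    """Return the value of the D: field (e.g. '+18700000'), or None if absent."""
    match = D_FIELD.search(row)
    return match.group(1) if match else None


def filter_rows(rows, codes):
    """Keep the rows whose D: value starts with any of the given codes."""
    prefixes = tuple(codes)  # str.startswith accepts a tuple of prefixes
    result = []
    for row in rows:
        value = d_value(row)
        if value is not None and value.startswith(prefixes):
            result.append(row)
    return result


if __name__ == "__main__":
    data = ['A:SET, B:FW.O, C:AS, D:+18700000, E:+12355, F:ROOT',
            'A:SET, B:IT, C:AS, D:+22211111, E:+12355, F:ROOT',
            'A:SET, B:FW.O, C:AS, D:+177232, E:+12355',
            'A:SET, B:IT, C:AS, D:+368399793, E:+12355']
    for row in filter_rows(data, ["+36", "+18"]):
        print(row)
```

Note that an empty string among the codes would match every row that has a D: field, so blank lines in fields.txt should be dropped before calling filter_rows.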
Upvotes: 1