Reputation: 173
I have two files: f1 contains user input and f2 is a database. I want to check whether each string from f1 appears in the database (f2) and print the ones that do not exist in f2. My code is not working correctly. Here is f1:
rbs003491
rbs003499
rbs003531
rbs003539
rbs111111
Here is f2:
AHPTUR13,rbs003411
AHPTUR13,rbs003419
AHPTUR13,rbs003451
AHPTUR13,rbs003459
AHPTUR13,rbs003469
AHPTUR13,rbs003471
AHPTUR13,rbs003479
AHPTUR13,rbs003491
AHPTUR13,rbs003499
AHPTUR13,rbs003531
AHPTUR13,rbs003539
AHPTUR13,rbs003541
AHPTUR13,rbs003549
AHPTUR13,rbs003581
In this case it should return rbs111111, because it is not in f2.
Code is:
with open(c,'r') as f1:
    s1 = set(x.strip() for x in f1)
print s1
with open("/tmp/ARNE/blt",'r') as f2:
    for line in f2:
        if line not in s1:
            print line
Upvotes: 0
Views: 135
Reputation: 368904
If you only care about the second part of each line (rbs003411 from AHPTUR13,rbs003411):
with open(user_input_path) as f1, open('/tmp/ARNE/blt') as f2:
    not_found = set(f1.read().split())
    for line in f2:
        _, found = line.strip().split(',')
        not_found.discard(found)  # remove found word
print not_found
# for x in not_found:
#     print x
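For anyone who wants to try the discard approach without the files, here is a minimal sketch that runs on in-memory lines. It assumes Python 3 (so `print` is a function, unlike the snippet above), and the function name `missing_ids` and the sample lists are made up for illustration. It also skips blank lines, which the two-value unpacking above would fail on.

```python
def missing_ids(user_lines, db_lines):
    """Return the user IDs that never appear as the last field of a db line."""
    # Start by assuming every non-empty user entry is missing.
    not_found = {line.strip() for line in user_lines if line.strip()}
    for line in db_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        found = line.rsplit(',', 1)[-1]  # text after the last comma
        not_found.discard(found)  # no error if the ID was never requested
    return not_found


# Hypothetical sample data modelled on the question's files.
user_lines = ["rbs003491\n", "rbs003499\n", "rbs111111\n"]
db_lines = ["AHPTUR13,rbs003491\n", "AHPTUR13,rbs003499\n", "\n"]

print(missing_ids(user_lines, db_lines))
```

Because the result is a set, the order of the missing IDs is not guaranteed.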
Upvotes: 1
Reputation: 107287
You need to compare only the last part of each f2 line, not the whole line. Split each line from f2 on , and take the last part (x.strip().split(',')[-1]). Also, if you want to check whether strings from f1 are in the database (f2), your logic is reversed: you need to build the set from f2 and then test each line of f1 against it:
with open(c,'r') as f1,open("/tmp/ARNE/blt",'r') as f2:
    s1 = set(x.strip().split(',')[-1] for x in f2)
    print s1
    for line in f1:
        if line.strip() not in s1:
            print line
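The same idea can be sketched as a small function that keeps the missing entries in the order they appear in f1. This is only an illustration in Python 3 syntax; `report_missing` and the sample lists are hypothetical names, not part of the original code.

```python
def report_missing(user_lines, db_lines):
    """Return user entries, in input order, that are absent from the database."""
    # Build the lookup set from the database: last comma-separated field per line.
    known = {line.strip().split(',')[-1] for line in db_lines}
    missing = []
    for line in user_lines:
        entry = line.strip()
        if entry and entry not in known:
            missing.append(entry)
    return missing


# Hypothetical sample data modelled on the question's files.
user_lines = ["rbs111111\n", "rbs003491\n", "rbs999999\n"]
db_lines = ["AHPTUR13,rbs003491\n", "AHPTUR13,rbs003499\n"]

for entry in report_missing(user_lines, db_lines):
    print(entry)
```

Building the set from the database makes each membership test a constant-time lookup on average, so the cost grows with the combined size of the two files rather than their product.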
Upvotes: 0
Reputation: 127
Your line variable in the for loop will contain something like "AHPTUR13,rbs003411" (plus a trailing newline), but you are only interested in the second part. You should do something like:
for line in f2:
    line = line.strip().split(",")[1]
    if line not in s1:
        print line
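To see why the original comparison never matches, here is a short Python 3 sketch with made-up values standing in for the set and one line read from f2:

```python
# Hypothetical stand-ins for the set built from f1 and one raw line from f2.
s1 = {"rbs003491", "rbs111111"}
raw_line = "AHPTUR13,rbs003491\n"

whole_line_match = raw_line in s1             # prefix and newline are still attached
stripped_match = raw_line.strip() in s1       # newline gone, prefix still attached
key = raw_line.strip().split(",")[1]          # keep only the part after the comma
key_match = key in s1

print(whole_line_match, stripped_match, key_match)
```

Only the last test succeeds, because set membership compares whole strings rather than substrings.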
Upvotes: 0