Reputation: 287
I have the following code, which compares the items in the first column of input file 1 with the contents of input file 2:
import os
newfile2=[]
outfile=open("outFile.txt","w")
infile1=open("infile1.txt", "r")
infile2=open("infile2.txt","r")
for file1 in infile1:
    #print file1
    file1=str(file1).strip().split("\t")
    print file1[0]
    for file2 in infile2:
        if file2 == file1[0]:
            outfile.write(file2.replace(file2,file1[1]))
        else:
            outfile.write(file2)
input file 1:
Modex_xxR_SL1344_3920 Modex_sseE_SL1344_3920
Modex_seA_hemN Modex_polA_SGR222_3950
Modex_GF2333_3962_SL1344_3966 Modex_ertd_wedS
input file 2:
Sardes_xxR_SL1344_4567
Modex_seA_hemN
MOdex_uui_gytI
Since the item in column 1, row 2 of input file 1 matches an item in input file 2 (row 2), the column 2 item from input file 1 should replace that item in the output file, as follows (required output):
Sardes_xxR_SL1344_4567
Modex_polA_SGR222_3950
MOdex_uui_gytI
So far my code is only outputting the items in input file 1. Can someone help me modify this code? Thanks.
Upvotes: 1
Views: 139
Reputation: 54253
Looks like you have a TSV file, so let's go ahead and treat it as such. We'll build a TSV reader, csv.reader(fileobj, delimiter="\t"), that will iterate through infile1 and build a translation dict from it. The dictionary will have keys of the first column and values of the second column per row.
Then, using dict.get, we can translate the line from infile2 if it exists in our translation dict, or just write the line itself if there's no translation available.
import csv

with open("infile1.txt", 'r') as infile1,\
     open('infile2.txt', 'r') as infile2,\
     open('outfile.txt', 'w') as outfile:
    trans_dict = dict(csv.reader(infile1, delimiter="\t"))
    for line in infile2:
        outfile.write(trans_dict.get(line.strip(),line.strip()) + "\n")
Result:
# contents of outfile.txt
Sardes_xxR_SL1344_4567
Modex_polA_SGR222_3950
MOdex_uui_gytI
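As for why the nested loop in the question cannot produce this output, here is a rough sketch of two things that are probably going on. A file object is an iterator, so the inner loop consumes infile2 completely on the first pass and finds nothing on later passes. Lines read from a file also keep their trailing newline, so they never compare equal to the stripped key. The snippet assumes Python 3 and uses io.StringIO as a hypothetical stand-in for infile2.txt so it runs without any files on disk.

```python
import io

# Hypothetical in-memory stand-in for infile2.txt (same three lines as the question).
infile2 = io.StringIO("Sardes_xxR_SL1344_4567\nModex_seA_hemN\nMOdex_uui_gytI\n")

first_pass = [line for line in infile2]   # consumes the whole "file"
second_pass = [line for line in infile2]  # nothing left: the iterator is exhausted

print(len(first_pass))   # 3
print(len(second_pass))  # 0

# Each line still ends with "\n", so a raw comparison against the key fails.
print(first_pass[1] == "Modex_seA_hemN")          # False
print(first_pass[1].strip() == "Modex_seA_hemN")  # True
```

Building the dict once and then making a single pass over infile2 sidesteps both problems, which is why the version above avoids nesting the loops.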
EDIT as per your comment:
import csv

with open("infile1.txt", 'r') as infile1:
    # build our translation dict
    trans_dict = dict(csv.reader(infile1, delimiter="\t"))

with open("infile2.txt", 'r') as infile2,\
     open("outfile.txt", 'w') as outfile:
    # open the file to translate and our output file
    reader = csv.reader(infile2, delimiter="\t")
    # treat our file to translate like a tsv file instead of flat text
    for line in reader:
        outfile.write("\t".join([trans_dict.get(col, col) for col in line]) + "\n")
        # map each column from trans_dict, writing the whole row
        # back re-tab-delimited with a trailing newline
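If you want to try the column-wise version without touching real files, something like the following sketch should work. It assumes Python 3; the function name translate_columns and the sample strings are made up for illustration, and it swaps the manual "\t".join for csv.writer, which is an alternative rather than what the code above does. Note that dict(csv.reader(...)) expects every mapping row to have exactly two columns.

```python
import csv
import io


def translate_columns(mapping_text, target_text):
    """Replace every cell of target_text that appears as a key in mapping_text.

    Both arguments are tab-separated text; this helper is hypothetical and
    only exists to exercise the approach in memory.
    """
    # Each mapping row must have exactly two columns for dict() to accept it.
    trans_dict = dict(csv.reader(io.StringIO(mapping_text), delimiter="\t"))

    out = io.StringIO()
    # csv.writer handles the tab joining and the trailing newline for us.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in csv.reader(io.StringIO(target_text), delimiter="\t"):
        # Fall back to the original cell when there is no translation.
        writer.writerow([trans_dict.get(col, col) for col in row])
    return out.getvalue()


# Made-up sample data: one mapping row, and a target with a two-column row.
mapping = "Modex_seA_hemN\tModex_polA_SGR222_3950\n"
target = "Sardes_xxR_SL1344_4567\tModex_seA_hemN\nMOdex_uui_gytI\n"
print(translate_columns(mapping, target))
```

With the sample data, only the second cell of the first row has a key in the mapping, so it is the only cell that changes.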
Upvotes: 2