Reputation: 9
I am writing a script to pull lines from fileA whose first two columns match those in fileB. fileA is a text file with three tab-separated columns, and fileB is a text file with two tab-separated columns. But it keeps raising the following error: "site=a[0]+'\t'+a[1] IndexError: list index out of range". However, I can print a[0], a[1], a[2] and site.
Here is my code:
fileB=open('AtoG_mock.txt').readlines()
fileA=open('depth_ice.txt').readlines()
outfile=open('AtoG_depth_ice.txt','w')
dict1={}
for line in fileA:
    a=line.strip().split('\t')
    site = a[0]+' '+a[1]
    if site not in dict1:
        dict1[site]=a[2]
for line in fileB:
    b=line.strip().split('\t')
    site=b[0]+' '+b[1]
    if site in dict1:
        outfile.write(b[0]+'\t'+b[1]+'\t'+dict1[site]+'\n')
outfile.close()
I would appreciate any help!
Upvotes: 0
Views: 96
Reputation: 1036
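I've run into this occasionally, and most of the time the cause is blank lines. You may want to check for them inside the loop, before splitting. Note that a line read from a file still carries its newline, so compare the stripped line rather than testing against "":
if not line.strip(): continue
The other possibility is missing data, i.e. a line with fewer than three columns:
if len(a) != 3: print(a)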
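Putting both checks together, here is a minimal sketch of how the dictionary could be built safely. It assumes Python 3 and the tab-separated layout described in the question; `build_depth_lookup` and the sample rows are hypothetical names and data for illustration, not part of the original script.

```python
def build_depth_lookup(lines):
    """Map (col1, col2) -> col3, skipping blank or incomplete lines."""
    lookup = {}
    for line in lines:
        if not line.strip():
            continue  # blank line, e.g. a trailing newline at end of file
        fields = line.rstrip('\r\n').split('\t')
        if len(fields) < 3:
            # missing data: report it instead of raising IndexError
            print("skipping malformed line:", repr(line))
            continue
        key = (fields[0], fields[1])
        # keep the first value seen for a site, as the original script does
        lookup.setdefault(key, fields[2])
    return lookup


# Made-up sample rows: one good, one blank, one short, one duplicate site.
sample = ["chr1\t100\t12\n", "\n", "chr1\t200\n", "chr1\t100\t99\n"]
print(build_depth_lookup(sample))
```

Using a tuple as the key avoids having to pick a separator character that might also occur in the data.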
Upvotes: 0
Reputation: 336198
We can't give you a definitive answer because we don't have access to your data files, but you can easily debug this by wrapping the problematic line in a try/except block:
try:
    site = a[0]+' '+a[1]
except IndexError:
    print("Error: a is", a)
    raise
Most probably there's an empty line somewhere, often the last line.
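If you would rather locate every offending line up front instead of stopping at the first exception, something along these lines may help. This is only a sketch: `find_short_lines` is a hypothetical helper, and it deliberately mirrors the `line.strip().split('\t')` parsing from the question. One consequence of that parsing is that `strip()` also removes trailing tabs, so a line whose last column is empty ends up with fewer fields than expected.

```python
def find_short_lines(lines, min_fields=2, sep='\t'):
    """Return (line_number, raw_line) pairs that have fewer than min_fields columns."""
    bad = []
    for number, line in enumerate(lines, start=1):
        # same parsing as the question: strip whitespace, then split on tabs
        if len(line.strip().split(sep)) < min_fields:
            bad.append((number, line))
    return bad


# Made-up sample: line 2 has an empty second column, line 3 is blank.
sample = ["chr1\t100\t12\n", "chr1\t\n", "\n"]
for number, line in find_short_lines(sample):
    print("line", number, "is too short:", repr(line))
```

Since an open file can be iterated line by line, passing `open('depth_ice.txt')` in place of `sample` should report the same information for the real file.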
Upvotes: 2