Reputation: 59
I have a large text file, sample.txt, with over 54000 columns. They are ordered like this:
1011001 1 1001164 981328 1 -9 A G G G G G C C A . . . .
1011002 1 1001164 981328 1 -9 A G G G G G A C A . . . .
I need to re-order the columns as follows:
1 1011001 1001164 981328 1 -9 A G G G G G C C A . . . .
1 1011002 1001164 981328 1 -9 A G G G G G A C A . . . .
That is, I want the second column to become the first one.
Is there some way for me to do this with Python?
Upvotes: 0
Views: 172
Reputation: 71620
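With a list comprehension (note that this version writes only the first three columns):
    with open(filename, 'r') as f:
        # Split each line once, then swap the first two fields.
        l = [' '.join([p[1], p[0], p[2]]) + '\n' for p in (i.split() for i in f)]
    with open(filename, 'w') as f:
        f.writelines(l)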
Or, in this case, to keep all the remaining columns:
    with open(filename, 'r') as f:
        # p[2:] is a list, so concatenate it rather than nesting it inside join().
        l = [' '.join([p[1], p[0]] + p[2:]) + '\n' for p in (i.split() for i in f)]
    with open(filename, 'w') as f:
        f.writelines(l)
Upvotes: 2
Reputation: 195623
With 54000 columns I would use a regular expression, which is fast:
    import re

    with open('sample.txt', 'r') as f_in, open('sample_out.txt', 'w', newline='') as f_out:
        for line in f_in:
            # Collect every run of non-whitespace characters.
            g = re.findall(r'\S+', line)
            if len(g) >= 2:
                f_out.write(' '.join([g[1], g[0]] + g[2:]) + '\n')
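Since only the first two columns move, a plain `str.split` with `maxsplit=2` may be enough, and it avoids tokenizing all 54000 fields on every line. This is a minimal sketch, assuming the columns are whitespace-separated; the helper name `swap_first_two` is made up for illustration and is not part of the answer above.

```python
def swap_first_two(line):
    """Return `line` with its first two whitespace-separated columns swapped."""
    # maxsplit=2 yields at most three pieces: column 1, column 2, and the untouched rest.
    parts = line.rstrip("\n").split(None, 2)
    if len(parts) < 2:
        return line.rstrip("\n")  # blank or single-column line: leave it as is
    parts[0], parts[1] = parts[1], parts[0]
    return " ".join(parts)


print(swap_first_two("1011001 1 1001164 981328 1 -9 A G\n"))
# prints: 1 1011001 1001164 981328 1 -9 A G
```

The function could be applied to each line inside the same read/write loop as above.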
Upvotes: 2
Reputation: 6748
Try this:
    elements = []
    with open(filename, "r") as f:
        for e in f.readlines():
            # Strip the trailing newline so it is not duplicated when joining below.
            line = e.rstrip("\n").split(" ")
            line0 = line[0]
            line[0] = line[1]
            line[1] = line0
            elements.append(" ".join(line))
    with open(filename, "w") as f:
        f.write("\n".join(elements))
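Alternatively, if the above code crashes due to file size, you can write each line out as it is processed instead of collecting them all in memory:
    with open(filename, "r") as f:
        with open(filename2, "w") as f2:
            for e in f:  # iterate lazily rather than loading every line
                # Strip the trailing newline so it is not written twice.
                line = e.rstrip("\n").split(" ")
                line0 = line[0]
                line[0] = line[1]
                line[1] = line0
                f2.write(" ".join(line) + "\n")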
... where filename2 is some other filename. Once you have run the code, replace filename with filename2, and you are done.
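The final "replace filename with filename2" step can also be done from Python. The sketch below is one possible way, not the answer's own method: it streams to a temporary file in the same directory and then calls `os.replace`. The function name `swap_columns_in_place` is hypothetical, and single-space-separated columns are assumed.

```python
import os
import tempfile


def swap_columns_in_place(path):
    """Rewrite `path` with the first two columns of every line swapped."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the original so os.replace stays on one filesystem.
    with open(path, "r") as src, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as dst:
        for line in src:  # only one line is held in memory at a time
            parts = line.rstrip("\n").split(" ")
            if len(parts) >= 2:
                parts[0], parts[1] = parts[1], parts[0]
            dst.write(" ".join(parts) + "\n")
    # Both files are closed here; swap the new file in for the original.
    os.replace(dst.name, path)
```

Calling `swap_columns_in_place("sample.txt")` would then overwrite sample.txt with the reordered columns, so it may be worth keeping a backup of the original first.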
Upvotes: 2