Reputation: 149
I am trying to read many files and write one specific column from each of them to another file. I have read how this can be done, but my attempt is not working. Could someone help me implement pawk in my script?
j = j + 1
#with open('a1_gather_{j}.txt'.format(j=j)) as f2:
f2 = open('a1_gather_{j}.txt'.format(j=j), 'w')
k=k+1
print k
f1 = open('a1_{k}'.format(k=k))
# with open('a1_{k}'.format(k=k), 'a') as f1:
lines = f1.readlines()
for i, line in enumerate(lines):
    print i
    if line.startswith(searchquery):
        f2.write(line)
        f2.write(lines[i + 1])
        f2.write(lines[i + 2])
        i = i+1
    else:
        i = i+1
#os.close(f1)
f1.close()
# awk '{a[FNR]=a[FNR]?a[FNR]" "$2:$2}END{for(i=1;i<=length(a);i++)print a[i]}' *
f2.close()
I preferred f = open instead of with open, to avoid the error IOError: (9, 'Bad file descriptor').
The files to read have 1000 lines and two columns. I need only the second column of each file to be written to another file.
Could someone correct my script and show how pawk can be used?
Upvotes: 0
Views: 122
Reputation: 9620
For this particular task, you could abandon Python completely and use the cut command instead:
cut -f2 a1_{1..10}.txt > a1_gather.txt
By default, cut uses tabs as the column separator; the -d option lets you change that. The {start..stop} notation (shell brace expansion) gives you finer control over which files get matched.
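If you would rather stay in Python, the behaviour of cut described above can be approximated in a few lines. This is only a minimal sketch: the function name cut_field and its parameters are made up for illustration, and it assumes every column is separated by exactly one delimiter character (a tab by default, as with cut).

```python
def cut_field(in_paths, out_path, field=2, delimiter="\t"):
    """Rough stand-in for `cut -f<field> -d<delimiter>` over several files.

    cut_field and its arguments are hypothetical names, not part of any library.
    """
    with open(out_path, "w") as out:
        for path in in_paths:
            with open(path) as src:
                for line in src:
                    parts = line.rstrip("\n").split(delimiter)
                    if len(parts) == 1:
                        # No delimiter on this line: cut prints it unchanged
                        out.write(parts[0] + "\n")
                    elif len(parts) >= field:
                        # Fields are 1-based, like cut's -f option
                        out.write(parts[field - 1] + "\n")
                    else:
                        # Delimiter present but too few fields: empty output line
                        out.write("\n")
```

Calling it with the list ["a1_{}.txt".format(k) for k in range(1, 11)] and the output name "a1_gather.txt" would correspond to the command above. Pass delimiter=" " for space-separated files, the counterpart of cut -d' '.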
Upvotes: 1
Reputation: 149
f2 = open('a1_gather_{j}.txt'.format(j=j), 'w')
f1 = open('a1_{k}.txt'.format(k=k))
lines = f1.readlines()
for i, line in enumerate(lines):
    print(repr(line))
    f2.write(line)
    i = i+1
f1.close()
f2.close()
This version reads and writes whole lines. I can work with this too, but any suggestion on how to read and write only the second column would be appreciated and welcome.
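For reference, one way to keep only the second column is to split each line on whitespace and write out the second field. The sketch below is a guess at what is needed rather than a tested fix for the script above: the function name write_second_column is invented here, and it assumes the two columns are separated by spaces or tabs.

```python
def write_second_column(in_paths, out_path):
    """Write the second whitespace-separated field of every input line to out_path.

    write_second_column is a hypothetical helper name used for illustration.
    """
    with open(out_path, "w") as out:
        for path in in_paths:
            with open(path) as src:
                for line in src:
                    fields = line.split()  # splits on any run of spaces or tabs
                    if len(fields) >= 2:   # skip blank or one-column lines
                        out.write(fields[1] + "\n")
```

For example, write_second_column(['a1_{k}.txt'.format(k=k) for k in range(1, 11)], 'a1_gather.txt') would gather ten files into one; the range of 1 to 10 is only an assumption about how the files are numbered. Both files are opened in with blocks here, so they are closed even if a read or write fails partway through.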
Upvotes: 0
Reputation: 416
Assuming you need to write the second column from the a1_* files to an a1_gather file and, as you mentioned in a comment, want to use awk, you could run this simple command in a terminal:
cat a1_* | awk '{print $2}' > a1_gather
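Where:
cat a1_* reads every file whose name starts with a1_
awk '{print $2}' prints the second column
> a1_gather saves the output to the file a1_gather
Note that a1_* also matches a1_gather itself once that file exists, so when re-running the command it is safer to use an output name that the pattern does not match.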
If you are interested in using awk more in the future, this is a really useful tutorial to learn from.
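The awk one-liner quoted in the comment of the question's script does something slightly different from the command above: it joins the second columns side by side, one output column per input file, instead of stacking them. If that layout is what is wanted, a Python version might look like the sketch below. The name paste_second_columns and the sep parameter are made up for illustration, and the sketch assumes all input files have the same number of lines.

```python
from itertools import zip_longest


def paste_second_columns(in_paths, out_path, sep=" "):
    """Put the second column of each input file side by side in out_path.

    paste_second_columns is a hypothetical helper name used for illustration.
    """
    columns = []
    for path in in_paths:
        with open(path) as src:
            column = []
            for line in src:
                fields = line.split()
                # Keep row alignment: use an empty cell when there is no second field
                column.append(fields[1] if len(fields) >= 2 else "")
            columns.append(column)

    with open(out_path, "w") as out:
        # Row n of the output holds row n of every file's second column;
        # shorter files are padded with empty cells
        for row in zip_longest(*columns, fillvalue=""):
            out.write(sep.join(row) + "\n")
```

With ten input files of 1000 lines each, this would produce one file with 1000 lines and ten columns.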
Upvotes: 0