GeMa

Reputation: 149

Python: write a specific column from many files to one file

I am trying to read many files and write one specific column from each of them to another file. I read how it can be done, but it is not working. Could someone help me implement pawk in my script?

    j = j + 1   
    #with open('a1_gather_{j}.txt'.format(j=j)) as f2:  
    f2 = open('a1_gather_{j}.txt'.format(j=j), 'w')
        k=k+1
        print k 
        f1 = open('a1_{k}'.format(k=k))
        # with open('a1_{k}'.format(k=k), 'a') as f1:
        lines = f1.readlines()
        for i, line in enumerate(lines):
            print i
            if line.startswith(searchquery):
                f2.write(line)
                f2.write(lines[i + 1])
                f2.write(lines[i + 2])
                i = i+1
            else :
                i = i+1
        #os.close(f1)
        f1.close()

# awk '{a[FNR]=a[FNR]?a[FNR]" "$2:$2}END{for(i=1;i<=length(a);i++)print a[i]}' *

f2.close()

I preferred f = open(...) over with open(...) to avoid the error IOError: (9, 'Bad file descriptor').

Each input file has 1000 lines and two columns. I need only the second column of each file to be written to another file.

Could someone correct my script and indicate how pawk can be used?

Upvotes: 0

Views: 122

Answers (3)

arekolek

Reputation: 9620

For this particular task, you could abandon Python completely and use the cut command instead:

cut -f2 a1_{1..10}.txt > a1_gather.txt

By default, cut uses the tab character as the column separator; the -d option lets you change that.

The shell's {start..stop} brace expansion gives you finer control over which files get matched.

Upvotes: 1

GeMa

Reputation: 149

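# j and k are assumed to be set by the enclosing loops, which are not shown here
f2 = open('a1_gather_{j}.txt'.format(j=j), 'w')
f1 = open('a1_{k}.txt'.format(k=k))
for line in f1:
    print(repr(line))   # show the raw line, including the trailing newline
    f2.write(line)      # copy the whole line unchanged
f1.close()
f2.close()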

This one reads and writes whole lines. I can work with this as well, but any suggestion on how to read and write only the second column is appreciated and welcome.
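A minimal sketch of writing only the second column, assuming the columns are separated by whitespace. The function name gather_second_column and the file names in the usage note are illustrative and not part of the original script.

```python
def gather_second_column(in_paths, out_path):
    """Write the second whitespace-separated field of every input line to out_path."""
    with open(out_path, 'w') as out:
        for path in in_paths:
            with open(path) as src:
                for line in src:
                    fields = line.split()
                    # Skip blank or one-column lines instead of raising IndexError.
                    if len(fields) >= 2:
                        out.write(fields[1] + '\n')
```

It could be called as gather_second_column(['a1_{k}.txt'.format(k=k) for k in range(1, 11)], 'a1_gather.txt'), which stacks the second columns of the ten files one after another.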

Upvotes: 0

Patryk Perduta

Reputation: 416

Assuming you need to write the second column from the a1_* files to an a1_gather file, and want to use awk as you mentioned in a comment, you could run a simple command in the terminal:

cat a1_* | awk '{print $2}' > a1_gather

Where:

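  • cat a1_* reads every file whose name starts with a1_
  • awk '{print $2}' prints the second column of each line
  • > a1_gather saves the output to the file a1_gather (this name also matches a1_*, so delete the file before rerunning the command, or pick an output name outside the pattern)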
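The command above stacks the second columns one after another. If they should instead end up side by side, one output column per input file, which is what the awk one-liner quoted in the question does, a Python sketch could look like the following. It assumes whitespace-separated columns, at least two fields on every non-empty line, and the same number of lines in every file; paste_second_columns is an illustrative name.

```python
def paste_second_columns(in_paths, out_path):
    """Write one row per line number, holding column 2 of each input file."""
    columns = []
    for path in in_paths:
        with open(path) as src:
            # Second whitespace-separated field of every non-empty line.
            columns.append([line.split()[1] for line in src if line.strip()])
    with open(out_path, 'w') as out:
        # zip stops at the shortest file, so unequal lengths are truncated.
        for row in zip(*columns):
            out.write(' '.join(row) + '\n')
```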

In case you become more interested in using awk in the future, this is a really useful tutorial to learn from.

Upvotes: 0
