Arun Venkitusamy
Arun Venkitusamy

Reputation: 47

awk command to print multiple columns using for loop

I am having a single file in which it contains 1st and 2nd column with item code and name, then from 3rd to 12th column which contains its 10 days consumption quantity continuously. Now i need to convert that into 10 different files. In each the 1st and 2nd column should be the same item code and item name and the 3rd column will contain the consumption quantity of one day in each..

input file:

Code  | Name | Day1 | Day2 | Day3 |... 

10001 | abcd | 5 | 1 | 9 |...    
10002 | degg | 3 | 9 | 6 |...    
10003 | gxyz | 4 | 8 | 7 |...

I need the Output in different file as

file 1:

Code  | Name | Day1

10001 | abcd | 5   
10002 | degg | 3   
10003 | gxyz | 4   

file 2:

Code  | Name | Day2

10001 | abcd | 1   
10002 | degg | 9   
10003 | gxyz | 8  

file 3:

Code  | Name | Day3

10001 | abcd | 9   
10002 | degg | 6   
10003 | gxyz | 7 

and so on....

I wrote a code like this

awk 'BEGIN { FS = "\t" } ; {print $1,$2,$3}' FILE_NAME > file1;
awk 'BEGIN { FS = "\t" } ; {print $1,$2,$4}' FILE_NAME > file2;
awk 'BEGIN { FS = "\t" } ; {print $1,$2,$5}' FILE_NAME > file3;

and so on...

Now i need to write it with in a 'for' or 'while' loop which would be faster...

I dont know the exact code, may be like this..

for (( i=3; i<=NF; i++)) ; do awk 'BEGIN { FS = "\t" } ; {print $1,$2,$i}' input.tsv > $i.tsv; done

kindly help me to get the output as i explained.

Upvotes: 3

Views: 5672

Answers (3)

James Brown
James Brown

Reputation: 37464

In pure awk:

$ awk 'BEGIN{FS=OFS="|"}{for(i=3;i<=NF;i++) {f="file" (i-2); print $1,$2,$i >> f; close(f)}}' file

Explained:

$ awk '
BEGIN {
    FS=OFS="|" }             # set delimiters
{
    for(i=3;i<=NF;i++) {     # loop the consumption fields
        f="file" (i-2)       # create the filename
        print $1,$2,$i >> f  # append to target file
        close(f) }           # close the target file
}' file

Upvotes: 1

janos
janos

Reputation: 124824

If you absolutely need to to use a loop in Bash, then your loop can be fixed like this:

for ((i = 3; i <= 10; i++)); do awk -v field=$i 'BEGIN { FS = "\t" } { print $1, $2, $field }' input.tsv > file$i.tsv; done

But it would be really better to solve this using pure awk, without shell at all:

awk -v FS='\t' '
  NR == 1 {
    for (i = 3; i < NF; i++) {
      fn = "file" (i - 2) ".txt";
      print $1, $2, $i > fn;
      print "" >> fn;
    }
  }
  NR > 2 {
    for (i = 3; i < NF; i++) {
      fn = "file" (i - 2) ".txt";
      print $1, $2, $i >> fn;
    }
  }' inputfile

That is, when you're on the first record, create the output files by writing the header line and a blank line (as in specified in your question).

For the 3rd and later records, append to the files.

Note that the code in your question suggests that the fields in the file are separated by tabs, but the example files seem to use | padded with variable number of spaces. It's not clear which one is your actual case. If it's really tab-separated, then the above code will work. If in fact it's as the example inputs, then change the first line to this:

awk -v OFS=' | ' -v FS='[ |]+' '

Upvotes: 2

RomanPerekhrest
RomanPerekhrest

Reputation: 92904

bash + cut solution:

input.tsv test content:

Code | Name | Day1 | Day2 | Day3
10001 | abcd | 5 | 1 | 9
10002 | degg | 3 | 9 | 6
10003 | gxyz | 4 | 8 | 7

day_splitter.sh script:

#!/bin/bash

n=$(cat $1 | head -1 | awk -F'|' '{print NF}') # total number of fields
for ((i=3; i<=$n; i++))
do
    fn="Day"$(($i-2))  # file name containing `Day` number 
    $(cut -d'|' -f1,2,$i $1 > $fn".txt")
done

Usage:

bash day_splitter.sh input.tsv

Results:

$cat Day1.txt
Code | Name | Day1 
10001 | abcd | 5 
10002 | degg | 3 
10003 | gxyz | 4 

$cat Day2.txt
Code | Name | Day2 
10001 | abcd | 1 
10002 | degg | 9 
10003 | gxyz | 8

$cat Day3.txt
Code | Name | Day3
10001 | abcd | 9
10002 | degg | 6
10003 | gxyz | 7

Upvotes: 2

Related Questions