Reputation: 193

Output matching columns from multiple input files in awk

Suppose I have two input files and only want certain rows from each: the rows starting with "A" from inputA.txt and the rows starting with "B" from inputB.txt.

==> inputA.txt <==
A 10214027 6369158
A 10214028 6369263
A 10214029 6369321
A 10214030 6369713
A 10214031 6370146
A 10214032 6370553
A 10214033 6370917
A 10214034 6371322
A 10214035 6371735
A 10214036 6372136

So I only want the rows that start with A.

==> inputB.txt <==
B 50015214 5116941
B 50015215 5116767
B 50015216 5116577
B 50015217 5116409
B 50015218 5116221
B 50015219 5116044
B 50015220 5115845
B 50015221 5115676
B 50015222 5115512
B 50015223 5115326

The same goes here: I only want the rows that start with B.

I've written the script below, but the code ends up duplicated because I'm handling multiple inputs.

#!/bin/awk -f
BEGIN{
    printf "Column 1\tColumn 2\tColumn 3"
}
/^A/{
    c=substr($2,1,4)
    d=substr($2,5,3)
    e=substr($3,1,4)
    f=substr($3,5,3)
}
{
    printf "%4.1f %4.1f %4.1f %4.1f\n",c,d,e,f > "outputA.txt"
} 
/^B/{
    c=substr($2,1,4)
    d=substr($2,5,3)
    e=substr($3,1,4)
    f=substr($3,5,3)
}
{
    printf "%4.1f %4.1f %4.1f %4.1f\n",c,d,e,f > "outputB.txt"
}

Let me know your thoughts on this.

Expected output

==> outputA.txt <==
Column 1 Column 2 Column 3 Column 4
1021 4027 6369 158
1021 4028 6369 263
1021 4029 6369 321
1021 4030 6369 713
1021 4031 6370 146
1021 4032 6370 553
1021 4033 6370 917
1021 4034 6371 322
1021 4035 6371 735
1021 4036 6372 136

==> outputB.txt <==
Column 1 Column 2 Column 3 Column 4
5001 5214 5116 941
5001 5215 5116 767
5001 5216 5116 577
5001 5217 5116 409
5001 5218 5116 221
5001 5219 5116 044
5001 5220 5115 845
5001 5221 5115 676
5001 5222 5115 512
5001 5223 5115 326

Upvotes: 3

Answers (4)

RARE Kpop Manifesto

Reputation: 2915

Just keep it simple:

${...input_data...} |
{m,g,n}awk 'gsub(" ....", "& ")^_'

A 1021 4027 6369 158
A 1021 4028 6369 263
A 1021 4029 6369 321
A 1021 4030 6369 713
A 1021 4031 6370 146
A 1021 4032 6370 553
A 1021 4033 6370 917
A 1021 4034 6371 322
A 1021 4035 6371 735
A 1021 4036 6372 136
B 5001 5214 5116 941
B 5001 5215 5116 767
B 5001 5216 5116 577
B 5001 5217 5116 409
B 5001 5218 5116 221
B 5001 5219 5116 044
B 5001 5220 5115 845
B 5001 5221 5115 676
B 5001 5222 5115 512
B 5001 5223 5115 326
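
For anyone puzzled by the trailing `^_`: `_` is an uninitialized variable, which is 0 in a numeric context, so the pattern is the gsub result raised to the power zero. That is always 1 (true), and awk's default action prints the line. A more explicit sketch of the same idea is below. The two sample rows are taken from the question, and the `out` shell variable is only there for illustration:

```sh
# Two sample rows from the question, piped in for illustration.
out=$(printf 'A 10214027 6369158\nB 50015219 5116044\n' |
awk '{
  # " ...." matches a space plus the next four characters; "&" re-inserts
  # the match, followed by one extra space.
  gsub(/ ..../, "& ")
  print
}')
printf '%s\n' "$out"
```

Note that this keeps the leading A/B tag and writes everything to standard output, so it does not yet produce the separate output files or the header asked for in the question.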

Upvotes: 0

oguz ismail

Reputation: 50805

You don't need substr here. Empty out the first field, insert a space after every four digits, force awk to rebuild the record from its fields (which also normalizes the spacing), and then print:

awk '$1=="A"{
  $1=""
  gsub(/[0-9]{4}/,"& ")
  $1=$1
  print
}' inputA.txt

Its output:

1021 4027 6369 158
1021 4028 6369 263
1021 4029 6369 321
1021 4030 6369 713
1021 4031 6370 146
1021 4032 6370 553
1021 4033 6370 917
1021 4034 6371 322
1021 4035 6371 735
1021 4036 6372 136

Obviously this only handles one input file, but I believe you can adapt it to multiple files by borrowing from the other answers.
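
One possible adaptation to several files is sketched below. It is only a sketch under a few assumptions that are not part of the answer above: each row is routed by its tag in `$1` (so an A row goes to outputA.txt regardless of which input file it came from), the `seen` array and the scratch directory are illustrative, and the regex is spelled out as four `[0-9]` classes in case the awk at hand does not support `{4}` intervals. The sample files contain just two rows each from the question:

```sh
cd "$(mktemp -d)" || exit 1   # scratch directory, so nothing is overwritten
printf 'A 10214027 6369158\nA 10214028 6369263\n' > inputA.txt
printf 'B 50015214 5116941\nB 50015219 5116044\n' > inputB.txt

awk '
$1 == "A" || $1 == "B" {
  out = "output" $1 ".txt"            # outputA.txt or outputB.txt
  if (!(out in seen)) {               # first row for this output: write header
    print "Column 1 Column 2 Column 3 Column 4" > (out)
    seen[out] = 1
  }
  $1 = ""                             # drop the A/B tag
  gsub(/[0-9][0-9][0-9][0-9]/, "& ")  # space after every four digits
  $1 = $1                             # rebuild the record with single spaces
  print > (out)
}' inputA.txt inputB.txt

cat outputA.txt outputB.txt
```

Rows whose first field is neither A nor B are skipped, which matches the "only want A's / only want B's" requirement in the question.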

Upvotes: 1

RavinderSingh13

Reputation: 133760

Could you please try the following:

awk '
FNR==1{
  sub(/[a-z]+/,"",FILENAME)
  file="output"FILENAME".txt"
  print "Column 1 Column 2 Column 3 Column 4" > (file)
}
{
  print substr($0,3,4),substr($0,7,4),substr($0,12,4),substr($0,16,3) > (file)
}
'  inputA inputB

Explanation:

awk '                                                                                ## Start the awk program.
FNR==1{                                                                              ## On the first line of each input file, do the following.
  sub(/[a-z]+/,"",FILENAME)                                                          ## Remove the first run of lowercase letters from the file name (inputA becomes A).
  file="output"FILENAME".txt"                                                        ## Build the output file name: "output", the trimmed FILENAME, then ".txt".
  print "Column 1 Column 2 Column 3 Column 4" > (file)                               ## Print the header line to the output file.
}
{
  print substr($0,3,4),substr($0,7,4),substr($0,12,4),substr($0,16,3) > (file)       ## Print the four fixed-position substrings the OP asked for to the output file.
}
'  inputA inputB                                                                     ## The input file names go here.
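
Note that the command above is run on files named inputA and inputB. With the question's names (inputA.txt, inputB.txt), the sub call would leave "A.txt" and the output would be written to outputA.txt.txt. A minimal variant for the .txt names is sketched below; it assumes every input name starts with "input", and it edits a copy of FILENAME instead of the built-in variable itself. The one-row sample files and the scratch directory are just for illustration:

```sh
cd "$(mktemp -d)" || exit 1   # scratch directory, so nothing is overwritten
printf 'A 10214027 6369158\n' > inputA.txt
printf 'B 50015219 5116044\n' > inputB.txt

awk '
FNR == 1 {
  if (file != "") close(file)        # done with the previous output file
  file = FILENAME                    # work on a copy, leave FILENAME alone
  sub(/^input/, "output", file)      # inputA.txt -> outputA.txt
  print "Column 1 Column 2 Column 3 Column 4" > (file)
}
{
  # Same fixed character positions as in the command above.
  print substr($0,3,4), substr($0,7,4), substr($0,12,4), substr($0,16,3) > (file)
}' inputA.txt inputB.txt

cat outputA.txt outputB.txt
```

Like the original, this relies on every row having exactly the layout shown in the question (one tag character, a space, eight digits, a space, seven digits) and does not filter on the A/B tag.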

Upvotes: 1

Cyrus

Reputation: 88969

With GNU awk and FIELDWIDTHS:

awk 'BEGIN{FIELDWIDTHS="1 1 4 4 1 4 3"}
     {out="output" $1 ".txt"} 
     FNR==1{print "Column 1 Column 2 Column 3 Column 4" >out}
     {print $3,$4,$6,$7 >out}' inputA.txt inputB.txt

FIELDWIDTHS splits each row into seven fixed-width columns. out holds the name of the output file. On the first row of each input file, the header is printed to that file; for every row, four of the columns are printed to it.

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

Upvotes: 3
