manasa chandrashekar
manasa chandrashekar

Reputation: 13

AWK to display a column based on Column name and remove header and last delimiter

Id,responseId,name,test1,test2,bcid,stype
213,A_123456,abc,test,zzz,987654321,alpha
412,A_234566,xyz,test,xxx,897564322,gama
125,A_456314,ttt,qa,yyy,786950473,delta
222,A_243445,hds,test,fff,643528290,alpha
456,A_466875,sed,test,hhh,543819101,beta

I want to extract columns responseId, and bcid from above. I found an answer which is really close

awk -F ',' -v cols=responseID,bcid '(NR==1){n=split(cols,cs,",");for(c=1;c<=n;c++){for(i=1;i<=NF;i++)if($(i)==cs[c])ci[c]=i}}{for(i=1;i<=n;i++)printf "%s" FS,$(ci[i]);printf "\n"}' <file_name>

however, it prints "," in the end and the header as shown below.

responseId,bcid,
A_123456,987654321,
A_234566,897564322,
A_456314,786950473,
A_243445,643528290,
A_466875,543819101,

How can I make it to not print the header and the "," after bcid??

Upvotes: 1

Views: 1342

Answers (2)

Akshay Hegde
Akshay Hegde

Reputation: 16997

Input

$ cat infile
Id,responseId,name,test1,test2,bcid,stype
213, A_123456, abc, test, zzz, 987654321, alpha
412, A_234566, xyz, test, xxx, 897564322, gama
125, A_456314, ttt, qa, yyy, 786950473, delta
222, A_243445, hds, test, fff, 643528290, alpha
456, A_466875, sed, test, hhh, 543819101, beta

Script

$ cat byname.awk 
FNR==1{
    split(header,h,/,/);
    for(i=1; i in h; i++)
    {
        for(j=1; j<=NF; j++)
        {
            if(tolower(h[i])==tolower($j)){ d[i]=j; break } 
        }
    }
    next
}
{
    for(i=1; i in h; i++)
        printf("%s%s",i>1 ? OFS:"",  i in d ?$(d[i]):"");
    print "";
}

How to execute ?

$ awk -v FS=, -v OFS=, -v header="responseID,bcid" -f byname.awk  infile
 A_123456, 987654321
 A_234566, 897564322
 A_456314, 786950473
 A_243445, 643528290
 A_466875, 543819101

One-liner

$ awk -v FS=, -v OFS=, -v header="responseID,bcid" 'FNR==1{split(header,h,/,/);for(i=1; i in h; i++){for(j=1; j<=NF; j++){if(tolower(h[i])==tolower($j)){ d[i]=j; break }}}next}{for(i=1; i in h; i++)printf("%s%s",i>1 ? OFS:"",  i in d ?$(d[i]):"");print "";}' infile
 A_123456, 987654321
 A_234566, 897564322
 A_456314, 786950473
 A_243445, 643528290
 A_466875, 543819101

Upvotes: 3

RavinderSingh13
RavinderSingh13

Reputation: 133528

try:

awk '{NR==1?FS=",":FS=", ";$0=$0} {print $2 OFS $(NF-1)}' OFS=,   Input_file

Checking if line is 1st line then making delimiter as "," and other lines making field separator as ", " then printing the 2nd field and 2nd last field. Setting OFS(output field separator) as ,

Upvotes: 0

Related Questions