Jeni

Reputation: 968

Preserve the header when joining files in bash

I have these two tab-separated files:

fileA.tsv

probeId    sample1_betaval    sample2_betaval    sample3_betaval
   a              1                  2                  3
   b              4                  5                  6
   c              7                  8                  9

fileB.tsv

probeId       region      gene
   a         intronic      tp53
   b         non-coding     NA 
   c         exonic         kras

As they are already sorted by probeId, I've merged both files:

join -j 1 fileA.tsv fileB.tsv -t $'\t' > complete.tsv

The problem is that the output does not keep the header line:

   a              1                  2                  3           intronic    tp53
   b              4                  5                  6           non-coding   NA
   c              7                  8                  9            exonic      kras

While my desired output is:

probeId    sample1_betaval    sample2_betaval    sample3_betaval     region     gene
   a              1                  2                  3           intronic    tp53
   b              4                  5                  6           non-coding   NA
   c              7                  8                  9            exonic      kras

How can I achieve that?

Upvotes: 1

Views: 300

Answers (2)

Quasímodo

Reputation: 4004

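Add the --header option if your join provides it (it is a GNU coreutils extension). It treats the first line of each file as a header and prints the joined header without trying to pair it like a data row: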

join --header -j 1 fileA.tsv fileB.tsv -t $'\t' > complete.tsv
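
If your join has no --header option, one possible fallback is to build the header line separately and join only the data rows. The following is a minimal sketch, not a definitive solution: it assumes both files have exactly one header line and that the key is the first column. The scratch directory and the temporary file name fileB.body.tmp are made up for illustration, and the sample files are recreated from the question only so the snippet can be run as is.

```shell
# Work in a scratch directory so no real files are overwritten (assumption for the demo).
workdir=$(mktemp -d) && cd "$workdir"

# Sample inputs copied from the question (tab-separated, sorted by probeId).
printf 'probeId\tsample1_betaval\tsample2_betaval\tsample3_betaval\na\t1\t2\t3\nb\t4\t5\t6\nc\t7\t8\t9\n' > fileA.tsv
printf 'probeId\tregion\tgene\na\tintronic\ttp53\nb\tnon-coding\tNA\nc\texonic\tkras\n' > fileB.tsv

tab=$(printf '\t')

# Data rows of fileB only (hypothetical temporary file name).
tail -n +2 fileB.tsv > fileB.body.tmp

{
  # Header: fileA's header followed by fileB's header without its key column.
  printf '%s\t%s\n' "$(head -n 1 fileA.tsv)" "$(head -n 1 fileB.tsv | cut -f 2-)"
  # Body: join the data rows; "-" makes join read fileA's rows from stdin.
  tail -n +2 fileA.tsv | join -t "$tab" -1 1 -2 1 - fileB.body.tmp
} > complete.tsv

rm -f fileB.body.tmp
cat complete.tsv
```

Because the header never passes through join, it cannot trigger a sort-order complaint, and the data rows only need to be sorted among themselves.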

Upvotes: 3

RavinderSingh13

Reputation: 133600

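Could you please try the following (if you are OK with using awk)? Since the header line of both files starts with the same key, probeId, it is merged like any other row.

awk '
# First file: store each whole line, keyed by its first column.
FNR==NR{
  array[$1]=$0
  next
}
# Second file: if the key was seen, print the stored line plus columns 2 and 3.
($1 in array){
  print array[$1],$2,$3
}
'  fileA.tsv  fileB.tsv | column -t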


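EDIT: If the second file has many columns and you want to print all of them except the first, try the following.

awk '
# First file: store each whole line, keyed by its first column.
FNR==NR{
  array[$1]=$0
  next
}
# Second file: blank the key column, trim the leading space, print the rest.
($1 in array){
  val=$1
  $1=""
  sub(/^ +/,"")
  print array[val],$0
}
'  fileA.tsv  fileB.tsv | column -t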
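
The commands above split on any whitespace and align the result with column -t, so the output is no longer strictly tab-separated. If complete.tsv should stay a real TSV, a variant along these lines might work. It is only a sketch, assuming the key is the first column of both files and no field contains a tab; the scratch directory and the array name rows are illustrative, and the sample files are recreated from the question so the snippet runs on its own.

```shell
# Work in a scratch directory so no real files are overwritten (assumption for the demo).
workdir=$(mktemp -d) && cd "$workdir"

# Sample inputs copied from the question (tab-separated).
printf 'probeId\tsample1_betaval\tsample2_betaval\tsample3_betaval\na\t1\t2\t3\nb\t4\t5\t6\nc\t7\t8\t9\n' > fileA.tsv
printf 'probeId\tregion\tgene\na\tintronic\ttp53\nb\tnon-coding\tNA\nc\texonic\tkras\n' > fileB.tsv

awk '
BEGIN { FS = OFS = "\t" }            # read and write tab-separated fields
FNR == NR { rows[$1] = $0; next }    # first file: remember each line by key
($1 in rows) {
  key = $1
  sub(/^[^\t]*\t/, "")               # drop the key column from the second file
  print rows[key], $0                # stored line, a tab, then the remaining columns
}
' fileA.tsv fileB.tsv > complete.tsv

cat complete.tsv
```

Unlike join, this approach does not need sorted input; the output follows the row order of the second file.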

Upvotes: 3
