Reputation: 3
I have more than 500 files, each with two columns: "Gene short name" and "FPKM". Every file has the same number of rows, and the "Gene short name" column is identical across all files. I want to build a matrix whose first column is the gene short name (taken from any one of the files) and whose remaining columns hold the FPKM values, one column per file.
The command below works well for a handful of files, but how can I use it for 500 files?
paste -d' ' <(awk -F'\t' '{print $1}' 69_genes.fpkm.txt) \
<(awk -F'\t' '{print $2}' 69_genes.fpkm.txt) \
<(awk -F'\t' '{print $2}' 72_genes.fpkm.txt) \
<(awk -F'\t' '{print $2}' 75_genes.fpkm.txt) \
<(awk -F'\t' '{print $2}' 78_genes.fpkm.txt) > col.txt
Sample data (the files are tab-separated):
head 69_genes.fpkm.txt
gene_short_name FPKM
DDX11L1 0.196141
MIR1302-2HG 0.532631
MIR1302-2 0
WASH7P 4.51437
Expected outcome
gene_short_name FPKM FPKM FPKM FPKM
DDX11L1 0.196141 0.206591 0.0201256 0.363618
MIR1302-2HG 0.532631 0.0930007 0.0775838 0
MIR1302-2 0 0 0 0
WASH7P 4.51437 3.31073 3.23326 1.05673
MIR6859-1 0 0 0 0
FAM138A 0.505155 0.121703 0.105235 0
OR4G4P 0.0536387 0 0 0
OR4G11P 0 0 0 0
OR4F5 0.0390888 0.0586067 0 0
Also, I want to rename each "FPKM" header to "filename_FPKM", using the name of the file that column came from.
Upvotes: 0
Views: 943
Reputation: 37464
In awk, using @Micha's data for clarity:
$ awk '
BEGIN { FS=OFS="\t" }            # set the field separators
FNR==1 {
    $2=FILENAME "_" $2           # on first record of each file rename $2
}
NR==FNR {                        # process the first file
    a[FNR]=$0                    # hash whole record to a
    next
}
{                                # process other files
    a[FNR]=a[FNR] OFS $2         # add $2 to the end of the record
}
END {                            # in the end
    for(i=1;i<=FNR;i++)          # print all records
        print a[i]
}' a.txt b.txt c.txt
Output:
a a.txt_1 b.txt_I c.txt_one
b 2 II two
c 3 III three
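To run the same idea over the real files, a shell glob can supply all 500 names at once. What follows is only a minimal sketch under a few assumptions: every file is named like `<sample>_genes.fpkm.txt` (as in the question), the demo directory `fpkm_demo` and the output name `matrix.tsv` are made up for illustration, and the two tiny input files are built from rows in the question so that the snippet runs on its own. It also strips the file suffix so the header reads `69_FPKM` rather than `69_genes.fpkm.txt_FPKM`.

```shell
#!/bin/sh
set -eu

# Hypothetical demo directory with two small tab-separated sample files.
mkdir -p fpkm_demo
(
    cd fpkm_demo
    printf 'gene_short_name\tFPKM\nDDX11L1\t0.196141\nWASH7P\t4.51437\n' > 69_genes.fpkm.txt
    printf 'gene_short_name\tFPKM\nDDX11L1\t0.206591\nWASH7P\t3.31073\n' > 72_genes.fpkm.txt

    awk '
    BEGIN { FS = OFS = "\t" }
    FNR == 1 {                               # header line of each file
        label = FILENAME
        sub(/_genes\.fpkm\.txt$/, "", label) # assumed naming pattern
        $2 = label "_FPKM"
    }
    NR == FNR { row[FNR] = $0; n = FNR; next }   # first file: keep both columns
    { row[FNR] = row[FNR] OFS $2 }               # later files: append column 2
    END { for (i = 1; i <= n; i++) print row[i] }
    ' *_genes.fpkm.txt > matrix.tsv

    cat matrix.tsv
)
```

The glob expands in lexical order, so a file named `100_genes.fpkm.txt` would come before `69_genes.fpkm.txt`; if the column order matters, list the files explicitly or sort the names first.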
Upvotes: 0
Reputation: 20873
Given the input
$ cat a.txt
a 1
b 2
c 3
$ cat b.txt
a I
b II
c III
$ cat c.txt
a one
b two
c three
you can loop:
cut -f1 a.txt > result.txt
for f in a.txt b.txt c.txt
do
    # append column 2 of the current file to the growing result
    cut -f2 "$f" | paste result.txt - > tmp.txt
    mv tmp.txt result.txt
done
$ cat result.txt
a 1 I one
b 2 II two
c 3 III three
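The same loop could be pointed at the question's files with a glob, and the header renamed along the way. This is a rough sketch rather than a tested solution: it assumes the `<sample>_genes.fpkm.txt` naming from the question, and the directory `fpkm_loop_demo`, the output name `result.tsv`, and the two small input files are hypothetical, created only so the snippet is self-contained. Here `awk` replaces the plain `cut -f2` so the first line of each column can be rewritten to `<sample>_FPKM`.

```shell
#!/bin/sh
set -eu

# Hypothetical demo directory with two small tab-separated sample files.
mkdir -p fpkm_loop_demo
(
    cd fpkm_loop_demo
    printf 'gene_short_name\tFPKM\nDDX11L1\t0.196141\nWASH7P\t4.51437\n' > 69_genes.fpkm.txt
    printf 'gene_short_name\tFPKM\nDDX11L1\t0.206591\nWASH7P\t3.31073\n' > 72_genes.fpkm.txt

    # Collect the input files as positional parameters (POSIX, no arrays).
    set -- *_genes.fpkm.txt

    # Gene names come from the first file only.
    cut -f1 "$1" > result.tsv

    for f in "$@"
    do
        sample=${f%_genes.fpkm.txt}          # assumed naming pattern
        # Emit "<sample>_FPKM" as the header, then column 2 of every other line.
        awk -F'\t' -v s="$sample" 'NR == 1 { print s "_FPKM"; next } { print $2 }' "$f" |
            paste result.tsv - > tmp.tsv
        mv tmp.tsv result.tsv
    done

    cat result.tsv
)
```

Each pass rewrites the whole result file, so with 500 inputs this does more I/O than a single `awk` run, but it stays simple and never holds more than one file's column in a pipeline at a time.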
Upvotes: 1