user1269741

Reputation: 447

Transpose columns and rows using gawk

I am trying to transpose a really long file and I am concerned that it will not be transposed entirely.

My data looks something like this:

Thisisalongstring12345678   1   AB  abc 937 4.320194
Thisisalongstring12345678   1   AB  efg 549 0.767828
Thisisalongstring12345678   1   AB  hi  346 -4.903441
Thisisalongstring12345678   1   AB  jk  193 7.317946

I want my data to look like this:

Thisisalongstring12345678 Thisisalongstring12345678 Thisisalongstring12345678 Thisisalongstring12345678
1                         1                         1                         1
AB                        AB                        AB                        AB
abc                       efg                       hi                        jk
937                       549                       346                       193
4.320194                  0.767828                  -4.903441                 7.317946

Would the length of the first string be an issue? My actual file is much longer than this, approximately 2000 lines. Also, is it possible to change the first string to Thisis234 and then transpose?

Upvotes: 5

Views: 6914

Answers (4)

JeffZheng

Reputation: 1405

Regarding the code from @ScubaFish and @icyrock.com:

The test "if (max_nf < NF)" seems unnecessary. I deleted it, and the code works just fine.
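
One possible caveat, offered as a hedged sketch rather than a correction: this assumes only the "if" test was removed and the assignment max_nf = NF was kept, so max_nf ends up as the field count of the last line. That should be fine for rectangular input like the question's. If the rows have different lengths and the last one is shorter, though, the trailing fields of the longer rows would be dropped. The two-line input below is made up for illustration.

```shell
# Hypothetical ragged input: the last line has fewer fields than the first.
input='a b c
1 2'

# With the test: max_nf ends up as the largest NF seen (3 here).
with_guard=$(printf '%s\n' "$input" | awk '
{
  for (c = 1; c <= NF; c++) a[c, NR] = $c
  if (max_nf < NF) max_nf = NF
}
END {
  for (r = 1; r <= max_nf; r++) {
    for (c = 1; c <= NR; c++) printf("%s ", a[r, c])
    print ""
  }
}')

# Without the test: max_nf is just the NF of the last line (2 here).
without_guard=$(printf '%s\n' "$input" | awk '
{
  for (c = 1; c <= NF; c++) a[c, NR] = $c
  max_nf = NF
}
END {
  for (r = 1; r <= max_nf; r++) {
    for (c = 1; c <= NR; c++) printf("%s ", a[r, c])
    print ""
  }
}')

printf 'with test:\n%s\n\nwithout test:\n%s\n' "$with_guard" "$without_guard"
```

With the test, the output has three rows (the third holds "c" and an empty cell); without it, the row for "c" is missing.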

Upvotes: 0

ScubaFish

Reputation: 41

I tried icyrock.com's answer, but found that I had to change:

for(r = 1; r <= NR; r++) {
  for(c = 1; c <= max_nf; c++) {

to

for(r = 1; r <= max_nf; r++) {
  for(c = 1; c <= NR; c++) {

to get the NR columns and max_nf rows. So icyrock's code becomes:

$ cat mkt.sh
awk '
{
  for(c = 1; c <= NF; c++) {
    a[c, NR] = $c
  }
  if(max_nf < NF) {
    max_nf = NF
  }
}
END {
  for(r = 1; r <= max_nf; r++) {
    for(c = 1; c <= NR; c++) {
      printf("%s ", a[r, c])
    }
    print ""
  }
}
' inf.txt

If you don't make that change and use a non-square input (3 lines of 4 fields), like:

a b c d
1 2 3 4
. , + -

You get:

a 1 .
b 2 ,
c 3 +

That is, the output still has 3 rows and 4 columns (the last of which is blank), and the fourth field of every input line is lost.
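
The question's desired output also has aligned columns, which the single-space printf above does not produce. A minimal sketch of one way to get that, assuming it is acceptable to pad every column to the width of the widest field in the file. The variable names width, cell and line are mine, not from the answers.

```shell
aligned=$(printf 'a b c d\n1 2 3 4\n. , + -\n' | awk '
{
  for (c = 1; c <= NF; c++) {
    a[c, NR] = $c
    if (length($c) > width) width = length($c)   # widest field seen so far
  }
  if (max_nf < NF) max_nf = NF
}
END {
  for (r = 1; r <= max_nf; r++) {
    line = ""
    for (c = 1; c <= NR; c++) {
      cell = a[r, c]
      # Pad every column except the last to the common width.
      if (c < NR) cell = sprintf("%-" width "s ", cell)
      line = line cell
    }
    print line
  }
}')
printf '%s\n' "$aligned"
```

Here every field is one character wide, so the result looks the same as before minus the trailing space. With the question's data the width would come out as 25, the length of the first string.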

Upvotes: 4

Kaz

Reputation: 58578

This can be done with the BSD rs command:

http://www.unix.com/man-page/freebsd/1/rs/

Check out the -T option.
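
A minimal sketch of that invocation, assuming rs is available on your system (on many Linux distributions it has to be installed separately). The two-line sample input is made up, and the exact column spacing rs prints is not shown here because I have not verified it.

```shell
if command -v rs >/dev/null 2>&1; then
  # -T prints the pure transpose of the input array.
  rs_out=$(printf 'a b c d\n1 2 3 4\n' | rs -T)
  printf '%s\n' "$rs_out"
else
  rs_out=""
  echo "rs is not installed on this system" >&2
fi
```

For a real file, the equivalent would be rs -T < inf.txt.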

Upvotes: 7

icyrock.com

Reputation: 28608

I don't see why it would not be transposed entirely, unless you run out of memory. Try the program below and see if you run into problems.

Input:

$ cat inf.txt 
a b c d
1 2 3 4
. , + -
A B C D

Awk program:

$ cat mkt.sh
awk '
{
  for(c = 1; c <= NF; c++) {
    a[c, NR] = $c
  }
  if(max_nf < NF) {
    max_nf = NF
  }
}
END {
  for(r = 1; r <= NR; r++) {
    for(c = 1; c <= max_nf; c++) {
      printf("%s ", a[r, c])
    }
    print ""
  }
}
' inf.txt

Run:

$ ./mkt.sh 
a 1 . A 
b 2 , B 
c 3 + C 
d 4 - D 
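
As for renaming the first string before transposing, here is a hedged sketch: assigning to $1 inside the main block replaces the first field before it is stored. It uses two lines of the question's sample data, and the END loops run over max_nf rows by NR columns so that non-square input is handled.

```shell
renamed=$(printf '%s\n' \
  'Thisisalongstring12345678   1   AB  abc 937 4.320194' \
  'Thisisalongstring12345678   1   AB  efg 549 0.767828' |
awk '
{
  $1 = "Thisis234"            # overwrite the first field before storing it
  for (c = 1; c <= NF; c++) {
    a[c, NR] = $c
  }
  if (max_nf < NF) {
    max_nf = NF
  }
}
END {
  for (r = 1; r <= max_nf; r++) {
    for (c = 1; c <= NR; c++) {
      printf("%s ", a[r, c])
    }
    print ""
  }
}')
printf '%s\n' "$renamed"
```

The first output row then holds Thisis234 once per input line, followed by one row for each remaining field.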

Hope this helps.

Upvotes: 7
