user14447651

How to repeat the last number of the corresponding block of a column

Hi experts, I have a single-column file containing many blocks, and the blocks are separated by the > symbol. I want to resize all blocks to the same length by repeating the last number of the corresponding block. My file is given below:

file.txt

>
2.0
2.0
2.0
2.0
>
1.2
1.2
>
2.4
2.4
2.4

My expected output is given below:

>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4

My code is:

#!/bin/sh
awk '$0==">" {
   if (c && c>max)
      max = c
   ++n
   c = 0
   next
}
{
   r[n][++c] = $0
}
END {
   for (i=1; i<=n; ++i) {
      print ">"
      for (j=1; j<=(max>c?max:c); ++j)
         print (r[i][j] == "" ? "0.0" : r[i][j])
   }
}' file

I copied the above code from a GitHub page, but it appends 0.0 instead of repeating the last number. I hope some expert can help me. Thanks.

Upvotes: 2

Views: 79

Answers (4)

Sundeep

Reputation: 23667

As mentioned in the comments, the number of lines per block is fixed and all lines within a block are identical. Here's a solution with perl:

perl -ne 'print $_, <> x 4 if /^>$/' ip.txt
  • if /^>$/ checks whether the line consists only of >
    • print $_, <> x 4 then prints the current line followed by the next input line repeated four times

If the input file ends with a > line without further content, the above solution will not work. Use this instead:

perl -ne 'print $_, <> x 4 if /^>$/ && !eof' ip.txt

For a small repetition count, you can also use sed (tested with GNU sed; syntax might vary for other implementations). The second command skips a > that appears on the last line of the input:

sed -n '/^>$/{p; n; p; p; p; p}' ip.txt
sed -n '/^>$/{$!{p; n; p; p; p; p}}' ip.txt

Upvotes: 2

RavinderSingh13

Reputation: 133538

1st solution: Going by the OP's comments, the number of lines each block should print is fixed, or can be set in a variable. In that case, try the following.

awk -v till="4" '
/^>/{
  print
  count=""
  next
}
{
  while(count++<till){
    print
  }
}
' Input_file
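The solution above prints the first line of each block till times and drops the rest, which is fine when all lines in a block are identical. If the values within a block can differ, a streaming variation might look like the sketch below. It is only an illustration, not the answerer's method: the function name pad_fixed is hypothetical, and it assumes blocks longer than the target length should be left untouched.

```shell
#!/bin/sh
# Hypothetical helper: pad every ">"-delimited block to "till" lines by
# repeating the last line of that block. Assumption: blocks that are
# already longer than "till" are printed unchanged.
pad_fixed() {
  awk -v till="${1:-4}" '
    function flush(   i) {
      # pad the block just finished, if it had at least one line
      if (count > 0)
        for (i = count; i < till; i++) print last
    }
    $0 == ">" { flush(); print; count = 0; next }
    { print; last = $0; count++ }
    END { flush() }
  '
}

# Demo with the sample data from the question
printf '>\n2.0\n2.0\n2.0\n2.0\n>\n1.2\n1.2\n>\n2.4\n2.4\n2.4\n' | pad_fixed 4
```

Since nothing is stored in an array, memory use stays constant regardless of file size, and only standard awk features are used.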


2nd solution (fix of the OP's code): Could you please try the following, which fixes your shown code. This is more generic: it finds the maximum block length and pads every block to that length by repeating the last value seen. Note that the r[n][++c] syntax (arrays of arrays) requires GNU awk.

awk '$0==">" {
   if (c && c>max)
      max = c
   ++n
   c = 0
   next
}
{
   r[n][++c] = $0
}
END {
   for (i=1; i<=n; ++i) {
      print ">"
      for (j=1; j<=(max>c?max:c); ++j){
         print (r[i][j] == "" ? prev : r[i][j])
         prev=r[i][j]==""?prev:r[i][j]
      }
   }
}' Input_file

Upvotes: 2

anubhava

Reputation: 785286

This looks similar to a problem posted earlier.

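Anyway, this awk should work for you. It pads each short block with that block's first value, which is the same as the last value here because all values within a block are identical: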

awk '$0==">"{if (c && c>max) max=c; ++n; c=0; next} {r[n][++c]=$0} END {for (i=1; i<=n; ++i) {print ">"; for (j=1; j<=(max>c?max:c); ++j) print (r[i][j] == "" ? r[i][1] : r[i][j])}}' file

>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4

To make it readable:

awk '$0==">"{
   if (c && c>max)
       max=c
   ++n
   c=0
   next
} {
   r[n][++c]=$0
}
END {
   for (i=1; i<=n; ++i) {
      print ">"
      for (j=1; j<=(max>c?max:c); ++j)
         print (r[i][j] == "" ? r[i][1] : r[i][j])
   }
}' file
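If the values inside a block are not all identical, padding with r[i][1] repeats the first value rather than the last one. A possible variation is sketched below, under a few assumptions of mine: the function name pad_to_longest and the len array are hypothetical, empty blocks are printed as a bare > line, and r[i, j] is used instead of r[i][j] so that it should also run on awk implementations without arrays of arrays.

```shell
#!/bin/sh
# Hypothetical helper: pad all blocks to the length of the longest block,
# repeating the last value of each block. Uses r[i, j] subscripts rather
# than the GNU awk r[i][j] form.
pad_to_longest() {
  awk '
    $0 == ">" { n++; next }             # start a new block
    n == 0    { next }                  # ignore anything before the first ">"
    {
      r[n, ++len[n]] = $0               # store the line, track block length
      if (len[n] > max) max = len[n]
    }
    END {
      for (i = 1; i <= n; i++) {
        print ">"
        # assumption: an empty block has nothing to repeat, so skip it
        for (j = 1; j <= max && len[i] > 0; j++)
          print (j <= len[i] ? r[i, j] : r[i, len[i]])
      }
    }
  ' "$@"
}

# Demo with the sample data from the question
printf '>\n2.0\n2.0\n2.0\n2.0\n>\n1.2\n1.2\n>\n2.4\n2.4\n2.4\n' | pad_to_longest
```

Tracking the length per block in len[i] also avoids relying on c, which only holds the length of the final block by the time END runs.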

Upvotes: 2

Cyrus

Reputation: 88674

With GNU awk:

awk -v n=4 '/^>/{print; getline; for(i=1; i<=n; i++) print}' file

If the current row starts (^) with >, print the row, read the next row (getline), and then output that row n times. Output:

>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4
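One caveat with a bare getline: when > is the very last line of the input, getline fails and leaves the current record unchanged, so the > line itself would be printed n times. A hedged sketch that checks the return value of getline is shown below. The function name repeat_next is hypothetical, and it still assumes that every > other than a trailing one is followed by at least one data line.

```shell
#!/bin/sh
# Hypothetical helper: print each ">" line followed by n copies of the
# line that comes right after it.
repeat_next() {
  awk -v n="${1:-4}" '
    /^>/ {
      print
      # getline returns 1 on success, 0 at end of input, -1 on error
      if ((getline line) > 0)
        for (i = 1; i <= n; i++) print line
    }
  '
}

# Demo: the input ends with a ">" that has no data line after it
printf '>\n2.0\n2.0\n>\n1.2\n>\n' | repeat_next 4
```

Reading into a separate variable (getline line) keeps $0 intact, which makes it easier to see which record each print refers to.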

Upvotes: 1
