Hi experts, I have a single-column file containing many blocks, separated by the > symbol. I want to resize all blocks to the same length by repeating the last number of each block. My file is given below:
file.txt
>
2.0
2.0
2.0
2.0
>
1.2
1.2
>
2.4
2.4
2.4
My expected output is given below:
>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4
My code is:
#!/bin/sh
awk '$0==">" {
    if (c && c>max)
        max = c
    ++n
    c = 0
    next
}
{
    r[n][++c] = $0
}
END {
    for (i=1; i<=n; ++i) {
        print ">"
        for (j=1; j<=(max>c?max:c); ++j)
            print (r[i][j] == "" ? "0.0" : r[i][j])
    }
}' file
I copied the above code from a GitHub page, but it appends 0.0 instead of repeating the last number. I hope some expert can help me. Thanks.
Upvotes: 2
Views: 79
Reputation: 23667
As mentioned in the comments, the number of lines per block is fixed and all lines in a block are identical. Here's a solution with perl:
perl -ne 'print $_, <> x 4 if /^>$/' ip.txt
if /^>$/ checks whether the line content is >
print $_, <> x 4 then prints the current line, followed by the next input line repeated four times
If the input file ends with a > line without further content, the above solution will not work. Use this instead:
perl -ne 'print $_, <> x 4 if /^>$/ && !eof' ip.txt
For a small repetition count, you can also use sed (tested with GNU sed; syntax might vary for other implementations):
sed -n '/^>$/{p; n; p; p; p; p}' ip.txt
To skip a > that is the last line of the input, guard the block with $! (not the last line):
sed -n '/^>$/{$!{p; n; p; p; p; p}}' ip.txt
Upvotes: 2
Reputation: 133538
1st solution: From the OP's comments, it looks like the number of lines each block should print is fixed, or can be set in a variable. In that case, try the following:
awk -v till="4" '
/^>/{
    print
    count=""
    next
}
{
    while(count++<till){
        print
    }
}
' Input_file
2nd solution (fix of the OP's code): Try the following, which fixes the code shown in the question. It is more generic: it finds the maximum number of lines in any block and pads every block to that length by repeating the block's last value.
awk '$0==">" {
    if (c && c>max)
        max = c
    ++n
    c = 0
    next
}
{
    r[n][++c] = $0
}
END {
    for (i=1; i<=n; ++i) {
        print ">"
        for (j=1; j<=(max>c?max:c); ++j){
            print (r[i][j] == "" ? prev : r[i][j])
            prev=r[i][j]==""?prev:r[i][j]
        }
    }
}' Input_file
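Note that r[n][++c] is a true array of arrays, which needs GNU awk 4.0 or later. If only a POSIX awk is available, the same padding idea can be sketched with comma-subscripted arrays. This is a minimal sketch under a few assumptions of mine: the names len and val and the temporary input file are hypothetical, lines before the first > are ignored, and an empty block is printed as a bare > with nothing to repeat.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file (hypothetical path).
input=$(mktemp)
printf '%s\n' '>' 2.0 2.0 2.0 2.0 '>' 1.2 1.2 '>' 2.4 2.4 2.4 > "$input"

out=$(awk '
$0 == ">" { n++; next }                  # a separator starts a new block
n == 0    { next }                       # assumption: skip lines before the first block
{
    len[n]++                             # number of lines seen in block n
    val[n, len[n]] = $0                  # store the line under (block, position)
    if (len[n] > max) max = len[n]       # track the longest block
}
END {
    for (i = 1; i <= n; i++) {
        print ">"
        if (len[i] == 0) continue        # empty block: nothing to repeat
        for (j = 1; j <= max; j++)       # past the end, reuse the last stored line
            print val[i, (j <= len[i] ? j : len[i])]
    }
}' "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Indexing the last stored position directly avoids carrying a prev variable across blocks.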
Upvotes: 2
Reputation: 785286
This looks similar to a problem posted earlier.
Anyway, this awk should work for you:
awk '$0==">"{if (c && c>max) max=c; ++n; c=0; next} {r[n][++c]=$0} END {for (i=1; i<=n; ++i) {print ">"; for (j=1; j<=(max>c?max:c); ++j) print (r[i][j] == "" ? r[i][1] : r[i][j])}}' file
>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4
To make it readable:
awk '$0==">"{
    if (c && c>max)
        max=c
    ++n
    c=0
    next
} {
    r[n][++c]=$0
}
END {
    for (i=1; i<=n; ++i) {
        print ">"
        for (j=1; j<=(max>c?max:c); ++j)
            print (r[i][j] == "" ? r[i][1] : r[i][j])
    }
}' file
Upvotes: 2
Reputation: 88674
With GNU awk:
awk -v n=4 '/^>/{print; getline; for(i=1; i<=n; i++) print}' file
If the current row starts (^) with >, print the row, read the next row (getline), and then output that row n times.
Output:
>
2.0
2.0
2.0
2.0
>
1.2
1.2
1.2
1.2
>
2.4
2.4
2.4
2.4
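The count n=4 is passed in by hand here. If the longest block length is not known in advance, one possible approach is to let awk read the file twice: the first pass finds the longest block, the second pass prints each block and pads it with its last line. This is only a sketch, assuming the input is a regular file (not a pipe) so it can be named twice; the names in1, seen, last and pad are hypothetical.

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
input=$(mktemp)
printf '%s\n' '>' 2.0 2.0 2.0 2.0 '>' 1.2 1.2 '>' 2.4 2.4 2.4 > "$input"

# The file is named twice so that awk reads it twice.
out=$(awk '
function pad() {                 # repeat the last line until the block is full
    if (c > 0)
        while (c < max) { print last; c++ }
}
FNR == 1 { c = 0 }               # reset the line counter at the start of each pass
NR == FNR {                      # pass 1: find the longest block
    if ($0 == ">") { in1 = 1; c = 0 }
    else if (in1 && ++c > max) max = c
    next
}
$0 == ">" { pad(); print; seen = 1; c = 0; next }   # pass 2: finish previous block
seen      { print; last = $0; c++ }                  # copy block lines through
END       { pad() }                                  # pad the final block
' "$input" "$input")

printf '%s\n' "$out"
rm -f "$input"
```

Unlike the array-based versions, this keeps only one line in memory at a time, at the cost of reading the file twice.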
Upvotes: 1