Reputation: 13
I would like a script to modify some large text files (100k records) so that each input record produces as many output lines as the difference between its columns 3 and 2. Each output line should contain the record name (column 1) and one step of a walk from the number in column 2 to the number in column 3.
A trivial sample input could be (tab-separated data, if it makes a difference):
a 3 5
b 10 14
with the desired output (again, ideally tab separated)
a 3 4
a 4 5
b 10 11
b 11 12
b 12 13
b 13 14
It's a challenge sadly beyond my (very) limited abilities.
Can anyone provide a solution to the problem, or point me in the right direction? In an ideal world I would be able to integrate this into a bash script, but I'll take anything that works!
Upvotes: 1
Views: 101
Reputation: 241908
Bash solution:
while read h f t ; do
    for ((i=f; i<t; i++)) ; do
        printf "%s\t%d\t%d\n" "$h" "$i" "$((i+1))"
    done
done < input.txt
Perl solution:
perl -lape '$_ = join "\n", map join("\t", $F[0], $_, $_ + 1), $F[1] .. $F[2] - 1' input.txt
Upvotes: 3
Reputation: 72657
Fully POSIX, and no unneeded loop variables:
$ while read h f t; do
    while test "$f" -lt "$t"; do
        printf "%s\t%d\t%d\n" "$h" "$f" "$((f + 1))"
        # POSIX does not require ++ inside $(( )), so increment explicitly
        f=$((f + 1))
    done
done < input.txt
a 3 4
a 4 5
b 10 11
b 11 12
b 12 13
b 13 14
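To integrate this into a larger script, the same loop could be wrapped in a function that reads from standard input. The following is only a sketch under some assumptions: the name `expand_ranges` is made up for illustration, the input is strictly tab-separated, and columns 2 and 3 hold integers.

```shell
# expand_ranges: hypothetical helper name, not from the original answer.
# Reads tab-separated "name start end" records on stdin and prints one
# tab-separated line per unit step from start to end.
# The body runs in a subshell ( ... ) so its variables do not leak out.
expand_ranges() (
    tab=$(printf '\t')
    # "|| [ -n "$name" ]" also handles a last line with no trailing newline
    while IFS="$tab" read -r name from to || [ -n "$name" ]; do
        while [ "$from" -lt "$to" ]; do
            printf '%s\t%d\t%d\n' "$name" "$from" "$((from + 1))"
            from=$((from + 1))
        done
    done
)

# Usage: expand_ranges < input.txt > output.txt
```

Setting `IFS` to a tab only for `read` means a record name containing spaces would stay in one piece, which the plain `read h f t` loop would not guarantee.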
Upvotes: 0
Reputation: 246837
awk -F '\t' -v OFS='\t' '
$2 >= $3 {print; next}
{for (i=$2; i<$3; i++) print $1, i, i+1}
' filename
Upvotes: 0
Reputation: 77105
With awk:
awk -v OFS='\t' '$3!=$2 { while (($3 - $2) > 1) { print $1,$2,$2+1 ; $2++} }1' inputfile
Upvotes: 0