Reputation: 2097
I have a three-column data file, and I want to transform the data for plotting using bash. Note that the withop row does not always come first; sometimes the noop row comes first.
The sample data is:
printff withop 1
printff noop 0
partial_sums withop 1
partial_sums noop 1
fasta noop 1
fasta withop 1
word_anagrams withop 2
word_anagrams noop 2
list noop 0
list withop 8
gc_mb withop 1
gc_mb noop 1
simple_connect withop 0
simple_connect noop 0
binary_trees noop 2
binary_trees withop 2
cal noop 3
cal withop 6
The transformation I want is to merge every two rows that share the same value in the first column. The new format still has three columns: the second column is the withop value and the third is the noop value. For example, the new data is:
printff 1 0
partial_sums 1 1
....
list 8 0
...
Upvotes: 2
Views: 364
Reputation: 437953
If you can rely on related lines coming in pairs, here's a single-pass awk solution:
awk '{
op1=$2; val1=$3
getline
val2=$3
print $1 " " (op1 == "withop" ? val1 " " val2 : val2 " " val1)
}' file
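As a quick sanity check, the same awk program can be run on four of the sample rows. This is only a sketch: the rows are piped in on standard input instead of being read from a file, and the variable name out is made up for this illustration.

```shell
# Feed two pairs from the sample data to the awk program on stdin.
# "out" is a hypothetical variable name used only for this check.
out=$(printf '%s\n' \
  'printff withop 1' \
  'printff noop 0' \
  'list noop 0' \
  'list withop 8' |
awk '{
  op1=$2; val1=$3          # remember the first line of the pair
  getline                  # load the second line of the pair
  val2=$3
  print $1 " " (op1 == "withop" ? val1 " " val2 : val2 " " val1)
}')

echo "$out"
# printff 1 0
# list 8 0
```

The list pair arrives with noop first, and the withop value (8) still ends up in the second column.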
op1=$2; val1=$3 reads the operation field ($2, the second whitespace-separated field) into variable op1, and the value field ($3, the third field) into variable val1.
getline reads the next line from the input file, which causes its fields to be reflected in $1, ...
While getline is fine in this particular case, since the lines can be assumed to be paired, it has many pitfalls, and its use is rarely the right choice; see http://awk.info/?tip/getline
val2=$3 then stores the second line's value field in variable val2.
print $1 " " (op1 == "withop" ? val1 " " val2 : val2 " " val1) then prints a single output line for the two lines at hand:
$1, the 1st field, is by definition the same on both lines, so we can use the 2nd line's value.
(op1 == "withop" ? val1 " " val2 : val2 " " val1) uses the C-style ternary operator (inline conditional) to print either the 1st line's value before the 2nd line's or vice versa, depending on whether the first line's operation field was withop or not.
Upvotes: 1
Reputation: 2863
Assuming your data is always correct:
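sort data.txt | while read -r a b c && read -r d e f; do echo "$a $f $c"; done
Update: sorting added, as withop and noop rows can be in any order. After sorting, each noop row comes before its withop row, so the withop value ($f) is printed before the noop value ($c).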
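If the two rows for a name might not be adjacent at all, one alternative to sorting is to collect the values per name in awk and print them at the end. The following is a rough sketch of that idea, not taken from the answers above; the array names seen, order and val, and the variable merged, are made up for this illustration, and it assumes every name has exactly one withop row and one noop row.

```shell
# Rows for "list" and "cal" are deliberately interleaved here.
merged=$(printf '%s\n' \
  'list noop 0' \
  'cal withop 6' \
  'list withop 8' \
  'cal noop 3' |
awk '
  # Record each name the first time it appears, to keep input order.
  !($1 in seen) { seen[$1] = 1; order[++n] = $1 }
  # Store the value under its (name, operation) pair.
  { val[$1, $2] = $3 }
  END {
    # Print name, withop value, noop value for each name.
    for (i = 1; i <= n; i++)
      print order[i], val[order[i], "withop"], val[order[i], "noop"]
  }')

echo "$merged"
# list 8 0
# cal 6 3
```

This keeps names in the order they first appear in the input, whereas the sort-based pipeline reorders them alphabetically.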
Upvotes: 0