user1042891

Reputation: 43

join 2 consecutive rows under condition

I have 5 lines like:

typeA;pointA1
typeA;pointA2
typeA;pointA3
typeB;pointB1
typeB;pointB2

The desired output would be:

typeA;pointA1;typeA;pointA2
typeA;pointA2;typeA;pointA3
typeB;pointB1;typeB;pointB2

Is it possible to use sed or awk for this purpose?

Upvotes: 1

Views: 268

Answers (3)

potong

Reputation: 58568

This GNU sed solution might work for you:

 sed -rn '1{h;b};H;x;/^([^;]*);.*\n\1;/!{s/.*\n//;x;d};s/\n/;/p' source_file

This assumes the input contains no blank lines. If it does, strip them first with sed '/^$/d' source_file and pipe the result into the command above.

EDIT:

On reflection the above solution is far too elaborate and can be condensed to:

 sed -ne '1{h;b};H;x;/^\([^;]*\);.*\n\1;/s/\n/;/p' source_file

Explanation:

The -n option suppresses automatic printing. The first line is copied to the hold space (HS, an extra register) and b then branches to the end of the script, ending that cycle. Every subsequent line is appended to the HS, and the HS is then swapped with the pattern space (PS, the register holding the current line). At this point the PS contains the previous and current lines separated by a newline, while the HS holds just the current line. The PS is checked to see whether the first field of each line is identical. If so, the newline separating the two lines is replaced by a ; and, because the substitution succeeded, the PS is printed. On the next cycle the new input line replaces the PS, and the HS still holds what is now the previous line.
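
The same steps can be laid out as a commented multi-line script, which may be easier to follow. This is only a sketch: the temporary file and the join_pairs wrapper are illustrative names, and the comparison is anchored at the newline (\n\1;) so that only the first field of the second line is compared.

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
trap 'rm -f "$input"' EXIT
printf '%s\n' 'typeA;pointA1' 'typeA;pointA2' 'typeA;pointA3' \
              'typeB;pointB1' 'typeB;pointB2' > "$input"

# join_pairs is a hypothetical wrapper around the sed script.
join_pairs() {
  sed -n '
    # First line: copy it to the hold space and end this cycle.
    1{
      h
      b
    }
    # Append the current line to the hold space: previous, newline, current.
    H
    # Swap: pattern space = both lines, hold space = current line only.
    x
    # If both lines begin with the same first field, join them and print.
    /^\([^;]*\);.*\n\1;/s/\n/;/p
  ' "$1"
}

join_pairs "$input"
```

With the sample data this should print the three joined lines asked for in the question.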

Upvotes: 0

Kent

Reputation: 195239

paste could be useful in this case; it can save a lot of code:

sed '1d' file|paste -d";" file -|awk -F';' '$1==$3'

See the test below:

kent$  cat a
typeA;pointA1
typeA;pointA2
typeA;pointA3
typeB;pointB1
typeB;pointB2

kent$  sed '1d' a|paste -d";" a -|awk -F';' '$1==$3'
typeA;pointA1;typeA;pointA2
typeA;pointA2;typeA;pointA3
typeB;pointB1;typeB;pointB2
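
To see why the final awk filter is needed, it may help to look at what paste produces before filtering. A sketch using the same sample data (the temporary file and the pair_lines helper are illustrative names only):

```shell
# Sample input from the question, written to a temporary file.
input=$(mktemp)
trap 'rm -f "$input"' EXIT
printf '%s\n' 'typeA;pointA1' 'typeA;pointA2' 'typeA;pointA3' \
              'typeB;pointB1' 'typeB;pointB2' > "$input"

# Pair every line with the line that follows it.
pair_lines() {
  sed '1d' "$1" | paste -d';' "$1" -
}

pair_lines "$input"
# typeA;pointA1;typeA;pointA2
# typeA;pointA2;typeA;pointA3
# typeA;pointA3;typeB;pointB1   <- types differ, removed by the awk filter
# typeB;pointB1;typeB;pointB2
# typeB;pointB2;                <- last line has no successor, also removed

# Keep only the pairs whose first and third fields (the two types) match.
pair_lines "$input" | awk -F';' '$1 == $3'
```

The awk condition $1 == $3 drops both the mixed-type pair and the trailing half-empty line, since its third field is empty.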

Upvotes: 1

Michael J. Barber

Reputation: 25052

This is easy with awk:

awk -F';' '$1 == prevType { printf("%s;%s;%s\n", $1, prevPoint, $0) } { prevType = $1; prevPoint = $2 }'

I've assumed that the blank lines between the records are not part of the input; if they are, just run the input through grep -v '^$' before awk.
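
If the blank lines are part of the input, another option is to skip them inside awk instead of adding a separate grep. A sketch, assuming empty lines should simply be ignored; join_same_type, prevLine and seen are names made up for this example:

```shell
# join_same_type is a hypothetical wrapper; it reads files or standard input.
join_same_type() {
  awk -F';' '
    # Skip completely empty lines (assumed to be separators, not data).
    $0 == "" { next }
    # Same type as the previous record: print both records on one line.
    seen && $1 == prevType { print prevLine ";" $0 }
    # Remember this record for the next comparison.
    { prevType = $1; prevLine = $0; seen = 1 }
  ' "$@"
}

# Sample input from the question, with blank lines between the records.
printf '%s\n' 'typeA;pointA1' '' 'typeA;pointA2' '' 'typeA;pointA3' '' \
              'typeB;pointB1' '' 'typeB;pointB2' | join_same_type
```

Storing the whole previous line rather than just its second field keeps the output intact if a record ever has more than two fields.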

Upvotes: 2
