Reputation: 33
I have a file that, occasionally, has split lines.
A split is signaled by two consecutive lines that contain alphabetic characters.
5
00:00:00,000 --> 00:00:00,000
Alphabetic characters
Alphabetic characters
6
00:00:00,000 --> 00:00:00,000
Alphabetic characters
7
00:00:00,000 --> 00:00:00,000
Alphabetic characters
Alphabetic characters
8
00:00:00,000 --> 00:00:00,000
Alphabetic characters
.....
I'd like to join the split lines back together:
5
00:00:00,000 --> 00:00:00,000
Alphabetic characters Alphabetic characters
6
00:00:00,000 --> 00:00:00,000
Alphabetic characters
7
00:00:00,000 --> 00:00:00,000
Alphabetic characters Alphabetic characters
8
00:00:00,000 --> 00:00:00,000
Alphabetic characters Alphabetic characters
.....
using sed. I'm not clear on how to join a line with the preceding one. Any suggestions?
Upvotes: 0
Views: 114
Reputation: 203712
sed is for simple substitutions on individual lines, that is all. For anything else you should be using awk:
$ awk '/[[:alpha:]]/{buf = (buf=="" ? $0 : buf OFS $0); next} buf!=""{print buf; buf=""} {print} END{if (buf!="") print buf}' file
5
00:00:00,000 --> 00:00:00,000
Alphabetic characters Alphabetic characters
6
00:00:00,000 --> 00:00:00,000
Alphabetic characters
7
00:00:00,000 --> 00:00:00,000
Alphabetic characters Alphabetic characters
8
00:00:00,000 --> 00:00:00,000
Alphabetic characters
.....
The above will work robustly, portably, and efficiently on all UNIX systems with all POSIX-compatible awks.
Upvotes: 1
Reputation: 15461
Another approach with sed:
sed '/^[[:alpha:]]/{N;/\n[[:alpha:]]/s/\n/ /}' file
When a line starting with an alphabetic character is found, the N command appends the next line to the pattern space. If the embedded newline is then followed by an alphabetic character, it is replaced with a space.
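To try this on a small input, here is a minimal sketch. The wrapper name join_split and the sample data are assumptions made for illustration, and the one-liner is used as written, which assumes GNU sed (other seds may want a `;` before the closing brace).

```shell
# Hypothetical wrapper around the one-liner above (GNU sed syntax assumed).
join_split() {
  sed '/^[[:alpha:]]/{N;/\n[[:alpha:]]/s/\n/ /}' "$@"
}

# Made-up sample: entry 5 has its text split over two lines, entry 6 does not.
printf '%s\n' \
  '5' '00:00:00,000 --> 00:00:00,000' 'Alphabetic' 'characters' \
  '6' '00:00:00,000 --> 00:00:00,000' 'Alphabetic characters' \
  '7' | join_split
```

The two text lines of entry 5 should come out as a single line, while the text of entry 6 is left as it is because the line after it starts with a digit.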
Upvotes: 1
Reputation: 1030
sed '$!N;/^[a-zA-Z ].*\n[a-zA-Z ]/s/\n/ /;P;D'
Slide a two-line window over the input: when the first line starts with an alphabetic character or a space and the second does too, join them with a space. P and D then print and drop the first line of the window, so every adjacent pair of lines is checked, not only lines 1-2, 3-4, and so on.
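One case worth checking is a split pair that does not start on an odd line number, since a plain N reads lines two at a time. A small sketch using the N/P/D sliding-window idiom follows. The helper name join_pairs and the sample lines are assumptions made for illustration.

```shell
# Hypothetical helper: slide a two-line window over the input.
join_pairs() {
  # $!N  append the next line, except on the last line
  # s    join the two lines when both start with a letter or a space
  # P;D  print and drop the first line, then restart with what is left
  sed '$!N;/^[a-zA-Z ].*\n[a-zA-Z ]/s/\n/ /;P;D' "$@"
}

# The split text sits on lines 2 and 3, so it straddles a fixed (1,2),(3,4) pairing.
printf '%s\n' '00:00:01,000 --> 00:00:02,000' 'Hello' 'world' '6' | join_pairs
```

With this input the timestamp and the trailing 6 pass through unchanged, and Hello and world are joined into one line.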
Upvotes: 1