Reputation: 869
I have files whose lines begin with values I'd like to match in contiguous groups, then remove the newline characters between them. "Contiguous groups" means that we only want to remove newlines between adjacent matching lines. Diffs provide a handy example: say we wanted to remove the newline characters between added lines, i.e. all lines beginning with a plus sign (+).
Adapting an answer to another question gets close, but it only joins pairs of lines instead of continuing to join all the following matching lines:
sed '/^+/N;s/\n+/ /' path/to/file.diff
(Note that the example input and expected output also concatenate lines beginning with a space. That was a formatting mistake on my part, but since very helpful answers have been written to address the intent expressed in the output, I'm leaving it as is so as not to invalidate them.)
Example input:
--- some/file/path 2021-02-21 16:33:40.000000000 -0600
+++ another/file/path 2021-02-21 16:33:52.000000000 -0600
@@ -32,7 +32,7 @@
this
sentence
-lost
-many
+gained
+several
+other
words
@@ -91,9 +91,10 @@
this
one
-just
-lost
-many
Desired output:
--- some/file/path 2021-02-21 16:33:40.000000000 -0600
+++ another/file/path 2021-02-21 16:33:52.000000000 -0600
@@ -32,7 +32,7 @@
this sentence
-lost
-many
+gained several other
words
@@ -91,9 +91,10 @@
this one
-just
-lost
-many
Upvotes: 1
Views: 90
Reputation: 84569
This awk solution stretches the notion of a one-liner a bit, but it isn't too bad by long one-liner standards, e.g.
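awk '
!found && /^\+[^+]/ { printf "%s", $0; found=1; next }   # first + line starts a group
/^[^+]/ { printf "%s%s\n", (found ? "\n" : ""), $0; found=0; next }   # other lines close any open group
found { printf " %s", substr($0,2); next }   # later + lines join the group, minus the +
{ print }
END { if (found) print "" }   # close a group that runs to end of file
' file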
Example Use/Output
With your input in the file creatively named file, you can select-copy and middle-mouse-paste the command into an xterm with the file in the current directory, and you would get:
$ awk '
> !found && /^\+[^+]/ { printf "%s", $0; found=1; next }
> /^[^+]/ { printf "%s%s\n", (found ? "\n" : ""), $0; found=0; next }
> found { printf " %s", substr($0,2); next }
> { print }
> END { if (found) print "" }
> ' file
--- some/file/path 2021-02-21 16:33:40.000000000 -0600
+++ another/file/path 2021-02-21 16:33:52.000000000 -0600
@@ -32,7 +32,7 @@
this
sentence
-lost
-many
+gained several other
words
@@ -91,9 +91,10 @@
this
one
-just
-lost
-many
Note: your problem statement discusses only concatenating lines beginning with '+', but your expected output also joins the first two lines after the diff position information. It is unclear whether you want one, the other, or both.
Upvotes: 1
Reputation: 58463
This might work for you (GNU sed):
sed -E ':a;N;s/^(([+ ]).*)\n\2/\1 /;$!ta;P;D' file
Append the following line.
If the first line begins with + or a space and the second line begins with the same character, remove the newline and the repeated character and replace them with a single space.
Repeat the process until a match fails.
Print/delete the first line and repeat.
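The steps above can be read off more easily when the same script is spread over several lines. The following is only a sketch: the function name join_same_prefix is made up for illustration, the commands are the ones from the one-liner, and GNU sed is assumed (as in the answer).

```shell
# Hypothetical wrapper around the one-liner above, one sed command per line.
# Assumes GNU sed (-E, and N printing the pattern space at end of input).
join_same_prefix() {
  sed -E '
    :a
    # Append the following input line to the pattern space.
    N
    # If the first line starts with "+" or a space and the appended line
    # starts with the same character, replace the newline and that
    # repeated character with a single space.
    s/^(([+ ]).*)\n\2/\1 /
    # If the substitution succeeded and this is not the last line, loop.
    $!ta
    # Print up to the first newline, delete it, restart without reading.
    P
    D
  ' "$@"
}

# Small demonstration on a diff-like fragment.
printf '%s\n' '@@ -1,3 +1,3 @@' ' this' ' sentence' '-lost' '+gained' '+several' ' words' |
  join_same_prefix
```

On that fragment the two context lines become " this sentence" and the two added lines become "+gained several", while the other lines pass through unchanged.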
Upvotes: 2
Reputation: 67497
Something like this should work, although it keeps the leading + of each joined line:
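$ awk '{p=substr($0,1,1);                        # first character of the line
if(NR>1 && p!=pp && pp!="-") printf "\n";        # prefix changed: end the joined line
pp=p;
printf "%s%s",$0,p=="-"?"\n":""}                 # lines starting with - are never joined
END{if(NR && pp!="-") printf "\n"}' file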
--- some/file/path 2021-02-21 16:33:40.000000000 -0600
+++ another/file/path 2021-02-21 16:33:52.000000000 -0600
@@ -32,7 +32,7 @@
this sentence
-lost
-many
+gained+several+other
words
@@ -91,9 +91,10 @@
this one
-just
-lost
-many
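If the joined lines should be separated by spaces and lose the repeated prefix, as in the desired output of the question, the same "compare the first character with the previous line" idea can be adapted. This is a rough sketch rather than the answer's own code; the function name join_runs_awk is made up, and it assumes that only lines starting with + or a space should be joined.

```shell
# Hypothetical variant: join runs of lines that share a "+" or " " prefix,
# dropping the repeated prefix and separating the pieces with a space.
join_runs_awk() {
  awk '
    {
      p = substr($0, 1, 1)                  # first character of this line
      joinable = (p == "+" || p == " ")     # assumption: only these are joined
      if (NR > 1 && joinable && p == prev) {
        printf " %s", substr($0, 2)         # continue the previous output line
      } else {
        if (NR > 1) printf "\n"             # finish the previous output line
        printf "%s", $0
      }
      prev = p
    }
    END { if (NR > 0) printf "\n" }         # terminate the final line
  ' "$@"
}

printf '%s\n' '@@ -1 +1 @@' ' this' ' sentence' '-lost' '-many' '+gained' '+several' '+other' ' words' |
  join_runs_awk
```

Like the sed answer that matches [+ ], this would also join a +++ header line with an added line that immediately follows it, which does not occur in the example input.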
Upvotes: 0
Reputation: 10133
A sed one-liner which will join adjacent lines beginning with a + character:
sed -e ':a' -e '$!N;s/^\(+.*\)\n+/\1 /;ta' -e 'P;D' file
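To see what the loop (the :a label and the ta branch) adds, it may help to run the attempt from the question and this one-liner side by side on a run of four + lines. The function names join_pairs and join_runs are made up for this sketch; the sed commands themselves are copied from the question and from the one-liner above.

```shell
# Attempt from the question: joins one pair, then starts over on the next line.
join_pairs() {
  sed '/^+/N;s/\n+/ /' "$@"
}

# One-liner from this answer: keeps appending and substituting until a
# following line no longer starts with "+".
join_runs() {
  sed -e ':a' -e '$!N;s/^\(+.*\)\n+/\1 /;ta' -e 'P;D' "$@"
}

sample=$(printf '%s\n' 'x' '+a' '+b' '+c' '+d' 'y')

echo 'pairs only:'
printf '%s\n' "$sample" | join_pairs
echo 'whole run:'
printf '%s\n' "$sample" | join_runs
```

The first command prints "+a b" and "+c d" on separate lines, while the second prints a single "+a b c d" line; the x and y lines are untouched in both cases.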
Upvotes: 1