Reputation: 651
I have a list that looks something like this:
Item 1
Subitem 1
Item 2
Item 3
Subitem 1
Subitem 2
Subsubitem 1
Item 4
Pretty much, every top-level item has one newline before it, and each subitem has two newlines, and sub-subitems have three, and so on. I want it in a format similar to this:
Item 1
	Subitem 1
Item 2
Item 3
	Subitem 1
	Subitem 2
		Subsubitem 1
Item 4
The regex I have been using in vim is this:
For the first level:
%s/^$\n\(\t\w\)/\t\1/g
For the second level:
%s/^$\n\(\t\t\w\)/\t\1/g
and so on.
What's the better way to do this without having to run a different regex for each level of the list? I'm trying to use vim to do this, but any *nix solution is fine with me.
Upvotes: 0
Views: 84
Reputation: 45087
This can be accomplished with :s and sub-replace-expression (\=).
:%s/^\n\+/\=repeat("\t",len(submatch(0))-1)/
Basically, we count the \n's in each run and replace the run with one fewer \t's.
:%s/^\n\+/.../g finds each sequence of \n's.
%s/.../\={expr}/g replaces the match with the result of evaluating the expression {expr}.
submatch(0) gets the n'th submatch; with 0 it is the same as \0 or & in this case.
repeat({str}, {num}) returns the string {str} repeated {num} times.
len({str}) gets the length of the string {str}.
len(submatch(0))-1 decrements the length, as we want to keep the "good lines" on separate lines.
For more help see:
:h :s
:h sub-replace-expression
:h repeat()
:h len()
:h submatch()
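For anyone who wants to check the counting logic outside Vim, here is a minimal Python sketch of the same idea. It assumes the input layout this answer works with: a top-level item is preceded by one empty line and each deeper level by one more. The function name blank_runs_to_tabs is made up for illustration and is not part of Vim or of the original command.

```python
import re


def blank_runs_to_tabs(text: str) -> str:
    """Replace each run of empty lines with (run length - 1) tabs.

    Mirrors the Vim substitution: ^\\n\\+ finds a run of newlines that
    starts at the beginning of a line, and the replacement is one tab
    fewer than the number of newlines matched.
    """
    return re.sub(
        r"(?m)^\n+",  # run of newlines starting at a line start
        lambda m: "\t" * (len(m.group(0)) - 1),
        text,
    )


if __name__ == "__main__":
    # Assumed layout: two empty lines before a subitem, one before a top-level item.
    sample = "Item 1\n\n\nSubitem 1\n\nItem 2\n"
    print(blank_runs_to_tabs(sample), end="")
```

The newline that ends the previous item is never part of the match, which is why the items stay on separate lines.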
Upvotes: 1
Reputation: 5851
The Perl way:
perl -0777pe 's/\n\K\n+/"\t"x(-1+length $&)/gse'
Using tr and GNU sed:
tr '\n' '\t' | sed -E 's/([^\t])\t\t/\1\n/g'
Output:
Item 1
	Subitem 1
Item 2
Item 3
	Subitem 1
	Subitem 2
		Subsubitem 1
Item 4
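The tr and sed pipeline can be traced step by step with a rough Python equivalent. This is only a sketch: the function name flatten_then_split is hypothetical, and it assumes the input contains no tabs of its own, since existing tabs would be indistinguishable from converted newlines.

```python
import re


def flatten_then_split(text: str) -> str:
    """Rough equivalent of: tr '\\n' '\\t' | sed -E 's/([^\\t])\\t\\t/\\1\\n/g'."""
    # Step 1 (tr): turn every newline into a tab, giving one long line.
    flat = text.replace("\n", "\t")
    # Step 2 (sed): after a non-tab character, the first two tabs become
    # a single newline; any further tabs remain as indentation.
    return re.sub(r"([^\t])\t\t", r"\1\n", flat)


if __name__ == "__main__":
    sample = "Item 1\n\n\nSubitem 1\n\nItem 2\n"
    print(repr(flatten_then_split(sample)))
```

Note that the final newline of the input is also turned into a tab and stays that way, so the result ends with a tab instead of a newline.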
Upvotes: 1
Reputation: 401
One thing you can do is apply the following regex repeatedly:
(?<!\n)\n\t*\n
Repeatedly find and replace all occurrences of this regex
...and so on until there is no match for the regex anywhere.
So you don't have to run a different regex every time, but you will still have to change the replacement text on each pass. You can write a small program to do this repeatedly.
Upvotes: 0
Reputation: 11
That depends on what is executing the regular expression. For example, sed won't do the trick by default because it processes its input line by line. If you are using sed, try replacing it with tr:
tr '\n' '\t'
Upvotes: 1