Baklap4

Reputation: 4202

Regex: multiple occurrences of new line, tabbed outcome

I have a dataset whose values I need to get onto one line, separated from each other by tabs.

Say I have this dataset:

test
pizza

pudding
cheese


Newt
somethingelse

otherstuf


pokemon
somedate
derp

Notice the difference between the 2 and 1 new lines.

When there are 2 new lines, a new row should be started. When there is 1 new line, the next value stays in the same row, with an empty value in between. This dataset would become this:

test    pizza         pudding    cheese
Newt    somethingelse    otherstuf
pokemon    somedate     derp

Again, notice the first line in this example, where the gap between pizza and pudding is 1 new line instead of 2.

I've tried matching a new line with ^\n and replacing it with \t, but this puts everything tabbed on one line, which is not what I want. I'm using Sublime Text for this.

Upvotes: 0

Views: 35

Answers (3)

Casimir et Hippolyte

Reputation: 89629

Choose the correct newline sequence for your file and use two passes. Example with a Windows newline sequence:

  • replace \r\n with \t
  • replace \t\t\t with \r\n
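The two passes above can be sketched outside the editor as well. This is a minimal Python illustration, not part of the original answer; the function name `two_pass_tabulate` and the sample string are hypothetical, and the input is assumed to use `\r\n` consistently.

```python
def two_pass_tabulate(text: str, newline: str = "\r\n") -> str:
    """Hypothetical helper mirroring the two find/replace passes."""
    # Pass 1: turn every newline sequence into a tab.
    flattened = text.replace(newline, "\t")
    # Pass 2: three tabs in a row came from two blank lines,
    # so they mark a row boundary and become a newline again.
    return flattened.replace("\t\t\t", newline)


# Assumed sample: the first part of the question's dataset with \r\n endings.
sample = (
    "test\r\npizza\r\n\r\npudding\r\ncheese\r\n\r\n\r\n"
    "Newt\r\nsomethingelse\r\n\r\notherstuf"
)
result = two_pass_tabulate(sample)
print(repr(result))
```

A single blank line survives as two tabs, which is what produces the empty column between pizza and pudding.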

Upvotes: 0

revo

Reputation: 48751

When you talk about 1 new line you actually mean 1 blank line, and reaching a blank line requires checking for two newline characters. The same goes for your 2 new lines: two blank lines mean three newline characters.

Find: (?<!\s)\n(?=\S)|\n{2}

Replace with: \t
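To see what this find/replace pair does to the question's data, here is a small Python sketch. It is only an illustration: the name `join_with_tabs` and the sample string are hypothetical, and the input is assumed to use `\n` line endings. Python's `re` module accepts this pattern unchanged.

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(r"(?<!\s)\n(?=\S)|\n{2}")


def join_with_tabs(text: str) -> str:
    """Hypothetical helper: replace every match with a tab."""
    return PATTERN.sub("\t", text)


# Assumed sample: the first part of the question's dataset with \n endings.
sample = "test\npizza\n\npudding\ncheese\n\n\nNewt\nsomethingelse"
result = join_with_tabs(sample)
print(repr(result))
```

On this sample a single blank line collapses to one tab, and each row break keeps one newline preceded by a tab, so trimming trailing tabs afterwards may be worthwhile.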

Just for showing off my template.

Upvotes: 0

Wiktor Stribiżew

Reputation: 627341

I suggest

(.)\R{1,2}+(?!\R)

and replace with $1\t. This way, you will match only 1 or 2 linebreaks and replace them with a tab. The (.) makes sure there is some data on the line before the first linebreak.

Pattern details:

  • (.) - Group 1 capturing a character other than a newline
  • \R{1,2}+ - 1 or 2 linebreaks...
  • (?!\R) - ...that are not followed by a linebreak.

If you allow merging empty lines, you may try

(?<!\n)\R{1,2}+(?!\R)

and replace with a \t.

Then, to replace 3 linebreaks with one, use

\R{3}

and replace with \r\n or \n, or \r, depending on your OS/requirements.
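For checking the two steps outside Sublime Text, here is a rough Python port. It rests on some assumptions: Python's `re` module has no `\R`, so the sketch narrows it to `\n` and expects `\n` line endings; the possessive `+` is dropped, which does not change the result for a plain `\n`; and the names `merge_lines` and `sample` are hypothetical.

```python
import re


def merge_lines(text: str) -> str:
    """Hypothetical port of the two replacements, assuming \\n line endings."""
    # Step 1: a character followed by 1 or 2 newlines that are not
    # followed by another newline -> keep the character, add a tab.
    step1 = re.sub(r"(.)\n{1,2}(?!\n)", r"\1\t", text)
    # Step 2: the remaining runs of 3 newlines become a single row break.
    return re.sub(r"\n{3}", "\n", step1)


# Assumed sample: the first part of the question's dataset with \n endings.
sample = "test\npizza\n\npudding\ncheese\n\n\nNewt\nsomethingelse"
result = merge_lines(sample)
print(repr(result))
```

In this port a single blank line becomes one tab, so the empty column from the question's expected output is not kept.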

Upvotes: 1
