TheNovice

Reputation: 1297

Using VSCode regex, match and remove multiple lines

I have a massive text document (open in VS Code) that looks like this and continues in the same pattern for several thousand lines. In essence, each record is an integer, then a float that always starts with 0.00, then four blank lines:

468653564
0.0013348548




160919876
0.0015948548




239109587
0.0010948548




190959199
0.0023948548




163220290
0.001348548

How would I format this document to remove the blank lines and the float, so I end up with something that looks like this:

468653564
160919876
239109587
190959199
163220290

The pattern 0.00.* seems to work fine for the first step, and ^$\n for the second, but is there a way to do it all in one fell swoop?

Upvotes: 3

Views: 4640

Answers (3)

l'L'l

Reputation: 47302

To handle multiple regex patterns in one go, separate them with the alternation ("or") operator (|):

0\.00.*\n|^$\n

This essentially says: match either 0.00 followed by the rest of its line, OR a blank line.

A slightly more efficient pattern might be to anchor at the start of the line and look for a digit \d (without being specific about which one) followed by a period and then additional digits, as it should take fewer steps:

^(\n|\d\.\d+\n)
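As a rough check outside the editor, both patterns from this answer can be tried on a short sample. This is only a sketch: it uses Python's `re` module rather than VS Code's own regex engine, the sample data is abbreviated from the question, and it assumes Unix line endings and a trailing newline after the last float. The names `SAMPLE`, `ALTERNATION`, `ANCHORED`, and `strip_lines` are made up for illustration.

```python
import re

# Two records from the question: integer, float, four blank lines.
# Assumption: "\n" line endings and a newline after the final float.
SAMPLE = (
    "468653564\n"
    "0.0013348548\n"
    "\n\n\n\n"
    "160919876\n"
    "0.0015948548\n"
)

# MULTILINE makes ^ and $ match at line boundaries, which is
# how the editor's search treats them.
ALTERNATION = re.compile(r"0\.00.*\n|^$\n", re.MULTILINE)
ANCHORED = re.compile(r"^(\n|\d\.\d+\n)", re.MULTILINE)


def strip_lines(pattern, text):
    """Replace every match with the empty string, like Replace All."""
    return pattern.sub("", text)


if __name__ == "__main__":
    print(strip_lines(ALTERNATION, SAMPLE), end="")
    print(strip_lines(ANCHORED, SAMPLE), end="")
```

On this sample both patterns leave only the two integer lines. Note that both require a newline after the float, so a final float with no line break after it would be left in place.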

Upvotes: 2

Kenneth K.

Reputation: 3039

You can make the trailing line breaks optional and match them greedily:

0\.00\d+(\r?\n)*

The star modifies the group to be "zero or more". This handles the last float, which has no line breaks after it at the end of the data, as well as the line breaks you want to remove. The \r is marked optional just to account for the difference between Unix-style and Windows-style line endings. The rest of the pattern is pretty much as written: find a zero followed by a decimal point followed by a double zero followed by one or more (+) digits followed by the optional line breaks.
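A minimal sketch of that behaviour, again using Python's `re` module as a stand-in for the editor's engine (an assumption, since the two engines are not identical). The helper name `remove_floats` and the shortened sample strings are hypothetical.

```python
import re

# The float, then zero or more line breaks (LF or CRLF).
FLOAT_AND_BREAKS = re.compile(r"0\.00\d+(\r?\n)*")


def remove_floats(text):
    """Delete each float together with any line breaks that follow it."""
    return FLOAT_AND_BREAKS.sub("", text)


# Unix-style sample; the last float has no trailing line break.
UNIX_SAMPLE = "468653564\n0.0013348548\n\n\n\n\n160919876\n0.001348548"

# The same data with Windows-style line endings.
WINDOWS_SAMPLE = UNIX_SAMPLE.replace("\n", "\r\n")

if __name__ == "__main__":
    print(repr(remove_floats(UNIX_SAMPLE)))
    print(repr(remove_floats(WINDOWS_SAMPLE)))
```

Because the group is "zero or more", the final float is removed even though nothing follows it, and the line break after each integer is kept because it sits before the float rather than after it.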

Upvotes: 1

CertainPerformance

Reputation: 371213

One possibility is

^(?!\d{2}).*\n

and replace with the empty string. It matches all lines that don't start with 2 digits.
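The negative lookahead can be sketched the same way. This is an approximation using Python's `re` module rather than VS Code itself, and it assumes Unix line endings; `NOT_TWO_DIGITS` and `keep_integer_lines` are illustrative names only.

```python
import re

# Match any whole line (including its newline) that does not
# begin with two digits. Blank lines and "0.00..." lines qualify.
NOT_TWO_DIGITS = re.compile(r"^(?!\d{2}).*\n", re.MULTILINE)


def keep_integer_lines(text):
    """Remove every line that does not start with two digits."""
    return NOT_TWO_DIGITS.sub("", text)


# Assumption: the sample ends with a newline after the last float.
LOOKAHEAD_SAMPLE = (
    "468653564\n"
    "0.0013348548\n"
    "\n\n\n\n"
    "160919876\n"
    "0.0015948548\n"
)

if __name__ == "__main__":
    print(keep_integer_lines(LOOKAHEAD_SAMPLE), end="")
```

The float lines start with a digit followed by a period, so the lookahead lets them match, while the integer lines are skipped. Since the pattern ends in \n, a last line without a trailing line break would not be removed.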

Upvotes: 1
