Rosenberg

Reputation: 2444

Remove everything after a 10 digit number regex

I have millions of lines like the one below:

0032       0001        0020413300               0001           BLUE OVERCOAT               CC                 P

I managed to remove the text before the number by using:

.*(?=\d{10})

To remove everything after I'm trying this:

\d{10}.*

But that selects the number as well as everything after it. How can I keep the number and select only what follows it?

Upvotes: 1

Views: 1991

Answers (1)

Wiktor Stribiżew

Reputation: 626738

Use a capturing group:

(\d{10}).*

and replace with $1. See the regex demo.

Also consider using word boundaries to match the 10-digit number as a whole word, so that a 10-digit run inside a longer number is not matched:

\b(\d{10})\b.*

See another regex demo.
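
For anyone applying this outside a text editor, here is a minimal Python sketch of the same capturing-group replacement. The function name `truncate_after_number` is made up for illustration, and note that Python's `re.sub` writes the backreference as `\1` rather than `$1`.

```python
import re

# Pattern from the answer: a whole-word 10-digit number (group 1),
# followed by the rest of the line.
PATTERN = re.compile(r"\b(\d{10})\b.*")


def truncate_after_number(line: str) -> str:
    """Keep the 10-digit number and drop everything after it (hypothetical helper)."""
    # Python uses \1 for the first capturing group; $1 is the editor-style syntax.
    return PATTERN.sub(r"\1", line)


sample = (
    "0032       0001        0020413300               0001"
    "           BLUE OVERCOAT               CC                 P"
)
print(truncate_after_number(sample))
```

With the sample line from the question, this should leave `0032       0001        0020413300`. Lines with no 10-digit number come back unchanged, since the pattern does not match them.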

If you are working in Notepad++, you can use \K, which discards the text matched up to its position: \b\d{10}\b\K.* and replace with an empty string. Alternatively, use a lookbehind: (?<=\b\d{10}\b).*.

See yet another demo.
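
The lookbehind variant also carries over to Python, sketched below with a hypothetical helper named `strip_after_number`. The `\K` form does not: Python's standard `re` module does not support `\K`, so the lookbehind (or the capturing group) is the option there.

```python
import re

# Lookbehind variant: match only the text that follows a whole-word
# 10-digit number, leaving the number itself outside the match.
AFTER_NUMBER = re.compile(r"(?<=\b\d{10}\b).*")


def strip_after_number(line: str) -> str:
    """Delete everything after the 10-digit number (hypothetical helper)."""
    return AFTER_NUMBER.sub("", line)


print(strip_after_number("0001 0020413300 BLUE OVERCOAT CC P"))
```

Because the number is never part of the match, the replacement is simply an empty string and no backreference is needed.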

Upvotes: 1
