bitshift

Reputation: 6842

How to find matched lines with string pattern using variable length

Using Notepad++ and regex, I need to find all lines from a large (3MB) text file with a pattern like this:
"Could not find store with warehouseid: 12 and zipcode 55555"

The number following "warehouseid:" can be one or two digits, whereas "zipcode" is always followed by a space and a five-digit ZIP code.

I want to select out all the substrings that include "warehouseid: __ and zipcode _____", so I would end up with a list of substrings like this:

"warehouseid: 14 and zipcode 44444"
"warehouseid: 5 and zipcode 44444"
"warehouseid: 44 and zipcode 44444"
"warehouseid: 44 and zipcode 44444"
"warehouseid: 44 and zipcode 44444"

What I've started with is this:
^.(warehouseid:).$

but now I want to select the next n characters, starting with "warehouseid".

Upvotes: 1

Views: 39

Answers (2)

Wiktor Stribiżew

Reputation: 626758

You may use

Find What:    .*(warehouseid:\h*\d{1,2})\b.*(zipcode\h*\d{5})\b.*|(.+)\R*
Replace With: (?{1}$1 and $2:)

Details

  • .* - any 0+ chars other than line break chars, as many as possible
  • (warehouseid:\h*\d{1,2}) - Group 1: warehouseid:, then 0+ horizontal whitespace chars, then 1 or 2 digits
  • \b - word boundary to ensure only 1 or 2 digits are captured into Group 1
  • .* - any 0+ chars other than line break chars, as many as possible
  • (zipcode\h*\d{5}) - Group 2: zipcode, then 0+ horizontal whitespace chars, then 5 digits
  • \b - word boundary to ensure only 5 digits are captured into Group 2
  • .* - any 0+ chars other than line break chars, as many as possible
  • | - or
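  • (.+)\R* - Group 3: any 1+ chars other than line break chars, followed by 0+ line break sequences, i.e. a whole line that does not meet the criteria, together with its line ending.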

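The (?{1}$1 and $2:) replacement pattern is conditional: if Group 1 matched, the line is replaced with the Group 1 value, the literal text " and ", and the Group 2 value; otherwise the replacement is empty, which removes the whole line that does not match the criteria.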
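
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the find-and-replace above. It is an approximation, not the Notepad++ behaviour itself: Python's re module has no \h, \R, or conditional replacement syntax, so [ \t], (?:\r?\n) and a replacement function are swapped in. The function name keep_matches and the sample text are made up for illustration.

```python
import re

# Assumption: [ \t] stands in for \h and (?:\r?\n) for \R,
# because Python's re module does not support those escapes.
PATTERN = re.compile(
    r".*(warehouseid:[ \t]*\d{1,2})\b.*(zipcode[ \t]*\d{5})\b.*|(.+)(?:\r?\n)*"
)


def keep_matches(text: str) -> str:
    """Keep 'warehouseid: N and zipcode NNNNN' per matching line, drop other lines."""

    def repl(m: re.Match) -> str:
        # Mirrors the conditional replacement (?{1}$1 and $2:):
        # Group 1 set -> "<group 1> and <group 2>", otherwise empty.
        if m.group(1) is not None:
            return f"{m.group(1)} and {m.group(2)}"
        return ""

    return PATTERN.sub(repl, text)


# Hypothetical sample input, not taken from the real file.
sample = (
    "Could not find store with warehouseid: 12 and zipcode 55555\n"
    "unrelated log line\n"
    "Could not find store with warehouseid: 5 and zipcode 44444\n"
)
result = keep_matches(sample)
print(result)
```

As in the Notepad++ version, the first alternative does not consume the line break, so matching lines stay on their own lines, while the second alternative swallows a non-matching line together with its line ending.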


Upvotes: 1

epinal

Reputation: 1465

This finds the whole line and gives you the "warehouseid: __ and zipcode _____" part as a group (selection):

"Could not find store with (warehouseid: \d{1,2} and zipcode \d{5})"

Check the explanation here.

If you want to get the warehouseid "XX" and the zipcode "XXXXX" as separate groups, then use @Wiktor Stribiżew's solution.
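
As a rough illustration of how the single group could be collected outside Notepad++, here is a small Python sketch using the same pattern. The function name extract_substrings and the sample text are hypothetical; it assumes the message always reads "Could not find store with ..." exactly as in the question.

```python
import re

# Same pattern as above; the one capturing group holds the wanted substring.
PATTERN = re.compile(
    r"Could not find store with (warehouseid: \d{1,2} and zipcode \d{5})"
)


def extract_substrings(text: str) -> list[str]:
    """Return every 'warehouseid: N and zipcode NNNNN' substring found in text."""
    # With exactly one group, findall returns the group contents as strings.
    return PATTERN.findall(text)


# Hypothetical sample input, not taken from the real file.
log = (
    "Could not find store with warehouseid: 14 and zipcode 44444\n"
    "some other message\n"
    "Could not find store with warehouseid: 5 and zipcode 44444\n"
)
substrings = extract_substrings(log)
for s in substrings:
    print(s)
```

Note that this pattern has no trailing \b, so a longer run of digits after "zipcode" would still match on its first five digits.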

Upvotes: 0
