Ktakl

Reputation: 5

Notepad++ remove all non regex'd text

I have a large list of URLs, each containing a unique numeric string that falls between a / and a ?. I would like to remove all text in Notepad++ other than these strings. For example, www.website.com/dsw/fv3n24nv1e4121v/123456789012?fwe=32432fdwe23f3 would end up as only 123456789012

I have figured out that the regex \b\d{12}\b matches the 12 digits; now I just need to remove everything on either side of it. I have found some posts that suggest replacing with \t$1 , $1\n , $1 , and /1; however, all of these do the exact opposite of what I want and remove only the 12-digit string.
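For comparison outside Notepad++, here is a minimal Python sketch of the same idea: instead of deleting everything around the match, it collects the matches themselves. It assumes the question's regex and sample URL; the variable names are illustrative only.

```python
import re

# Sample URL taken from the question.
url = "www.website.com/dsw/fv3n24nv1e4121v/123456789012?fwe=32432fdwe23f3"

# The regex from the question: exactly 12 digits bounded by word boundaries.
ids = re.findall(r"\b\d{12}\b", url)
print(ids)
```

Note that the pattern has no capture group, which is likely why a replacement of $1 in Notepad++ substitutes nothing for the matched digits.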

Upvotes: 0

Views: 329

Answers (2)

Pushpesh Kumar Rajwanshi

Reputation: 18357

You can use this regex and replace each match with an empty string:

^[^ ]*\/|\?[^ ]*$


Explanation:

  • ^[^ ]*\/ --> Matches anything except a space from the start of the string up to and including the last /
  • \?[^ ]*$ --> Similarly, this matches anything except a space, starting from the ? and running to the end of the input.
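As a rough check of the two alternatives explained above, the following Python sketch applies the same pattern to the sample URL from the question. It assumes a single URL per string; the variable names are illustrative and not part of the answer.

```python
import re

# Pattern from the answer: strip everything up to the last "/",
# and everything from the "?" to the end.
pattern = re.compile(r"^[^ ]*\/|\?[^ ]*$")

# Sample URL taken from the question.
url = "www.website.com/dsw/fv3n24nv1e4121v/123456789012?fwe=32432fdwe23f3"

result = pattern.sub("", url)
print(result)  # 123456789012
```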

Upvotes: 1

Toto

Reputation: 91375

  • Ctrl+H
  • Find what: ^.*/([^?\r\n]+).*$
  • Replace with: $1
  • check Wrap around
  • check Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

^               # beginning of line
    .*          # 0 or more any character but newline
    /           # a slash
    ([^?\r\n]+) # group 1, 1 or more any character that is not ? or line break
    .*          # 0 or more any character but newline
$               # end of line

Result for given example:

123456789012
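The same replacement can be sketched in Python for a multi-line list, assuming one URL per line. The second URL below is a made-up example, and re.MULTILINE stands in for Notepad++'s line-by-line handling of ^ and $; Python writes the group reference as \1 rather than $1.

```python
import re

# Same pattern as in "Find what"; MULTILINE makes ^ and $ match at each line.
line_pattern = re.compile(r"^.*/([^?\r\n]+).*$", re.MULTILINE)

# First line is the question's example; the second is hypothetical.
text = (
    "www.website.com/dsw/fv3n24nv1e4121v/123456789012?fwe=32432fdwe23f3\n"
    "www.website.com/a/b/987654321098?x=1"
)

# Replace each whole line with group 1 (the text between the last "/" and "?").
cleaned = line_pattern.sub(r"\1", text)
print(cleaned)
```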

Upvotes: 0
