user3026965

Reputation: 703

Delete everything between two characters in each line

For each line, how can I delete everything between the 5th occurrence of " and the last occurrence of ., noninclusive? The section to be deleted contains any number and any pattern of characters that are problematic for regex, such as :/\()[]|.,?", etc.

For example, the first two lines below are the input and the last two are the desired output:

"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"
"123","abc","DELETE."\T.H.I.S\[.]".1234"

"123456789","xyxyxy",".aaa"
"123","abc",".1234"

I keep failing (possibly because of incorrect escaping of problematic characters?).

Upvotes: 1

Views: 999

Answers (2)

Gurmanjot Singh

Reputation: 10360

Try this regex: ^((?:[^"\n]*"){5})(.*)(\..*)$

Replace each match with \1\3 (group 2, the part to be deleted, is dropped).

Explanation:

  • ^ - asserts the start of the line
  • (?:[^"\n]*") - greedily matches 0+ characters that are neither a " nor a newline, followed by a "
  • {5} - repeats the above match 5 times. Everything matched so far is captured in group 1.
  • (.*) - greedily matches and captures 0+ occurrences of any character except a newline. This is stored in group 2 and is the part that will later be deleted. The match is greedy because we want to reach the last dot; the engine gets there by backtracking in the next step.
  • (\..*) - matches a dot followed by 0+ occurrences of any character except a newline, and stores it in group 3
  • $ - asserts the end of the line
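
For anyone who wants to check the pattern outside an editor, here is a minimal sketch in Python's re module. It assumes the anchored form of the regex with a replacement of \1\3; the names PATTERN and strip_between are made up for this illustration.

```python
import re

# Anchored pattern from the explanation above. re.MULTILINE makes ^ and $
# match at the start and end of every line, not just of the whole string.
PATTERN = re.compile(r'^((?:[^"\n]*"){5})(.*)(\..*)$', re.MULTILINE)

def strip_between(text: str) -> str:
    # Hypothetical helper: keep group 1 (everything up to and including the
    # 5th quote) and group 3 (from the last dot onward); group 2 is dropped.
    return PATTERN.sub(r'\1\3', text)

# The two sample lines from the question, written as raw strings so the
# backslashes need no extra escaping in Python source.
lines = [
    r'"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"',
    r'"123","abc","DELETE."\T.H.I.S\[.]".1234"',
]

print(strip_between("\n".join(lines)))
```

Nothing inside the deleted section has to be escaped, because it is only ever matched by .*. Lines with fewer than five quotes, or with no dot after the fifth quote, do not match and are left unchanged.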

Upvotes: 1

guido

Reputation: 19194

Not sure it is compatible with Notepad++, but this regex should do the job:

((?:"[^"]*){4}").*(\..*)

with replacement:

\1\2

Example and explanation: https://regex101.com/r/yBuUOj/3
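
As a rough way to try this pattern and its \1\2 replacement outside an editor, here is a small Python sketch. The names GUIDO_PATTERN, trim_line and trim_text are hypothetical. It works line by line because in Python the class [^"] also matches a newline, so the unanchored pattern could otherwise run across line breaks.

```python
import re

# Unanchored pattern from this answer: four quote-led chunks plus the 5th
# quote form group 1; the tail starting at the last dot forms group 2.
GUIDO_PATTERN = re.compile(r'((?:"[^"]*){4}").*(\..*)')

def trim_line(line: str) -> str:
    # Hypothetical helper: rewrite only the first match on the line.
    # A line with no match is returned unchanged.
    return GUIDO_PATTERN.sub(r'\1\2', line, count=1)

def trim_text(text: str) -> str:
    # Apply the pattern per line so [^"]* never spans a line break.
    return "\n".join(trim_line(line) for line in text.splitlines())

sample = "\n".join([
    r'"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"',
    r'"123","abc","DELETE."\T.H.I.S\[.]".1234"',
])

print(trim_text(sample))
```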

Upvotes: 1
