Reputation: 703
For each line, how do I delete everything between the 5th occurrence of " and the last occurrence of . (noninclusive)? The section to be deleted contains any number and any pattern of characters that are problematic for regex, such as :/\()[]|.,?" etc.
For example, these lines:
"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"
"123","abc","DELETE."\T.H.I.S\[.]".1234"
should become:
"123456789","xyxyxy",".aaa"
"123","abc",".1234"
I keep failing (possibly because of incorrect escaping of problematic characters?).
Upvotes: 1
Views: 999
Reputation: 10360
Try this Regex:
^((?:[^"\n]*"){5})(.*)(\..*)$
Explanation:
^ - asserts the start of the string
(?:[^"\n]*") - matches 0+ occurrences of any character that is neither a " nor a newline, greedily, followed by a "
{5} - repeats the above match 5 times. Everything matched so far is captured in group 1.
(.*) - matches and captures 0+ occurrences of any character except a newline, greedily. This is stored in group 2, and it is the part that will later be removed. The match is greedy because we want to reach the last . on the line, which we get to by backtracking in the next step.
(\..*) - matches a dot followed by 0+ occurrences of any character except a newline, and stores it in group 3
$ - asserts the end of the string
Output: [before and after screenshots]
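For anyone who wants to check the pattern outside the editor, here is a minimal Python sketch. It assumes the replacement keeps groups 1 and 3 (written here as \1\3), which the answer implies but does not spell out, and it uses re.MULTILINE so that ^ and $ anchor at every line. The name strip_between is made up for illustration.

```python
import re

# Anchored pattern from the answer above:
#   group 1 = everything up to and including the 5th double quote
#   group 2 = the part to delete
#   group 3 = the last dot on the line and whatever follows it
PATTERN = re.compile(r'^((?:[^"\n]*"){5})(.*)(\..*)$', re.MULTILINE)


def strip_between(text: str) -> str:
    """Keep groups 1 and 3 on every matching line (assumed replacement: \\1\\3)."""
    return PATTERN.sub(r'\1\3', text)


# Sample lines taken from the question; raw strings keep the backslashes literal.
sample = '\n'.join([
    r'"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"',
    r'"123","abc","DELETE."\T.H.I.S\[.]".1234"',
])

result = strip_between(sample)
print(result)
```

Lines with fewer than five quotes, or with no dot after the fifth quote, do not match and are left unchanged.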
Upvotes: 1
Reputation: 19194
Not sure it is compatible with Notepad++, but this regex should do the job:
((?:"[^"]*){4}").*(\..*)
with replacement:
\1\2
Example and explanation: https://regex101.com/r/yBuUOj/3
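A quick way to try this unanchored pattern is a small Python sketch, applied one line at a time. Processing per line is my own choice here, not part of the answer: in Python, [^"] also matches a newline, so running the pattern over a whole multi-line string could let a match spill into the next line when a line has fewer than five quotes. The name clean_line is hypothetical.

```python
import re

# Unanchored pattern from the answer above:
#   group 1 = text up to and including the 5th double quote
#   group 2 = the last dot on the line and whatever follows it
PATTERN = re.compile(r'((?:"[^"]*){4}").*(\..*)')


def clean_line(line: str) -> str:
    """Apply the answer's replacement \\1\\2 to a single line."""
    # count=1: only the first match on the line is replaced
    return PATTERN.sub(r'\1\2', line, count=1)


# Sample lines taken from the question; raw strings keep the backslashes literal.
lines = [
    r'"123456789","xyxyxy","DELETE///.T.H.I.S.aaa"',
    r'"123","abc","DELETE."\T.H.I.S\[.]".1234"',
]

cleaned = [clean_line(line) for line in lines]
for line in cleaned:
    print(line)
```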
Upvotes: 1