Reputation: 21

Notepad++ regular expression: how to replace every nth occurrence of the string ", " with comma-newline

I have been trying to understand regular expressions so I can replace every nth occurrence of ", " (comma-space) with ",\r\n" (comma, carriage return, line feed). To clarify: I want to find every (n+1)th occurrence of the string and replace it with ,\r\n. There must be a comma at the end of each line.

The original data looks like this:

"aa", "aah", "aal", "aalii", "aardvark", "aardvarks", "aardwolf", "aardwolves", "aargh", "aarrghh", "aasvogel", "aasvogels", "ab", "aba", "abaca", "abacas"

The / characters below mark the beginning and end of the regex.

The expression (, ) matches correctly. I've tried /(, ).{n}/ and /(, ){n}/ with no luck. My desired output is something like this:

"aa", "aah", "aal", "aalii",
"aardvark", "aardvarks", "aardwolf", "aardwolves",
"aarrghh", "aasvogel", "aasvogels", "ab",
"aba", "abaca", "abacas", "abaci",

In this case I've replaced every 5th occurrence of (, ) with a newline. It would be great if the regex could be easily modified to handle any nth occurrence. My total dataset is in the 49,000-word range.

Upvotes: 2

Answers (2)

Jerry Jeremiah

Reputation: 9618

The (...){n} doesn't work like that. If you match defabcdefdefghidef against (def){2}, it will match the defdef in the middle, but the capture group holds only the second instance of def in that match. The first def in the match is lost and isn't captured at all. So (, ){3} will only match , , , (three comma-spaces in a row), and that doesn't exist in your data. You could do ("[^"]+", ){3}, which will match "abc", "def", "ghi", but you can't replace it with \1\r\n because the capture group is only "ghi", so the result would be that "abc", "def", gets deleted.
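
The capture behaviour described above can be checked with a small sketch. It uses Python's re module, which is an assumption made purely for illustration: Notepad++ uses a different regex engine, although a repeated group keeping only its last iteration is common to most engines. The sample strings are the ones from this answer, plus a hypothetical fourth item "jkl" so the effect of the replacement is visible.

```python
import re

# (def){2} matches "defdef", but group 1 only remembers the last repetition.
text = "defabcdefdefghidef"
m = re.search(r"(def){2}", text)
whole_match = m.group(0)      # the full two-repetition match
last_capture = m.group(1)     # only the final "def"
capture_span = m.span(1)      # position of that final "def"

# The same effect on quoted words: replacing with \1 keeps only the last item.
data = '"abc", "def", "ghi", "jkl"'   # "jkl" is a made-up extra item
lossy = re.sub(r'("[^"]+", ){3}', r"\1\r\n", data)

print(repr(whole_match), repr(last_capture), capture_span)
print(repr(lossy))
```

The second result shows the problem: "abc" and "def" are part of the match but not part of the group, so they vanish from the output.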

You need to NOT use {n} at all. Instead of ("[^"]+", ){3} use ("[^"]+", "[^"]+", "[^"]+", ) and replace it with \1\r\n
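
Since the question asks for something easy to adapt to any n, here is a minimal sketch of building the spelled-out pattern programmatically, again in Python rather than Notepad++ (an assumption for illustration). The function name wrap_every_n is hypothetical. It deviates from the pattern above in one small way: the final space is left outside the capture group, so each line ends in a comma rather than a comma followed by a space.

```python
import re

def wrap_every_n(text: str, n: int) -> str:
    """Insert \\r\\n after every n-th quoted item (hypothetical helper)."""
    item = r'"[^"]+", '
    # Repeat the item pattern literally inside ONE group instead of (...){n},
    # so the group captures all n items. The last item stops at its comma;
    # the space after it is matched outside the group and therefore dropped.
    pattern = "(" + item * (n - 1) + r'"[^"]+",) '
    return re.sub(pattern, r"\1\r\n", text)

sample = ('"aa", "aah", "aal", "aalii", "aardvark", "aardvarks", '
          '"aardwolf", "aardwolves", "aargh"')
wrapped = wrap_every_n(sample, 4)
print(wrapped)
```

The printed pattern for a given n (the value of pattern inside the function) could be pasted into the Notepad++ Find field with \1\r\n as the replacement, assuming regular-expression search mode is selected.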

Online example: https://www.myregextester.com/?r=3d00df0a

Upvotes: 1