Reputation:

Regex: find last occurrence of word in string

I need to find the last occurrence of a word in a string (and replace it). So in the following sentence I would be looking for the second "chocolate".

I love milk chocolate but I hate white chocolate.

How can that be achieved with a regular expression? Could you please give me some explanation? Thanks.

Upvotes: 1

Answers (3)

ghoti

Reputation: 46826

If you want to match the second occurrence of any distinct word, you may be able to use a backreference, depending on the language and regex implementation you're in.

For example, in sed, you might do the following:

sed 's/\(.*\([[:<:]][[:alpha:]]*[[:>:]]\).*\)\(\2\)\(.*\)/\1russians\4/'

Breaking this down for easier reading, it looks like this:

s/ - substitute in sed
\(.*\([[:<:]][[:alpha:]]*[[:>:]]\).*\)\(\2\)\(.*\) - the search RE. Not really so complex....
- [[:<:]] and [[:>:]] are word boundaries in BSD-style regex implementations (GNU sed spells them \< and \> instead),
- [[:alpha:]] is the class of alphabetical characters (words)
- \( and \) surround atoms for use in backreferences, in BRE (this is sed, remember)
\1russians\4 - replacement string consists of the first (outer) parenthesized backreference from the RE, followed by the replacement word, followed by the trailing characters.

For example:

$ t="I love milk chocolate but I hate white chocolate."
$ sed 's/\(.*\([[:<:]][[:alpha:]]*[[:>:]]\).*\)\(\2\)\(.*\)/\1russians\4/' <<<"$t"
I love milk chocolate but I hate white russians.
$ t="In a few years, your twenty may be worth twenty bucks."
$ sed 's/\(.*\([[:<:]][[:alpha:]]*[[:>:]]\).*\)\(\2\)\(.*\)/\1fifty\4/' <<<"$t"
In a few years, your twenty may be worth fifty bucks.
$
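
If the word to replace is known in advance, as "chocolate" is in the question, a simpler approach may be enough: a greedy .* in front of the literal word. This is only a minimal sketch, assuming a POSIX sed and a single-line input; the variable names t and result are just for illustration.

```shell
# Sample input taken from the question; the target word is known up front.
t="I love milk chocolate but I hate white chocolate."

# \(.*\) is greedy, so it swallows as much of the line as it can.
# The literal "chocolate" after it can therefore only match the LAST
# occurrence on the line. \1 puts the swallowed prefix back.
result=$(printf '%s\n' "$t" | sed 's/\(.*\)chocolate/\1russians/')

printf '%s\n' "$result"
```

Note that this matches "chocolate" as a substring, so it would also hit the last "chocolates". Adding your sed's word-boundary markers around the word should tighten that up.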

Upvotes: 0