tchakravarty

Reputation: 10954

R: Replacing multiple matches with matches

I am trying to shorten some regex matches in strings. Here is an example:

vYears = c('Democrat 2000-2004',
                 'Democrat 2004-2008',
                 'Democrat 2008-2012',
                 'Republican 2000-2004',
                 'Republican 2004-2008',
                 'Republican 2008-2012',
                 'Tossup')

I can match the expression that I want and extract the matches like so:

grepYears = gregexpr('20[0-9]{2}', vYears)
regmatches(vYears, grepYears)

However, I am trying to shorten the strings to:

vYearsShort = c('Democrat 00-04',
           'Democrat 04-08',
           'Democrat 08-12',
           'Republican 00-04',
           'Republican 04-08',
           'Republican 08-12',
           'Tossup')

How can I achieve this?

Upvotes: 1

Views: 79

Answers (2)

devnull

Reputation: 123508

You could use gsub. Match the leading 20 literally, put the last two digits in a capture group, and keep only that group via the backreference \\1:

> vYears = c('Democrat 2000-2004',
+                  'Democrat 2004-2008',
+                  'Democrat 2008-2012',
+                  'Republican 2000-2004',
+                  'Republican 2004-2008',
+                  'Republican 2008-2012',
+                  'Tossup')
> vYearsShort = gsub("20([0-9]{2})", "\\1", vYears)
> vYearsShort
[1] "Democrat 00-04"   "Democrat 04-08"   "Democrat 08-12"   "Republican 00-04"
[5] "Republican 04-08" "Republican 08-12" "Tossup"          
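
Since the question already computes the match positions with gregexpr, the same result can probably also be reached by assigning back through regmatches<-. This is only a sketch, assuming every match is a four-digit year so that characters 3 to 4 are the part to keep; the shortened input vector is just for illustration:

```r
vYears <- c('Democrat 2000-2004', 'Republican 2008-2012', 'Tossup')

# Locate every four-digit year starting with "20", as in the question
grepYears <- gregexpr('20[0-9]{2}', vYears)

# Replace each match with its last two characters; elements with no
# match (e.g. 'Tossup') yield character(0) and are left untouched
vYearsShort <- vYears
regmatches(vYearsShort, grepYears) <-
  lapply(regmatches(vYears, grepYears), substr, 3, 4)

vYearsShort
```

The gsub version is shorter; the regmatches<- form may be handy when the replacement is easier to compute with a function than to express as a backreference.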

Upvotes: 3

sshashank124

Reputation: 32189

You can match the following regex:

^(\w+\s)20(\d{2}-)20(\d{2})$

and replace with:

\1\2\3 or $1$2$3 or \\1\\2\\3

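for each string in your vector. Which replacement form applies depends on the regex engine; inside an R string literal the backreferences are written \\1\\2\\3.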
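
In R this might look like the sketch below. The backslashes in the pattern are doubled because it sits inside an R string, and perl = TRUE is an assumption on my part to match the PCRE-style syntax above (it is not stated in the answer). Because the pattern is anchored with ^ and $, sub is enough, and strings that do not fit the shape are returned unchanged:

```r
vYears <- c('Democrat 2000-2004', 'Republican 2004-2008', 'Tossup')

# Group 1: party name and space; groups 2 and 3: the two-digit year suffixes
pattern <- "^(\\w+\\s)20(\\d{2}-)20(\\d{2})$"

# sub() is vectorised, so no explicit loop over the strings is needed
vYearsShort <- sub(pattern, "\\1\\2\\3", vYears, perl = TRUE)

vYearsShort
```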


Upvotes: 1
