using regular expressions with R

Question

I have a character vector in R. Some of the strings have a '(number)' pattern appended to them. I'm trying to remove this '(number)' suffix using regular expressions but cannot figure it out. I can find all the elements where the string has whitespace followed by a character, but there must be a way to match only these number strings.

  dat <- c("Alabama-Birmingham", "Arizona State", "Canisius", "UCF", "George Washington", 
             "Green Bay", "Iona", "Louisville (7)", "UMass", "Memphis", "Michigan State", 
             "Milwaukee", "Nebraska", "Niagara", "Northern Kentucky", "Notre Dame (21)", 
             "Quinnipiac", "Siena", "Tulsa", "Washington State", "Wright State", 
             "Xavier")

    rows <- grep(" (.*)", dat)
    fixed <- gsub(" (.*)","",games[rows,])
    dat = fixed

G5W · Accepted Answer

First, you need to escape the parentheses, because unescaped parentheses in a regex form a group instead of matching literal `(` and `)`. It would also be good to be more specific about what is inside them:

gsub("\\s+\\(\\d+\\)", "", dat)
 [1] "Alabama-Birmingham" "Arizona State"      "Canisius"          
 [4] "UCF"                "George Washington"  "Green Bay"         
 [7] "Iona"               "Louisville"         "UMass"             
[10] "Memphis"            "Michigan State"     "Milwaukee"         
[13] "Nebraska"           "Niagara"            "Northern Kentucky" 
[16] "Notre Dame"         "Quinnipiac"         "Siena"             
[19] "Tulsa"              "Washington State"   "Wright State"      
[22] "Xavier"
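
To see why the original pattern over-matched, and how the escaped one could be tightened a little further, here is a minimal sketch. It assumes the '(number)' only ever appears at the end of a string, which is why it adds a `$` anchor and uses `sub()` instead of `gsub()`; the shortened `dat` vector and the name `cleaned` are just for illustration.

```r
# A shortened version of the data from the question (illustrative subset)
dat <- c("Arizona State", "Louisville (7)", "Notre Dame (21)", "UCF")

# Unescaped parentheses form a group, so " (.*)" means
# "a space followed by anything" and also strips ordinary words.
gsub(" (.*)", "", "Arizona State")
# [1] "Arizona"

# Escaped parentheses match literal "(" and ")". The "$" anchor
# (an assumption: the number is always a suffix) restricts the
# match to the end of the string, so one substitution is enough.
cleaned <- sub("\\s*\\(\\d+\\)$", "", dat)
cleaned
# [1] "Arizona State" "Louisville"    "Notre Dame"    "UCF"
```

Note that in an R string literal each regex backslash has to be doubled, so the regex `\(` is written as `"\\("`.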
