find alphanumeric elements in vector

Question

I have a vector

    myVec <- c('1.2','asd','gkd','232','4343','1.3zyz','fva','3213','1232','dasd')

In this vector, I want to do two things:

1. Remove any numbers from an element that contains both numbers and letters.
2. If a group of letters is followed by another group of letters, merge them into one.

So the above vector will look like this:

    '1.2','asdgkd','232','4343','zyzfva','3213','1232','dasd'

My plan was to first find the alphanumeric elements and remove the numbers from them using gsub. I tried this:

    gsub('[0-9]+', '', myVec[grepl("[A-Za-z]+$", myVec, perl = T)])

    "asd"  "gkd"  ".zyz" "fva"  "dasd"

That is, it retains the '.' in '.zyz', which I don't want.

MrFlick · Accepted Answer

This seems to return what you are after

    myVec <- c('1.2','asd','gkd','232','4343','1.3zyz','fva','3213','1232','dasd')

    clean <- function(x) {
      is_char <- grepl("[[:alpha:]]", x)
      has_number <- grepl("\\d", x)
      mixed <- is_char & has_number
      # strip digits and dots from elements that mix letters and numbers
      x[mixed] <- gsub("[\\d\\.]+", "", x[mixed], perl = TRUE)
      # start a new group at every non-letter element and at the first element of each letter run
      grp <- cumsum(!is_char | (is_char & !c(FALSE, head(is_char, -1))))
      unname(tapply(x, grp, paste, collapse = ""))
    }

    clean(myVec)
    # [1] "1.2"    "asdgkd" "232"    "4343"   "zyzfva" "3213"   "1232"   "dasd"

Here we look for elements that mix numbers and letters and remove the numbers (and dots) from them. Then we define groups for collapsing: an element containing letters that comes directly after another element containing letters is put in the same group, while every other element starts a new group. Finally, we collapse all the values in each group into one string.
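To make the grouping step easier to follow, here is a minimal sketch that spells out the intermediate values. It assumes the numbers have already been stripped from the mixed element (so '1.3zyz' is already 'zyz'), and the names `prev_is_char`, `starts_group`, and `merged` are only illustrative; they do not appear in the function above. The condition is written in a simplified but logically equivalent form, and `split` with `vapply` is used in place of `tapply` purely for illustration.

```r
# vector after the mixed element has been cleaned (assumed input for this sketch)
x <- c('1.2','asd','gkd','232','4343','zyz','fva','3213','1232','dasd')

is_char <- grepl("[[:alpha:]]", x)

# was the previous element a letter element? (FALSE for the first one)
prev_is_char <- c(FALSE, head(is_char, -1))

# a new group starts unless a letter element directly follows a letter element;
# this is equivalent to !is_char | (is_char & !prev_is_char)
starts_group <- !is_char | !prev_is_char

# running count of group starts gives each element its group id
grp <- cumsum(starts_group)
grp
# [1] 1 2 2 3 4 5 5 6 7 8

# paste together the elements that share a group id
merged <- unname(vapply(split(x, grp), paste, character(1), collapse = ""))
merged
# [1] "1.2"    "asdgkd" "232"    "4343"   "zyzfva" "3213"   "1232"   "dasd"
```

Elements 2 and 3 ('asd', 'gkd') share group 2 and elements 6 and 7 ('zyz', 'fva') share group 5, which is why they end up merged.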
