Reputation: 903
I have a data.frame example with a variable (care_group) as follows:
> example
care_group
1 1st Choice Care Homes 8.8
2 2Care
3 229 Mitcham Lane Ltd
4 3 L Care Ltd
5 3AB Care Ltd
6 9Grace Road Ltd
7 A&R Care Ltd 9.7
8 ABLE (Action for a Better Life)
9 A C L Care Homes Ltd
10 A D L plc
11 A D R Care Homes Ltd
12 A G E Nursing Homes Ltd 8
As you may notice, some of my observations are alphanumeric and contain numbers at the beginning and/or the end of the name. I know that it is possible to get rid of numeric characters (see for instance here). Yet I do not know how to remove only some of them: specifically, I want to remove the numbers at the end of the name and keep those at the beginning. I tried to do so by creating a vector of the numbers that I want to remove and passing it to gsub.
ratings = c("8", "8.8", "9.7")
example$new_var = with(example, gsub(ratings, " ", care_group))
However I get this warning message:
Warning message:
In gsub(ratings, " ", care_group) :
argument 'pattern' has length > 1 and only the first element will be used
I wonder whether it is possible to use gsub with a pattern of length > 1, or whether someone could propose a more efficient way to tackle this. Many thanks in advance.
Upvotes: 0
Views: 2339
Reputation: 38500
Better to use an anchor and character class:
# sample of vector with various possibilities
temp <- c(" 7 A&R Care Ltd 9.7", "A C L Care Homes Ltd", "12 A G E Nursing Homes Ltd 8")
gsub(" [0-9.]+$", "", temp)
[1] " 7 A&R Care Ltd" "A C L Care Homes Ltd" "12 A G E Nursing Homes Ltd"
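To tie this back to the data frame in the question, here is a minimal sketch. It rebuilds a few rows of example by hand, assuming the leading 1, 2, 3, ... in the printed output are row names rather than part of care_group. It uses sub instead of gsub, since an end-anchored pattern can match at most once per string. The second part is one possible way to keep the vector of ratings from the question: collapse it into a single alternation, with each period wrapped in a character class so it matches literally. The column name new_var2 is made up for illustration.

```r
# A few rows rebuilt by hand from the question (row names assumed not to be data)
example <- data.frame(
  care_group = c("1st Choice Care Homes 8.8", "2Care",
                 "229 Mitcham Lane Ltd", "A&R Care Ltd 9.7",
                 "A G E Nursing Homes Ltd 8"),
  stringsAsFactors = FALSE
)

# Remove a trailing rating: a space followed by digits/periods at the end
example$new_var <- sub(" [0-9.]+$", "", example$care_group)

# Alternative: build one pattern from the vector of ratings.
# "." is replaced by "[.]" so that it matches a literal period only.
ratings <- c("8", "8.8", "9.7")
literal <- gsub(".", "[.]", ratings, fixed = TRUE)
pattern <- paste0(" (", paste(literal, collapse = "|"), ")$")
example$new_var2 <- sub(pattern, "", example$care_group)

print(example)
```

The first approach removes any trailing number, while the second removes only the ratings listed explicitly, which may be preferable if some names legitimately end in a number.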
In the regular expression, $ anchors the expression to the end of the text, and the character class [0-9.]+ matches one or more digits or periods.
Upvotes: 1