DataDog

Reputation: 525

Gsub to identify string before first whitespace, only numbers

Just as the title says,

Here's some data

DF<- data.frame(StreetName=c("PO BOX 850", "555 Happy Lane"))

Here's my code

DF$StreetName <- sub(".*? (.+)", "\\1", DF$StreetName)

And I also tried this

DF$StreetName<- sub("\\d? (.+)", "\\1", DF$StreetName)

But both are mangling my PO BOX addresses.

What I need is

   StreetName
    PO BOX 850
    Happy Lane

Upvotes: 0

Views: 54

Answers (2)

Wiktor Stribiżew

Reputation: 626950

I suggest using

sub("^\\d+\\s*", "", DF$StreetName)

The pattern matches

  • ^ - start of string
  • \\d+ - 1 or more digits
  • \\s* - zero or more whitespace characters.

Note that if you want to match the digits only when at least one whitespace character follows them, replace * with +.


> DF<- data.frame(StreetName=c("PO BOX 850", "555 Happy Lane"))
> sub("^\\d+\\s*", "", DF$StreetName)
[1] "PO BOX 850" "Happy Lane"
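
To make the difference between * and + concrete, here is a small sketch. The third input, "555Happy Lane", is a made-up value that is not in the original data; it is only there to show a case where the two quantifiers disagree.

```r
# Hypothetical inputs: the third element has digits with no whitespace after them.
x <- c("PO BOX 850", "555 Happy Lane", "555Happy Lane")

# \\s* allows zero whitespace, so leading digits are stripped even when
# they are glued directly to the following text.
star <- sub("^\\d+\\s*", "", x)

# \\s+ requires at least one whitespace character after the digits,
# so "555Happy Lane" does not match and is returned unchanged.
plus <- sub("^\\d+\\s+", "", x)

print(star)
print(plus)
```

In both variants the ^ anchor is what protects "PO BOX 850": the digits there are not at the start of the string, so nothing is removed.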

Upvotes: 1

RavinderSingh13

Reputation: 133545

Could you please try the following.

val1 <- c("PO BOX 850", "555 Happy Lane")
val1
sub("^[0-9]+[[:space:]]+","",val1)

Output will be as follows.

[1] "PO BOX 850" "Happy Lane"

This is applied to a vector as an example; you could use it on a data frame's column too.
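
As a rough sketch of the data frame case, using the DF from the question: the stringsAsFactors = FALSE argument is an addition here so the column is a character vector regardless of R version.

```r
# Rebuild the data frame from the question.
# stringsAsFactors = FALSE is an assumption added for this sketch.
DF <- data.frame(StreetName = c("PO BOX 850", "555 Happy Lane"),
                 stringsAsFactors = FALSE)

# Strip leading digits plus the whitespace after them, and write the
# result back to the same column.
DF$StreetName <- sub("^[0-9]+[[:space:]]+", "", DF$StreetName)

print(DF)
```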

Upvotes: 1
