brucezepplin

Reputation: 9782

grep formatted number using R

I have a string format that I would like to select from a character vector. The form is

123 123 1234 

where either of the two spaces can instead be a hyphen, i.e. 3 digits, followed by a space or hyphen, followed by 3 digits, followed by a space or hyphen, followed by 4 digits.

I am trying to do this by the following:

grep("^([0-9]{3}[ -.])([0-9]{3}[ -.])([0-9]{4}$)",mytext)

However, this yields:

integer(0)

What am I doing wrong?

Upvotes: 0

Views: 1036

Answers (1)

Lucas Araujo

Reputation: 1688

Your string has a trailing whitespace character, so the "$" anchor placed right after the four digits cannot match. You can either account for that whitespace in the pattern, like so:

grep("^([0-9]{3}[ -.])([0-9]{3}[ -.])([0-9]{4} $)",mytext)

Or remove the end-of-string anchor "$" (the pattern will then also accept any text after the four digits), like so:

grep("^([0-9]{3}[ -.])([0-9]{3}[ -.])([0-9]{4})",mytext)
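To see how the two options differ, here is a minimal sketch. The asker's real data is not shown, so `mytext` below is a hypothetical sample in which every element ends in a space. The `" *$"` and `trimws()` variants are extra suggestions that keep the end anchor, not part of the answer above, and the separator class is written `[ -]` (space or hyphen, as the question describes).

```r
# Hypothetical sample data (assumption): the asker's actual mytext is not shown.
mytext <- c("123 123 1234 ", "123-123-1234 ", "12 123 1234 ", "123 123 12345 ")

# Strict anchor right after the four digits: the trailing space blocks every match.
strict <- grep("^[0-9]{3}[ -][0-9]{3}[ -][0-9]{4}$", mytext)

# No end anchor: the trailing space is ignored, but so is the extra digit in element 4.
unanchored <- grep("^[0-9]{3}[ -][0-9]{3}[ -][0-9]{4}", mytext)

# Optional trailing spaces before the end of the string.
loose <- grep("^[0-9]{3}[ -][0-9]{3}[ -][0-9]{4} *$", mytext)

# Alternatively, strip surrounding whitespace first and keep the strict pattern.
trimmed <- grep("^[0-9]{3}[ -][0-9]{3}[ -][0-9]{4}$", trimws(mytext))

print(list(strict = strict, unanchored = unanchored, loose = loose, trimmed = trimmed))
```

On this sample, dropping "$" also selects "123 123 12345 ", while the last two variants select only the first two elements.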

Also, as pointed out by Wiktor Stribiżew, the character class [ -.] matches any character in the range from " " to ".", not just those three characters. To match only "-", "." and " ", you have to escape the "-" or put it at the end of the class, like [ \-.] or [ .-]. Putting the hyphen last is the simpler of the two, since it needs no escaping.
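The difference between the two classes can be checked with a few single-character probes. This is only a sketch: the `probes` vector is made up for illustration, and per the explanation above the range form should also accept characters such as "+" that fall between " " and ".".

```r
# Hypothetical single-character probes; "+" and "a" are not wanted as separators.
probes <- c(" ", "-", ".", "+", "a")

# Range form from the question: a range running from " " through ".".
range_hits <- grepl("^[ -.]$", probes)

# Hyphen placed last, so it is a literal: only " ", "." and "-" are accepted.
literal_hits <- grepl("^[ .-]$", probes)

print(data.frame(probes, range_hits, literal_hits))
```

Comparing the two columns shows which extra characters the range form lets through.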

Upvotes: 2
