Reputation: 131
I have a data frame with several columns, one of which holds United States postal ZIP codes.
Row_num  Restaurant  Address              City      State  Zip
26698    m           1460 Memorial Drive  Chicopee  MA     01020-3964
For this entry, I want to keep only the 5-digit zip code 01020 and remove the "-3964" after it, and do the same for every entry in my data frame. Right now the Zip column is treated as chr (character) by R.
I have tried the following gsub code:
df$Zip <- gsub(df$Zip, pattern="-[0,9]{0,4}", replacement = "")
However, all that does is remove the "-" itself, leaving 010203964. That is neither what I want nor what I expected, so any help explaining how gsub behaves and how to get the desired result would be appreciated.
Thank you!
Edit: I have found through trial and error that this line of code also works:
df$Zip <- gsub(df$Zip, pattern="-.*", replacement = "")
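For comparison, here is a minimal sketch of two other ways to get the same result, assuming every value starts with a 5-digit code. The vector zip_values and its second and third entries are made up for illustration; sub() and substr() are swapped in here and are not part of the code above.

```r
# Hypothetical sample values; only the first one comes from the question.
zip_values <- c("01020-3964", "90210", "60614-0099")

# sub() replaces only the first match, which is enough because
# "-.*" consumes everything from the first dash to the end of the string.
with_sub <- sub(pattern = "-.*", replacement = "", x = zip_values)

# substr() ignores the dash entirely and keeps characters 1 through 5.
# This assumes the 5-digit code is always at the start of the value.
with_substr <- substr(zip_values, 1, 5)

print(with_sub)
print(with_substr)
```

Both calls should give the same vector here. The substr() version would silently return wrong values if a zip code had lost its leading zero, so the regex version is the safer default when the data is untidy.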
Upvotes: 0
Views: 1146
Reputation: 263331
The character class you defined, [0,9], has only three elements: 0, 9, and ",". Inside character-class brackets, the dash is the range operator, so try:
df$Zip <- gsub(df$Zip, pattern="-[0-9]{0,4}", replacement = "")
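A small sketch of the difference between the two patterns, run on a plain character vector rather than the data frame. The vector zip_values and its second and third entries are made up for illustration; only the first value comes from the question.

```r
# Hypothetical sample values; only the first one comes from the question.
zip_values <- c("01020-3964", "90210", "60614-0099")

# [0,9] matches only the characters "0", "9", and ",".
# In "01020-3964" the "3" is not in that class, so the match is just "-".
# In "60614-0099" every suffix character is "0" or "9", so it happens to work.
broken <- gsub(pattern = "-[0,9]{0,4}", replacement = "", x = zip_values)

# [0-9] matches any digit, so the dash and up to four digits are removed.
fixed <- gsub(pattern = "-[0-9]{0,4}", replacement = "", x = zip_values)

print(broken)  # "010203964" "90210" "60614"
print(fixed)   # "01020" "90210" "60614"
```

The third value shows why a bug like this can go unnoticed: the original pattern gives the right answer whenever the four-digit suffix contains only 0s and 9s.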
Upvotes: 1