Reputation: 311
I am a beginner with R. I have a column in a data.frame that looks like this:
city
Kirkland,
Bethesda,
Wellington,
La Jolla,
Berkeley,
Costa, Evie KW172NJ
Miami,
Plano,
Sacramento,
Middletown,
Webster,
Houston,
Denver,
Kirkland,
Pinecrest,
Tarzana,
Boulder,
Westfield,
Fair Haven,
Royal Palm Beach, Fl
Westport,
Encino,
Oak Ridge,
I want to clean it. What I want is all the city names before the comma. How can I get the result in R? Thanks!
Upvotes: 18
Views: 41527
Reputation: 1080
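If this is a column in a data frame, we can use the tidyverse. Note that separate() comes from tidyr, not dplyr.
library(dplyr)
library(tidyr)  # provides separate()
x <- c("London, UK", "Paris, France", "New York, USA")
x <- as.data.frame(x)
# split column x at the comma into two new columns, A and B
x %>% separate(x, c("A","B"), sep = ',')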
         A       B
1   London      UK
2    Paris  France
3 New York     USA
Upvotes: 4
Reputation: 109844
This works as well:
x <- c("London, UK", "Paris, France", "New York, USA")
library(qdap)
beg2char(x, ",")
## > beg2char(x, ",")
## [1] "London" "Paris" "New York"
Upvotes: 2
Reputation: 49033
You can use gsub with a bit of regexp:
cities <- gsub("^(.*?),.*", "\\1", df$city)
This one works, too:
cities <- gsub(",.*$", "", df$city)
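As a quick check, here is a minimal sketch that applies the second pattern to a few values taken from the question. The data frame df is rebuilt by hand here, which is only an assumption about how the asker's data is stored, and the column name city_clean is made up for illustration. It uses sub rather than gsub, since the pattern can match at most once per string.

```r
# Hypothetical reconstruction of the asker's data frame (an assumption)
df <- data.frame(
  city = c("Kirkland,", "La Jolla,", "Costa, Evie KW172NJ", "Royal Palm Beach, Fl"),
  stringsAsFactors = FALSE
)

# Drop the first comma and everything after it
df$city_clean <- sub(",.*$", "", df$city)
print(df$city_clean)
```

Values without a comma are left unchanged, because the pattern simply does not match them.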
Upvotes: 26
Reputation: 61154
Just for fun, you can use strsplit
> x <- c("London, UK", "Paris, France", "New York, USA")
> sapply(strsplit(x, ","), "[", 1)
[1] "London" "Paris" "New York"
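A slightly stricter variant of the same idea, sketched with vapply so that the result is guaranteed to be a character vector, and with fixed = TRUE so the comma is treated as a literal string rather than a regular expression. The variable names parts and first are only illustrative.

```r
x <- c("London, UK", "Paris, France", "New York, USA")

# Split each string at commas, treating "," as a literal
parts <- strsplit(x, ",", fixed = TRUE)

# Take the first piece of each split; vapply enforces one string per element
first <- vapply(parts, `[`, character(1), 1)
print(first)
```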
Upvotes: 7
Reputation: 66834
You could use regexpr to find the position of the first comma in each element and use substr to cut each element off just before it:
x <- c("London, UK", "Paris, France", "New York, USA")
substr(x,1,regexpr(",",x)-1)
[1] "London" "Paris" "New York"
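One caveat: regexpr returns -1 for elements that contain no comma, so the substr call above would return an empty string for them. A minimal sketch of one way to guard against that, assuming such elements should be kept whole (the names pos and out are illustrative):

```r
x <- c("London, UK", "Paris", "New York, USA")

# Position of the first comma, or -1 where there is none;
# as.integer() drops the extra attributes regexpr() attaches
pos <- as.integer(regexpr(",", x, fixed = TRUE))

# Cut before the comma when there is one, otherwise keep the element as is
out <- ifelse(pos > 0, substr(x, 1, pos - 1), x)
print(out)
```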
Upvotes: 4