Reputation: 1456
I have a column that was combined when I imported data in R.
The column has data that looks like this: c(-122.430061, 37.785553). How can I split this into just two columns, long and lat?
Data looks like this:
#dput(coords[1:5,])
structure(list(type = c("Point", "Point", "Point", "Point", "Point"
), coordinates = list(c(-122.191986, 37.752671), c(-122.20254,
37.777845), c(-122.250701, 37.827707), c(-122.270252, 37.806838
), c(-122.259369, 37.809819))), .Names = c("type", "coordinates"
), row.names = c(1L, 2L, 3L, 5L, 6L), class = "data.frame")
Upvotes: 0
Views: 3085
Reputation: 73315
Well, after looking at your data, this seems the right way to go:
x <- structure(list(type = c("Point", "Point", "Point", "Point", "Point"
), coordinates = list(c(-122.191986, 37.752671), c(-122.20254,
37.777845), c(-122.250701, 37.827707), c(-122.270252, 37.806838
), c(-122.259369, 37.809819))), .Names = c("type", "coordinates"
), row.names = c(1L, 2L, 3L, 5L, 6L), class = "data.frame")
x$coordinates is not a string column, but a list:
#[[1]]
#[1] -122.19199 37.75267
#
#[[2]]
#[1] -122.20254 37.77784
#
#[[3]]
#[1] -122.25070 37.82771
#
#[[4]]
#[1] -122.27025 37.80684
#
#[[5]]
#[1] -122.25937 37.80982
We can use sapply with "[" to pull out the first and second element of each list entry:
long <- sapply(x$coordinates, "[", 1)
# [1] -122.1920 -122.2025 -122.2507 -122.2703 -122.2594
lat <- sapply(x$coordinates, "[", 2)
# [1] 37.75267 37.77784 37.82771 37.80684 37.80982
But a more efficient way is via the trick used in my original answer below:
xx <- unlist(x$coordinates)
long <- xx[seq(1,length(xx),2)]
# [1] -122.1920 -122.2025 -122.2507 -122.2703 -122.2594
lat <- xx[-seq(1,length(xx),2)]
# [1] 37.75267 37.77784 37.82771 37.80684 37.80982
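If the goal is to put the two vectors back into the data frame as columns, one possible sketch is to stack the list entries into a matrix with do.call(rbind, ...). This assumes every entry of the list column has exactly two elements, longitude first; the three-row data frame here is only a trimmed copy of the sample data.

```r
# Trimmed copy of the sample data: a data frame with a list column
x <- data.frame(type = c("Point", "Point", "Point"))
x$coordinates <- list(c(-122.191986, 37.752671),
                      c(-122.20254, 37.777845),
                      c(-122.250701, 37.827707))

# Stack the length-2 vectors into a 3 x 2 matrix, one row per point
m <- do.call(rbind, x$coordinates)

# Column 1 holds the longitudes, column 2 the latitudes
x$long <- m[, 1]
x$lat  <- m[, 2]

# Drop the list column once it has been split
x$coordinates <- NULL
```

After this, x has the plain numeric columns long and lat next to type.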
Original Answer
I think this is possibly what you are looking for, assuming you have a character column (if it is a factor at the moment, use as.character for coercion first):
## example column
x <- c("12.3, 15.2", "9.2,11.1", "13.7,22.5")
#[1] "12.3, 15.2" "9.2,11.1" "13.7,22.5"
xx <- scan(text = x, what = numeric(), sep = ",")
#[1] 12.3 15.2 9.2 11.1 13.7 22.5
long <- xx[seq(1,length(xx),2)]
#[1] 12.3 9.2 13.7
lat <- xx[-seq(1,length(xx),2)]
#[1] 15.2 11.1 22.5
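As a variation on the same idea, the flat vector from scan can be reshaped row-wise into a two-column matrix instead of being indexed by odd and even positions. This is only a sketch: it assumes every string holds exactly two comma-separated numbers, and it starts from a factor to show the as.character coercion mentioned above.

```r
# Example column stored as a factor
x <- factor(c("12.3, 15.2", "9.2,11.1", "13.7,22.5"))

# Coerce to character first, then read all numbers into one flat vector
xx <- scan(text = as.character(x), what = numeric(), sep = ",", quiet = TRUE)

# Fill a two-column matrix row by row: each row is one (long, lat) pair
m <- matrix(xx, ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("long", "lat")))

out <- as.data.frame(m)
```

Here out$long and out$lat hold the same values as the long and lat vectors above.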
Upvotes: 2
Reputation: 10671
If you don't want to rerun the import, library(tidyr) has a nice function for this, separate():
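# split the text form "c(long, lat)" at the comma into two character columns
datf <- tidyr::separate(datf, coordinates, into = c("long", "lat"), sep = ",")
# strip the leftover "c(" and ")" and convert the text to numbers
datf$long <- as.numeric(gsub("c\\(", "", datf$long))
datf$lat <- as.numeric(gsub("\\)", "", datf$lat))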
The gsub() clean up is a little gross, but it gets the job done. Maybe someone can improve on my separate call.
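For anyone wondering where the "c(" and ")" come from: coercing a list column to character deparses each entry, and that is presumably what happens to the column before it is split. A base R sketch of the same clean-up, with both wrappers stripped in a single gsub() call, might look like this (the two-element list is just a cut-down stand-in for the real column):

```r
# Cut-down stand-in for the list column
coordinates <- list(c(-122.191986, 37.752671), c(-122.20254, 37.777845))

# Coercing a list to character deparses each entry as text
as_text <- as.character(coordinates)

# Remove the leading "c(" and the trailing ")" in one pass
cleaned <- gsub("^c\\(|\\)$", "", as_text)

# Split on the comma (plus any following spaces) and convert to numbers
parts <- strsplit(cleaned, ",\\s*")
long <- as.numeric(vapply(parts, `[`, character(1), 1))
lat  <- as.numeric(vapply(parts, `[`, character(1), 2))
```

The round trip through text is still wasteful compared with working on the list directly, so treat this as an explanation of the clean-up rather than a recommendation.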
Upvotes: 1