Stefan

Reputation: 25

R - Extract coordinates from String in dataframe

I have data like this in an R data frame, all stored in a single column called SHAPE (below is just an excerpt):

I would like to extract the coordinates into two new columns of my data frame, "X" and "Y", as numeric values. The challenge is that the numbers do not always have the same length.

The result should look like this:

Column X:

Column Y:

Upvotes: 1

Views: 692

Answers (2)

Jan

Reputation: 43169

Just to provide another solution, this time using strsplit() and lapply():

df <- data.frame(SHAPE = c("POINT (16.361866982751053 48.177421074512125)",
                           "POINT (16.30410258091979 48.16069903617549)",
                           "POINT (16.226971074542572 48.20539106235006)",
                           "POINT (16.36781410799229 48.25479849185693)"),
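                 stringsAsFactors = FALSE)

# Split each string on the parentheses; element 2 is the text between them
df[c("x", "y")] <- do.call(rbind, lapply(strsplit(df$SHAPE, "[()]"), function(col) {
  # Split "x y" on the space into the two coordinate strings
  unlist(strsplit(col[2], " "))
}))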
df

This yields

                                          SHAPE                  x                  y
1 POINT (16.361866982751053 48.177421074512125) 16.361866982751053 48.177421074512125
2   POINT (16.30410258091979 48.16069903617549)  16.30410258091979  48.16069903617549
3  POINT (16.226971074542572 48.20539106235006) 16.226971074542572  48.20539106235006
4   POINT (16.36781410799229 48.25479849185693)  16.36781410799229  48.25479849185693
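Note that strsplit() returns character vectors, so the x and y columns produced this way hold strings rather than numbers. Since the question asks for numeric columns, here is a minimal sketch of the extra conversion step with as.numeric(). The intermediate name `parts` is only illustrative, and the example uses two of the rows from above:

```r
# Rebuild a small example from two of the rows shown above
df <- data.frame(SHAPE = c("POINT (16.361866982751053 48.177421074512125)",
                           "POINT (16.30410258091979 48.16069903617549)"),
                 stringsAsFactors = FALSE)

# Split on the parentheses, then split the inner text on the space;
# rbind() stacks the results into a two-column character matrix
parts <- do.call(rbind, lapply(strsplit(df$SHAPE, "[()]"), function(col) {
  unlist(strsplit(col[2], " "))
}))

# Convert each column of the character matrix to numeric
df$x <- as.numeric(parts[, 1])
df$y <- as.numeric(parts[, 2])

str(df)
```

R prints only 7 significant digits by default, so the numeric columns will look shorter than the original strings. The values are still stored as doubles; something like print(df$x, digits = 15) shows more of them.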

Upvotes: 2

Tim Biegeleisen

Reputation: 520918

Use sub:

point <- "POINT (16.361866982751053 48.177421074512125)"
x <- sub("POINT \\((\\d+\\.\\d+) \\d+\\.\\d+\\)", "\\1", point, perl=TRUE)
y <- sub("POINT \\(\\d+\\.\\d+ (\\d+\\.\\d+)\\)", "\\1", point, perl=TRUE)

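Because sub() is vectorized, the same idea can be applied to the whole SHAPE column at once, with as.numeric() added to get numbers instead of strings. The following is a sketch rather than part of the original answer: it assumes a data frame `df` like the one in the question, and it uses a single pattern with two capture groups (the name `pattern` is illustrative):

```r
# Small example data frame with the SHAPE column described in the question
df <- data.frame(SHAPE = c("POINT (16.361866982751053 48.177421074512125)",
                           "POINT (16.30410258091979 48.16069903617549)"),
                 stringsAsFactors = FALSE)

# One pattern with two capture groups: \\1 is the first number, \\2 the second
pattern <- "POINT \\((\\d+\\.\\d+) (\\d+\\.\\d+)\\)"

# sub() works on the whole column; as.numeric() converts the extracted text
df$X <- as.numeric(sub(pattern, "\\1", df$SHAPE, perl = TRUE))
df$Y <- as.numeric(sub(pattern, "\\2", df$SHAPE, perl = TRUE))
```

This pattern only matches unsigned decimals with a fractional part. For a value such as a negative coordinate or a plain integer, sub() finds no match and returns the input unchanged, so as.numeric() would give NA with a warning; the pattern would need to be widened for such data.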

Upvotes: 1
