Paul

Reputation: 982

Removing prefix from strings in R data frame

I have a data frame, wkt_small, with the following data:

  id    GEOMETRY
  <chr> <chr>
1 PTK01 LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)
...

I need it to look like this:

  id    GEOMETRY
  <chr> <chr>
1 PTK01 ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 ( 2.142 85.892 1.400, 0.991 85.892 1.400)
...

I have tried the following:

wkt_small[, 2] <- gsub('^\\w+', '', wkt_small[, 2])

This, however, gives me the following value for GEOMETRY in every row:

("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)","LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400, 0.991 85.301 1.4)"...

That is, it concatenates the first row's value, still including the string I want removed, into every entry of the column.

Upvotes: 1

Views: 911

Answers (2)

TarJae

Reputation: 78927

Update: We could use str_remove (which is better in this case):

library(dplyr)    # provides %>% and mutate()
library(stringr)
wkt_small %>% 
    mutate(GEOMETRY = str_remove(GEOMETRY, '^\\w+'))

Alternatively, we could use str_replace from the stringr package with the regular expression "^[A-Z]*":

library(dplyr)
library(stringr)
wkt_small %>% 
    mutate(GEOMETRY = str_replace(GEOMETRY, "^[A-Z]*", ""))

Output:

  id    GEOMETRY                                 
  <chr> <chr>                                    
1 PTK01 ( 1.142 85.892 1.400, 0.991 85.892 1.400)
2 PTK02 ( 2.142 85.892 1.400, 0.991 85.892 1.400)
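
The two patterns give the same result on the data in the question, but they are not interchangeable in general. The sketch below is only an illustration: the second input string is made up and does not come from the question.

```r
library(stringr)

# First string is from the question; the second is a hypothetical
# mixed-case prefix with a digit, added only to show the difference.
x <- c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
       "MultiPoint2(1 2, 3 4)")

# '^\\w+' strips a leading run of letters, digits, or underscores
with_word <- str_remove(x, "^\\w+")

# '^[A-Z]*' strips only a leading run of uppercase ASCII letters
with_upper <- str_replace(x, "^[A-Z]*", "")

with_word[1] == with_upper[1]  # TRUE: identical for the question's data
with_word[2]                   # "(1 2, 3 4)"
with_upper[2]                  # "ultiPoint2(1 2, 3 4)"
```

If the prefixes are always uppercase words such as LINESTRING, either pattern should do.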

Upvotes: 1

Konrad Rudolph

Reputation: 545598

Use [[…]] or $… to select a single column, not [, …]:

wkt_small$GEOMETRY <- sub('^\\w+', '', wkt_small$GEOMETRY)

… actually, with a proper data.frame your code would have worked as well; but with a tibble, [ indexing always returns a tibble, not a column vector. The tibble semantics are equivalent to using [, …, drop = FALSE] with a regular data.frame.
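
A minimal sketch of that difference, assuming the tibble package is installed. The objects df and tbl are hypothetical two-row stand-ins for wkt_small, built from the rows shown in the question.

```r
library(tibble)

# Hypothetical stand-in for wkt_small, first as a plain data.frame
df <- data.frame(
  id = c("PTK01", "PTK02"),
  GEOMETRY = c("LINESTRING( 1.142 85.892 1.400, 0.991 85.892 1.400)",
               "LINESTRING( 2.142 85.892 1.400, 0.991 85.892 1.400)"),
  stringsAsFactors = FALSE
)
tbl <- as_tibble(df)  # the same data as a tibble

is.character(df[, 2])                # TRUE:  data.frame drops to a vector
is.character(df[, 2, drop = FALSE])  # FALSE: stays a one-column data.frame
is.character(tbl[, 2])               # FALSE: a tibble never drops with [
is.character(tbl[[2]])               # TRUE:  [[ extracts the column vector
is.character(tbl$GEOMETRY)           # TRUE:  so does $

# Operating on the extracted vector works element-wise, as intended
tbl$GEOMETRY <- sub('^\\w+', '', tbl$GEOMETRY)
tbl$GEOMETRY[1]  # "( 1.142 85.892 1.400, 0.991 85.892 1.400)"
```

Because sub() and gsub() coerce a non-character argument with as.character(), passing them a one-column tibble instead of the column vector is what produces the unexpected string in the question.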

Upvotes: 4
