Joseph Noirre

Reputation: 387

stringr str_locate_all not returning the proper index in a dplyr chain

I'm trying to use str_locate_all to find the index of the third occurrence of '/' in a dplyr chain but it's not returning the correct index.

  ga.categoryViews.2016 <- ga.data %>%
    mutate(province = str_sub(pagePath,2,3),
           index = str_locate_all(pagePath, '/')[[1]][,"start"][3],
           category = str_sub(pagePath, 
                              str_locate_all(pagePath, '/')[[1]][,"start"][3] + 1,
                              ifelse(str_detect(pagePath,'\\?'), str_locate(pagePath, '\\?') - 1, str_length(pagePath))
                              )
             )

In the output (originally posted as a screenshot), the first column is pagePath and the fourth is index.

The index is always 12, whatever the path.

Any help is appreciated.

Thanks,

Upvotes: 1

Views: 886

Answers (1)

Sotos

Reputation: 51592

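str_locate_all() is vectorised and returns a list with one matrix of matches per string, so [[1]] inside mutate() always picks the matches for the first row, and that single value is recycled down the whole column. You need to use rowwise() so the expression is evaluated one row at a time, i.e.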

library(dplyr)
library(stringr)

df %>% 
 rowwise() %>% 
 mutate(new = str_locate_all(v1, '/')[[1]][,2][3])

# Source: local data frame [2 x 2]
# Groups: <by row>
#
# A tibble: 2 x 2
#                              v1   new
#                           <chr> <int>
#1 /on/srgsfsfs-gfdgdg/dfgsdfg-df    20
#2        /on/sgsddg-dfgsd/dfg-dg    17
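
To see why the original code returns the same index for every row, it may help to look at what str_locate_all() gives back outside of mutate(). This is a small sketch using the two example paths from this answer; the variable names (paths, locs, first_only, second_only) are only illustrative.

```r
library(stringr)

paths <- c('/on/srgsfsfs-gfdgdg/dfgsdfg-df', '/on/sgsddg-dfgsd/dfg-dg')

# str_locate_all() returns a list: one match matrix per input string
locs <- str_locate_all(paths, '/')
length(locs)

# [[1]] keeps only the first string's matrix, so this is a single number.
# Inside mutate() that one value is recycled to every row.
first_only <- locs[[1]][, "start"][3]

# The third '/' of the second string lives in the second list element
second_only <- locs[[2]][, "start"][3]
```

Here first_only and second_only differ, which is the per-row result that rowwise() recovers.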

DATA

x <- c('/on/srgsfsfs-gfdgdg/dfgsdfg-df', '/on/sgsddg-dfgsd/dfg-dg')
df <- data.frame(v1 = x, stringsAsFactors = FALSE)

df
#                              v1
#1 /on/srgsfsfs-gfdgdg/dfgsdfg-df
#2        /on/sgsddg-dfgsd/dfg-dg

Upvotes: 3
