Reputation: 15708
I have a file name with directory path returned from list.files(..., full.names = T). I want to split the file name up by / to find the directory structure. I am having trouble only identifying single occurrences of /, e.g.
strsplit("C://dir1/dir2/txt.R", "/")
# [[1]]
# [1] "C:" "" "dir1" "dir2" "txt.R"
when I desire the output to be:
[1] "C://" "dir1" "dir2" "txt.R"
I was looking at this answer that seems to give a regex answer, however, I get an error when I try to get a 'literal' match:
> strsplit("C://dir1/dir2/txt.R", "\/")
Error: '\/' is an unrecognized escape in character string starting ""\/"
In fact, the regex in that example does not work in R:
> grepl('([\w\/]+)\/amp(\/\w+[-\/]\w+\/?)', '/name/amp/test-123')
Error: '\w' is an unrecognized escape in character string starting "'([\w"
Upvotes: 0
Views: 114
Reputation: 626738
A very simple matching approach would be
x <- "C://dir1/dir2/txt.R"
regmatches(x, gregexpr("[^/]+(?://)?", x))
# or with stringr
library(stringr)
str_extract_all(x, "[^/]+(?://)?")
# [[1]]
# [1] "C://" "dir1" "dir2" "txt.R"
See the regex demo and the R online demo.
Pattern details
[^/]+ - 1 or more chars other than /
(?://)? - an optional sequence of two /
Note that in case you want to ignore // inside the path and only grab them in the beginning, you may add an alternative like ^[[:alpha:]]:// or a lookbehind (?<=^[[:alpha:]]:) to the optional group:
regmatches(x, gregexpr("[^/]+(?:(?<=^[[:alpha:]]:)//)?", x, perl=TRUE))
# or
regmatches(x, gregexpr("^[[:alpha:]]://|[^/]+", x))
See this and that regex demo.
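As a minimal sketch, the alternation pattern above can be wrapped in a small helper and applied to several paths at once. The name split_path is hypothetical (it is not a base R function), and the sketch assumes forward-slash separators and at most a one-letter drive prefix at the start of the path.

```r
# Hypothetical helper: split paths into components, keeping a leading
# drive prefix such as "C://" as a single piece.
split_path <- function(path) {
  # Either a one-letter drive prefix followed by "//" at the very start,
  # or any run of characters other than "/".
  pattern <- "^[[:alpha:]]://|[^/]+"
  regmatches(path, gregexpr(pattern, path))
}

parts <- split_path(c("C://dir1/dir2/txt.R", "dir1//dir2/txt.R"))

parts[[1]]  # expected: c("C://", "dir1", "dir2", "txt.R")
parts[[2]]  # expected: c("dir1", "dir2", "txt.R"), the inner "//" is dropped
```

Because gregexpr is vectorised over its text argument, the helper returns a list with one character vector per input path.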
Upvotes: 2
Reputation: 174696
KISS,
strsplit("C://dir1/dir2/txt.R", "\\b/\\b|(?<=//)", perl = TRUE)[[1]]
# [1] "C://" "dir1" "dir2" "txt.R"
Upvotes: 2
Reputation: 10360
Try this code:
strsplit("C://dir1/dir2/txt.R", "(?<=//)|(?<!/)/(?!/)", perl=TRUE)
Explanation:
(?<=//) - finds the position immediately preceded by a //
| - OR
(?<!/)/(?!/) - matches a / which is neither preceded by a / nor followed by a /
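To see what each alternative contributes, here is a small sketch that runs the split once with only the single-slash branch and once with both branches. The variable names are illustrative only, and perl = TRUE is assumed throughout because lookarounds are PCRE syntax.

```r
path <- "C://dir1/dir2/txt.R"

# Only the single-slash branch: the "//" is never split on, so the
# drive prefix stays fused with the first directory name.
single_only <- strsplit(path, "(?<!/)/(?!/)", perl = TRUE)[[1]]
single_only  # expected: c("C://dir1", "dir2", "txt.R")

# Adding the zero-width branch also cuts at the position right after "//".
both <- strsplit(path, "(?<=//)|(?<!/)/(?!/)", perl = TRUE)[[1]]
both         # expected: c("C://", "dir1", "dir2", "txt.R")
```

The lookbehind branch consumes no characters, which is why the "//" stays attached to "C:" instead of being removed like the single slashes.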
Upvotes: 3
Reputation: 887028
One option would be to match two or more consecutive occurrences of / and SKIP them, while splitting on a single / or on the word boundary that follows a /
strsplit("C://dir1/dir2/txt.R", "[/]{2,}(*SKIP)(*F)|\\b[/]|(?<=[/])\\b", perl = TRUE)[[1]]
#[1] "C://" "dir1" "dir2" "txt.R"
Upvotes: 2