Manoel Galdino

Reputation: 2396

How to replace space with "_" after last slash in a string with R

I have a list of strings, and for each string I need to replace every space after the last slash with an "_". Here's a minimal reproducible example.

my_list <- list("abc/as 345/as df.pdf", "adf3344/aer4 ffsd.doc", "abc/3455/dfr.xls", "abc/3455/dfr serf_dff.xls", "abc/34 5 5/dfr 345 dsdf 334.pdf")

After doing the replacement, the result should be:

list("abc/as 345/as_df.pdf", "adf3344/aer4_ffsd.doc", "abc/3455/dfr.xls", "abc/3455/dfr_serf_dff.xls", "abc/34 5 5/dfr_345_dsdf_334.pdf")

I thought of matching the text after the last slash with a regex and then replacing " " with "_", but I couldn't find a way to implement it. It would be something like gsub(pattern, "_", my_list), where pattern is a regex that says: match every space after the last slash (every element of the list contains at least one slash).

Upvotes: 2

Views: 531

Answers (4)

moodymudskipper

Reputation: 47340

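You can use dirname, basename and file.path: split each path into its directory and file name, replace the spaces in the file name only, then join the two parts again.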

as.list(file.path(
  dirname(unlist(my_list)),
  gsub(" ", "_", basename(unlist(my_list)))
  ))
# [[1]]
# [1] "abc/as 345/as_df.pdf"
# 
# [[2]]
# [1] "adf3344/aer4_ffsd.doc"
# 
# [[3]]
# [1] "abc/3455/dfr.xls"
# 
# [[4]]
# [1] "abc/3455/dfr_serf_dff.xls"
# 
# [[5]]
# [1] "abc/34 5 5/dfr_345_dsdf_334.pdf"

Or, a bit more compact, assign the unlisted vector to . inside the first argument so that unlist runs only once:

as.list(file.path(
  dirname(. <- unlist(my_list)),
  gsub(" ", "_", basename(.))
))
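
One caveat, as a small sketch rather than part of the answer: the question guarantees at least one slash per string, but if that ever fails, dirname returns "." and the result gains a "./" prefix. The input vector x below is hypothetical.

```r
# Hypothetical input: the second string contains no slash at all.
x <- c("abc/as 345/as df.pdf", "no slash.pdf")

# Same technique as above: only the basename has its spaces replaced.
res <- file.path(dirname(x), gsub(" ", "_", basename(x)))
res
# [1] "abc/as 345/as_df.pdf" "./no_slash.pdf"
```

If such inputs can occur, the leading "./" would need to be stripped or those elements handled separately.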

Upvotes: 2

bschneidr

Reputation: 6277

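Here's a solution that uses the gsubfn package. The regex (/[^/]+)$ matches the last slash together with everything after it, and that match is passed to a function that converts spaces to underscores. Note that the function uses [[:space:]]+, so a run of several whitespace characters becomes a single underscore.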

library(gsubfn)

change_space_to_underscore <- function(x) gsub(x = x, pattern = "[[:space:]]+", replacement = "_")

gsubfn(x = my_list,
       pattern = "(/[^/]+)$",
       replacement = change_space_to_underscore)
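
For comparison, the same "edit only the part after the last slash" idea can probably be done without an extra package, using base R's regexpr and the replacement form of regmatches. This is a sketch, not part of the answer above; it uses a shortened copy of the question's data, and the pattern [^/]+$ (without the leading slash) is my own simplification.

```r
# Two of the strings from the question, as a plain character vector.
x <- c("abc/as 345/as df.pdf", "abc/34 5 5/dfr 345 dsdf 334.pdf")

# Locate the run of non-slash characters at the end of each string.
m <- regexpr("[^/]+$", x)

# Replace whitespace inside the matched part only, then write it back.
regmatches(x, m) <- gsub("[[:space:]]+", "_", regmatches(x, m))
x
# [1] "abc/as 345/as_df.pdf"            "abc/34 5 5/dfr_345_dsdf_334.pdf"
```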

Upvotes: 1

Julius Vainora

Reputation: 48241

You may use negative lookahead:

gsub(" (?!.*/.*)", "_", unlist(my_list), perl = TRUE)
# [1] "abc/as 345/as_df.pdf"            "adf3344/aer4_ffsd.doc"          
# [3] "abc/3455/dfr.xls"                "abc/3455/dfr_serf_dff.xls"      
# [5] "abc/34 5 5/dfr_345_dsdf_334.pdf"

This matches and replaces every space that has no slash anywhere after it, which is exactly the set of spaces following the last slash.
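
The same condition can also be written as a positive lookahead: a space followed only by non-slash characters up to the end of the string. The sketch below compares the two patterns on a shortened copy of the question's data; the second pattern is my own rewording, not the answerer's.

```r
x <- c("abc/as 345/as df.pdf", "abc/3455/dfr.xls",
       "abc/34 5 5/dfr 345 dsdf 334.pdf")

# Negative lookahead from the answer: no slash anywhere ahead of the space.
neg <- gsub(" (?!.*/.*)", "_", x, perl = TRUE)

# Positive lookahead: only non-slash characters remain until the end.
pos <- gsub(" (?=[^/]*$)", "_", x, perl = TRUE)

identical(neg, pos)
# [1] TRUE
```

Both need perl = TRUE, because lookarounds are not supported by R's default regex engine.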

Upvotes: 5

r2evans

Reputation: 160637

Here's a thought. First, split by slash:

l2 <- strsplit(unlist(my_list), "/")
l2
# [[1]]
# [1] "abc"       "as 345"    "as df.pdf"
# [[2]]
# [1] "adf3344"       "aer4 ffsd.doc"
# [[3]]
# [1] "abc"     "3455"    "dfr.xls"
# [[4]]
# [1] "abc"              "3455"             "dfr serf_dff.xls"
# [[5]]
# [1] "abc"                  "34 5 5"               "dfr 345 dsdf 334.pdf"

Now we do a gsub on just the last element of each split string (its index is the vector's length, supplied by lengths(l2)) and recombine the pieces with slashes:

mapply(function(a,i) paste(c(a[-i], gsub(" ", "_", a[i])), collapse="/"),
       l2, lengths(l2), SIMPLIFY=FALSE)
# [[1]]
# [1] "abc/as 345/as_df.pdf"
# [[2]]
# [1] "adf3344/aer4_ffsd.doc"
# [[3]]
# [1] "abc/3455/dfr.xls"
# [[4]]
# [1] "abc/3455/dfr_serf_dff.xls"
# [[5]]
# [1] "abc/34 5 5/dfr_345_dsdf_334.pdf"

Upvotes: 1
