Reputation: 779
I'm trying to do some data cleaning on a file. The particular field I'm trying to clean describes what file each record originally came from, so there is always ".csv" at the end of the value in the field. I would like to remove this part of the value but keep the rest.
Here is an example of the field:
File Name
bagel.csv
donut.csv
hamburger.csv
carrots.csv
I would like the field to look something like this:
File Name
bagel
donut
hamburger
carrots
Is there a way to do this in R? Any assistance would be extremely appreciated.
Upvotes: 0
Views: 108
Reputation: 887038
We can use file_path_sans_ext from the tools package:
field <- c("aa.csv", "bb.csv", "cc.csv")
tools::file_path_sans_ext(field)
#[1] "aa" "bb" "cc"
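Applied to a data frame column like the one in the question, this might look like the sketch below. The data frame `df` and the column name `FileName` are assumptions made for illustration; the question does not show the actual object.

```r
# Hypothetical data frame mirroring the example in the question
df <- data.frame(
  FileName = c("bagel.csv", "donut.csv", "hamburger.csv", "carrots.csv"),
  stringsAsFactors = FALSE
)

# Strip the extension from every value in the column
df$FileName <- tools::file_path_sans_ext(df$FileName)

print(df)
```

The tools package ships with R, so nothing extra needs to be installed.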
Upvotes: 0
Reputation: 6485
It's always better to provide a minimal reproducible example:
field <- c("aa.csv", "bb.csv", "cc.csv")
gsub("\\.csv$", "", field)
Returns:
[1] "aa" "bb" "cc"
Explanation:
We can use a regex to substitute the sequence "." (\\.) followed by "csv" (csv) followed by end-of-string ($) with an empty string ("").
By following the suggestion from @G5W, we make sure that, since we only want to remove the extension, we don't accidentally replace the string if it appears in the middle of a value (for example, in "function.csv.txt" we wouldn't want to remove the ".csv" part).
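To see what the $ anchor changes, here is a small sketch comparing the anchored pattern with an unanchored one. The input vector is made up for illustration.

```r
# Made-up inputs: one plain name, one with ".csv" in the middle
field <- c("bagel.csv", "function.csv.txt")

# Anchored: ".csv" is removed only when it ends the string
anchored <- sub("\\.csv$", "", field)
# "bagel" "function.csv.txt"

# Unanchored: the first ".csv" is removed wherever it occurs
unanchored <- sub("\\.csv", "", field)
# "bagel" "function.txt"
```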
Upvotes: 5
Reputation: 2323
You can also use dplyr:
library(dplyr)
df <- data.frame(FileName = c('bagel.csv', 'donut.csv', 'hamburger.csv', 'carrots.csv'))
# "\\..*" matches from the first dot to the end of the string
df <- df %>% mutate(FileName = gsub("\\..*", "", FileName))
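One thing worth checking before using the "\\..*" pattern: it removes everything from the first dot onward, not just a trailing ".csv". The sketch below uses base R only, with a made-up file name containing an extra dot, to show how the two patterns can differ.

```r
# Made-up inputs: the second name has a dot before the extension
x <- c("bagel.csv", "report.v2.csv")

# Pattern from the answer above: cuts at the first dot
first_dot <- gsub("\\..*", "", x)
# "bagel" "report"

# Anchored pattern: removes only a trailing ".csv"
ext_only <- gsub("\\.csv$", "", x)
# "bagel" "report.v2"
```

If the file names never contain additional dots, both patterns give the same result.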
Upvotes: 1