Reputation: 85
I have a column of strings that's like this:
| Image |
| --- |
| CR 00_01_01 |
| SF 45_04_07 |
| etc. |
I want to get an end result of this:
| Condition | Time |
| --- | --- |
| CR | 00 |
I can do this in two steps, but it is cumbersome. Essentially, I split the string twice: first on the space, then on the underscore.
df <- df[, c("Condition", "T") := tstrsplit(Image, " ", fixed = TRUE)]
df <- df[, c("Time") := tstrsplit(T, "_", fixed = TRUE, keep = 1L)]
Is there a better way of doing this?
Upvotes: 0
Views: 952
Reputation: 5138
Here is a strsplit solution that sounds like what you are looking for. Split on a space or an underscore and select the first two elements.
split_string <- strsplit(df1$Image, split = "\\s|_")
data.frame(Condition = sapply(split_string, `[`, 1),
Time = sapply(split_string, `[`, 2))
Condition Time
1 CR 00
2 SF 45
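Since the question already uses data.table, the same idea (split on a space or an underscore, keep the first two pieces) can probably be done in one step there. This is a minimal sketch, assuming the data.table package is installed; the name dt is just an illustrative stand-in for the asker's table.

```r
library(data.table)

# Same sample data as in this answer, stored as a data.table
# (dt is a hypothetical name used only for this sketch)
dt <- data.table(Image = c("CR 00_01_01", "SF 45_04_07"))

# Split on a space or an underscore and keep only the first two pieces;
# the pattern is passed through to strsplit() as a regular expression
dt[, c("Condition", "Time") := tstrsplit(Image, "[ _]", keep = 1:2)]

dt[, .(Condition, Time)]
```

Note that fixed = TRUE is dropped here on purpose, because the separator is now a regular expression rather than a literal string.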
If the format of the Image column is always the same, you could extract based on position.
data.frame(Condition = substr(df1$Image, 1, 2),
Time = substr(df1$Image, 4, 5))
Condition Time
1 CR 00
2 SF 45
Or you could just use regex to extract the letters / first pair of numbers.
data.frame(Condition = gsub("^([[:alpha:]]+).*", "\\1", df1$Image),
Time = gsub(".*[[:space:]]([[:digit:]]+)_.*", "\\1", df1$Image))
Condition Time
1 CR 00
2 SF 45
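If you would rather use a single pattern for both columns, base R's strcapture() (in utils, R 3.4.0 or later) may be worth a look. This is only a sketch under the assumption that every value looks like letters, a space, then digits followed by an underscore; the name parts is made up for the example.

```r
df1 <- data.frame(Image = c("CR 00_01_01", "SF 45_04_07"),
                  stringsAsFactors = FALSE)

# One capture group per output column; proto supplies the column
# names and types of the result (parts is a hypothetical name)
parts <- strcapture(
  "^([[:alpha:]]+)[[:space:]]([[:digit:]]+)_",
  df1$Image,
  proto = data.frame(Condition = character(), Time = character())
)

parts
```

Rows that do not match the pattern come back as NA, which makes unexpected formats easier to spot than with gsub, where a non-matching string is returned unchanged.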
Data:
df1 <- data.frame(Image = c("CR 00_01_01", "SF 45_04_07"), stringsAsFactors = FALSE)
Upvotes: 1
Reputation: 2467
You can try this using dplyr and tidyr:
library(dplyr)
library(tidyr)
df %>% separate(image, c("Image", "Time"), sep = " ") %>%
  mutate(Time = sub("([0-9]+).*", "\\1", Time))
Image Time
1 CR 00
2 SF 45
Data:
df <- structure(list(image = c("CR 00_01_01", "SF 45_04_07")), class = "data.frame", row.names = c(NA,
-2L))
Upvotes: 1