Reputation: 39
I have a huge data frame like this
scan_id sample
1 s10w_00001 1
2 s10w_00002 2
3 s10w_00003 3
4 s10w_00004 4
5 s11d_00001 5
6 s11d_00002 6
7 s11d_00003 7
8 s11d_00004 8
9 s11w_00001 9
10 s11w_00002 10
11 s11w_00003 11
I want to add another column called size. However, that column should correspond with the scan_id column, where all the scan_id with 00001 at the end should be big, 00002=medium, 00003=small 00004=extra small.
the data frame should look something like this
scan_id sample sixe
1 s10w_00001 1 big
2 s10w_00002 2 medium
3 s10w_00003 3 small
4 s10w_00004 4 extra small
5 s11d_00001 5 big
6 s11d_00002 6 medium
7 s11d_00003 7 small
8 s11d_00004 8 extra small
9 s11w_00001 9 big
10 s11w_00002 10 medium
11 s11w_00003 11 small
how do i go about doing that?
Upvotes: 2
Views: 249
Reputation: 1906
Does this work for you:
library(tidyverse)
df %>%
separate(col = scan_id, into = c("scan", "id"), sep = "_") %>%
mutate(size = case_when(id == "00001" ~ "big",
id == "00002" ~ "medium",
TRUE ~ "small")) %>%
unite(col = "scan_id", c("scan", "id"), sep = "_")
Upvotes: 2
Reputation: 12420
Here is a solution. Note that when scan_id is neither 00001 nor 00002, size will be "small":
library(dplyr)
df_clean <- df %>%
mutate(size = ifelse(grepl("00001", scan_id), "big", "small")) %>%
mutate(size = ifelse(grepl("00002", scan_id), "medium", size)) %>%
mutate(size = ifelse(grepl("00003", scan_id), "small", size))
> df_clean
scan_id sample size
1 s10w_00001 1 big
2 s10w_00002 2 medium
3 s10w_00003 3 small
4 s10w_00004 4 extra small
5 s11d_00001 5 big
6 s11d_00002 6 medium
7 s11d_00003 7 small
8 s11d_00004 8 extra small
9 s11w_00001 9 big
10 s11w_00002 10 medium
11 s11w_00003 11 small
You should usually provide data with dput
which transforms your data.frame into text that is easy to read in. Here is the data I used:
df <- read.table(text =
"scan_id sample
1 s10w_00001 1
2 s10w_00002 2
3 s10w_00003 3
4 s10w_00004 4
5 s11d_00001 5
6 s11d_00002 6
7 s11d_00003 7
8 s11d_00004 8
9 s11w_00001 9
10 s11w_00002 10
11 s11w_00003 11", header = TRUE)
Upvotes: 0