Reputation: 51
Hi this is an example of a similar dataframe I am working with. I have an experiment with 10 samples and two replicates
df <- data.frame("ID" = c(1,2,3,4,5,6,7,8,9,10),
"Rep1" = c(6,5,3,"Na","Na",9,4,"Na","Na",2),
"Rep2" = c(8,4,4,"Na",3,"Na",6,"Na",2,1))
I have different Na values, however, I only want to replace them with zeros in the samples 4 and 8 due to they are the only ones which have NA in both replicates. Then, other samples would maintain the "NA".
Upvotes: 3
Views: 76
Reputation: 26238
Though it has been marked as solved, yet I propose a simple answer
df <- data.frame("ID" = c(1,2,3,4,5,6,7,8,9,10),
"Rep1" = c(6,5,3,"Na","Na",9,4,"Na","Na",2),
"Rep2" = c(8,4,4,"Na",3,"Na",6,"Na",2,1))
library(dplyr)
df %>% group_by(ID) %>%
mutate(replace(cur_data(), all(cur_data() == 'Na'), '0'))
#> # A tibble: 10 x 3
#> # Groups: ID [10]
#> ID Rep1 Rep2
#> <dbl> <chr> <chr>
#> 1 1 6 8
#> 2 2 5 4
#> 3 3 3 4
#> 4 4 0 0
#> 5 5 Na 3
#> 6 6 9 Na
#> 7 7 4 6
#> 8 8 0 0
#> 9 9 Na 2
#> 10 10 2 1
OR
df %>% rowwise() %>%
mutate(replace(cur_data()[-1], all(cur_data()[-1] == 'Na'), '0'))
Upvotes: 1
Reputation: 21938
You can also use the following solution. In the following solution we iterate over each row and detect corresponding index or indices that is (are) equal to Na
then if there were more that one index we replace it with 0
otherwise the row will remain as it:
library(dplyr)
library(purrr)
df %>%
pmap_df(., ~ {ind <- which(c(...) == "Na");
if(length(ind) > 1) {
replace(c(...), ind, "0")
} else {
c(...)
}
}
) %>%
mutate(across(ID, as.integer))
# A tibble: 10 x 3
ID Rep1 Rep2
<int> <chr> <chr>
1 1 6 8
2 2 5 4
3 3 3 4
4 4 0 0
5 5 Na 3
6 6 9 Na
7 7 4 6
8 8 0 0
9 9 Na 2
10 10 2 1
P.S = I almost went crazy as why I could not get it to work only to realize your NA
s are in fact Na
.
Upvotes: 2
Reputation: 79311
With dplyr
we could:
library(dplyr)
df %>%
mutate(across(starts_with("Rep"), ~case_when(.=="Na" & ID==4 | ID==8 ~ "0",
TRUE ~ .)))
Output:
ID Rep1 Rep2
1 1 6 8
2 2 5 4
3 3 3 4
4 4 0 0
5 5 Na 3
6 6 9 Na
7 7 4 6
8 8 0 0
9 9 Na 2
10 10 2 1
Upvotes: 1
Reputation: 887971
We create an index where the 'Rep' columns are both "Na" with rowSums
on a logical matrix. Use the row, column index/names to subset the data and assign the values to 0
nm1 <- grep("Rep", names(df), value = TRUE)
i1 <- rowSums(df[nm1] == "Na") == length(nm1)
df[i1, nm1] <- 0
-output
df
ID Rep1 Rep2
1 1 6 8
2 2 5 4
3 3 3 4
4 4 0 0
5 5 Na 3
6 6 9 Na
7 7 4 6
8 8 0 0
9 9 Na 2
10 10 2 1
As the OP created string "Na", the column types are not numeric
. We can convert this to numeric as
df[-1] <- lapply(df[-1], as.numeric)
forces the "Na" to be converted to NA
-output
df
ID Rep1 Rep2
1 1 6 8
2 2 5 4
3 3 3 4
4 4 0 0
5 5 NA 3
6 6 9 NA
7 7 4 6
8 8 0 0
9 9 NA 2
10 10 2 1
Upvotes: 2