Reputation: 51
I have a dataframe called "rawfile" as below.
structure(list(mz = c(123, 243, 277, 456, 886.555, 886.556, 887.565,
887.567, 887.568), condition.1 = c(0.001, 0.003, 2e-04, NA, 0.000511135,
NA, NA, 0.000162055, NA), condition.2 = c(NA, NA, 3e-04, 0.007,
NA, 0.000661594, 1e-04, 3e-04, NA), condition.3 = c(NA, NA, 4e-04,
NA, NA, 0.000502741, NA, NA, 2e-04), group = c(1, 2, 3, 4, 5,
5, 6, 6, 6)), row.names = c(NA, -9L), class = "data.frame")
I want to move all the values to the top row within each group if there is more than one row present in one group. I used the following code and it works if there is only one value within each column in one group.
# Function to move all values to the top row based on group
move_values <- function(df) {
if (nrow(df) > 1) {
for (col in grep("^condition", names(df), value = TRUE)) {
df[1, col] <- ifelse(any(!is.na(df[, col])), df[!is.na(df[, col]), col], NA)
df[-1, col] <- NA # Fill other rows with NA for the current column
}
}
return(df)
}
# Group the data by 'group' and apply the move_values function
result <- rawfile %>%
group_by(group) %>%
do(move_values(.)) %>%
ungroup()
However, for this dataframe, there is more than one value in the same column (condition 2) in group 6, and I want to keep this group unchanged if this happens.
Does anyone have any suggestions on how to filter out groups with more than one values in the same column? And move the values to the top row for other groups?
The expected dataframe I would like to have in the end is below:
structure(list(mz = c(123, 243, 277, 456, 886.555, 886.556, 887.565,
887.567, 887.568), condition.1 = c(0.001, 0.003, 2e-04, NA, 0.000511135,
NA, NA, 0.000162055, NA), condition.2 = c(NA, NA, 3e-04, 0.007,
0.000661594, NA, 1e-04, 3e-04, NA), condition.3 = c(NA, NA, 4e-04,
NA, 0.000502741, NA, NA, NA, 2e-04), group = c(1L, 2L, 3L, 4L,
5L, 5L, 6L, 6L, 6L)), class = "data.frame", row.names = c(NA,
-9L))
Upvotes: 2
Views: 37
Reputation: 51994
You can create the cond
ition with by sum
ming the complete.cases
across
the selected columns, and then use this cond
ition in an ifelse
statement, and sort complete and incomplete cases.
library(dplyr)
df %>%
mutate(cond = rowSums(across(condition.1:condition.3, \(x) sum(complete.cases(x))) > 1),
across(condition.1:condition.3, \(x) ifelse(cond == 0, c(x[!is.na(x)], x[is.na(x)]), x)),
.by = group)
# mz condition.1 condition.2 condition.3 group cond
# 1 123.000 0.001000000 NA NA 1 0
# 2 243.000 0.003000000 NA NA 2 0
# 3 277.000 0.000200000 0.000300000 0.000400000 3 0
# 4 456.000 NA 0.007000000 NA 4 0
# 5 886.555 0.000511135 0.000661594 0.000502741 5 0
# 6 886.556 NA NA NA 5 0
# 7 887.565 NA 0.000100000 NA 6 1
# 8 887.567 0.000162055 0.000300000 NA 6 1
# 9 887.568 NA NA 0.000200000 6 1
Upvotes: 0