Reputation: 45
This is a small subset of my data.
I have:
Id var1 var2
1 POS NA
1 NA NEG
2 NEG NA
2 NA NEG
3 POS NA
3 NA NEG
4 POS POS
5 POS NA
My ideal output:
Id var1 var2
1 POS NEG
2 NEG NEG
3 POS NEG
4 POS POS
5 POS NA
I would simply like to collapse the duplicated Id rows so that there is one row per unique Id, keeping the non-missing value of var1 and var2 for each. Does anyone see how to do this? Help would be greatly appreciated. Thank you!
Upvotes: 1
Views: 64
Reputation: 30504
You could try a solution with na.omit. This function will remove NA within each group. Assuming your data frame is df...
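For a reproducible starting point, the sample from the question could be rebuilt as below. This is only a sketch: it assumes var1 and var2 are character columns and that the NA entries are real missing values rather than the string "NA".

```r
# Sample data from the question (column types are an assumption)
df <- data.frame(
  Id   = c(1, 1, 2, 2, 3, 3, 4, 5),
  var1 = c("POS", NA, "NEG", NA, "POS", NA, "POS", "POS"),
  var2 = c(NA, "NEG", NA, "NEG", NA, "NEG", "POS", NA)
)

# Quick check: 8 rows spread over 5 distinct Ids
nrow(df)
length(unique(df$Id))
```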
In base R:
aggregate(. ~ Id,
          data = df,
          FUN = function(x) {
            y = na.omit(x)
            y[length(y) == 0] <- NA
            y
          },
          na.action = "na.pass")
Note that y[length(y) == 0] <- NA is included so that cases like Id 5, where var2 has no non-missing value, return NA and not character(0).
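The effect of that line can be seen in isolation. The vectors x and z below are made-up stand-ins for one group's values, not part of the original answer.

```r
# An all-NA character vector, like var2 for Id 5
x <- c(NA_character_)
y <- na.omit(x)
length(y)                  # nothing is left after dropping NA

# length(y) == 0 is TRUE here, so the logical index extends y with one NA
y[length(y) == 0] <- NA
y

# For a group that has a non-NA value the index is FALSE and nothing changes
z <- na.omit(c("POS", NA))
z[length(z) == 0] <- NA
z[1]
```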
With dplyr:
library(dplyr)
df %>%
  group_by(Id) %>%
  summarise(across(everything(), ~ first(na.omit(.))))
first() returns the first value within each group once the NA values have been removed, and across(everything()) applies this to every non-grouping column.
With data.table:
library(data.table)
setDT(df)[, lapply(.SD, na.omit), by = Id]
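When a group has no non-NA value in a column (var2 for Id 5), na.omit returns a zero-length vector, and data.table may warn while filling the gap with NA. A possible variant, sketched here on a small made-up table named small, indexes the result with [1] so each group always yields exactly one value.

```r
library(data.table)

# Hypothetical three-row table: Id 1 needs collapsing, Id 5 has no var2
small <- data.table(
  Id   = c(1, 1, 5),
  var1 = c("POS", NA, "POS"),
  var2 = c(NA, "NEG", NA)
)

# na.omit(x)[1] is the first non-NA value, or NA when none exists
res <- small[, lapply(.SD, function(x) na.omit(x)[1]), by = Id]
print(res)
```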
Upvotes: 0
Reputation: 1284
The solution already posted is far more compact, but I was working on this one, so I am posting it for additional information. It uses a for loop.
library(data.table)
# convert dt to a data.table
setDT(dt)
# create a list to collect the results of the for loop
result <- list()
# loop over the unique Ids
for (i in unique(dt$Id)) {
  # subset the rows for one Id and store them in dt1
  dt1 <- dt[Id == i]
  # create a one-row data.table holding the Id
  # (:= cannot add columns to an empty data.table())
  dt1.dt <- data.table(Id = i)
  # add var1: the first value which is not NA, or NA if there is none
  dt1.dt[, var1 := dt1[!is.na(var1)]$var1[1]]
  # do the same for var2
  dt1.dt[, var2 := dt1[!is.na(var2)]$var2[1]]
  # add the result to the list created before the for loop
  result[[as.character(i)]] <- dt1.dt
}
# rbind the list
result <- do.call(rbind, result)
Upvotes: 0
Reputation: 11981
You can use dplyr:
library(dplyr)
mydata %>%
  group_by(Id) %>%
  summarise(
    var1 = var1[!is.na(var1)][1],
    var2 = var2[!is.na(var2)][1]
  )
Upvotes: 1