Reputation:
I currently have a data frame of this structure
ID-No cigsaday activity
1 NA 1
2 NA 1
1 5 NA
2 5 NA
I want to concatenate the rows with the identical ID numbers and create a new data frame that is supposed to look like this
ID-No cigsaday activity
1 5 1
2 5 1
The data frame includes characters as well as numerical, in this way we would match based on a participant ID which occurs 4 times in the dataset within the first column.
Any help is appreciated!
Upvotes: 0
Views: 411
Reputation: 42544
Many ways lead to Rome. For the sake of completeness, here are some other approaches which return the expected result for the given sample dataset. Your mileage may vary.
library(dplyr)
df %>%
group_by(ID_No) %>%
summarise(across(everything(), na.omit))
`summarise()` ungrouping output (override with `.groups` argument) # A tibble: 2 x 3 ID_No cigsaday activity <int> <int> <int> 1 1 5 1 2 2 5 1
Note, this a dplyr
version of ThomasIsCoding's answer.
library(dplyr)
df %>%
group_by(ID_No) %>%
summarise(across(everything(), ~ purrr::reduce(.x, coalesce)))
library(data.table)
setDT(df)[, lapply(.SD, function(x) fcoalesce(as.list(x))), ID_No]
ID_No cigsaday activity 1: 1 5 1 2: 2 5 1
library(data.table)
setDT(df)[, lapply(.SD, Reduce, f = fcoalesce), ID_No]
Upvotes: 1
Reputation: 101064
A data.table
option
> setDT(df)[, lapply(.SD, na.omit), ID_No]
ID_No cigsaday activity
1: 1 5 1
2: 2 5 1
Data
> dput(df)
structure(list(ID_No = c(1L, 2L, 1L, 2L), cigsaday = c(NA, NA,
5L, 5L), activity = c(1L, 1L, NA, NA)), class = "data.frame", row.names = c(NA,
-4L))
Upvotes: 1
Reputation: 740
A possible solution using na.locf() which replaces a value with the most recent non-NA value.
library(zoo)
dat %>%
group_by(IDNo) %>%
mutate_at(vars(-group_cols()),.funs=function(x) na.locf(x)) %>%
distinct(IDNo,cigsaday,activity,.keep_all = TRUE) %>%
ungroup()
Upvotes: 0