Reputation: 1768
I have a data frame like this:
df <- data.frame(c(1, 2), NA, NA, NA)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01")
And a list of data frames like this:
id_list <- list(data.frame(id = c(1, 1), date = c("2017-03-01", "2017-01-01")),
                data.frame(id = c(2, 2), date = c("2017-02-01", "2017-03-01")))
My goal is to fill the date columns of df with 0s and 1s, depending on whether a date occurs in the id_list data frame for that id. Hence, the final output should be:
> df_final
  id 2017-01-01 2017-02-01 2017-03-01
1  1          1          0          1
2  2          0          1          1
In reality, df has 170 columns and 2400 rows; id_list has 2400 data frames, each with 1 to 100 rows and 20 columns. I should stress that the data frames in id_list are not sorted by date.
EDIT: I just tried LAP's solution for:
df <- data.frame(c(1, 2), 0, 0, 0, 0)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01", "2017-04-01")
id_list <- list(data.frame(id = c(1, 1),
                           date = c("2017-03-01", "2017-01-01"),
                           stringsAsFactors = FALSE),
                data.frame(id = c(2, 2, 2),
                           date = c("2017-02-01", "2017-03-01", "2017-04-1"),
                           stringsAsFactors = FALSE))
Unfortunately, the output was
> df
  id 2017-01-01 2017-02-01 2017-03-01 2017-04-01
1  1          1          0          1          0
2  2          0          1          1          0
instead of
> df
  id 2017-01-01 2017-02-01 2017-03-01 2017-04-01
1  1          1          0          1          0
2  2          0          1          1          1
EDIT2: I had a bad typo in my test data: 2017-04-1 instead of 2017-04-01.
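A typo like this can be caught before filling df by comparing the dates in id_list against the column names. The following is only a minimal sketch using the test data from the EDIT above; the names all_dates and unmatched are made up for illustration.

```r
# Test data from the EDIT, including the malformed date "2017-04-1".
df <- data.frame(c(1, 2), 0, 0, 0, 0)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01", "2017-04-01")
id_list <- list(data.frame(id = c(1, 1),
                           date = c("2017-03-01", "2017-01-01"),
                           stringsAsFactors = FALSE),
                data.frame(id = c(2, 2, 2),
                           date = c("2017-02-01", "2017-03-01", "2017-04-1"),
                           stringsAsFactors = FALSE))

# Collect every distinct date string that appears anywhere in id_list.
# as.character() keeps this working if the date column is a factor.
all_dates <- unique(unlist(lapply(id_list, function(x) as.character(x$date))))

# Dates that have no matching column in df (the first column is the id).
unmatched <- setdiff(all_dates, names(df)[-1])
unmatched
```

Any value reported in unmatched would silently stay 0 in the result, so an empty result is what you want to see here.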
Upvotes: 0
Views: 663
Reputation: 887118
Another option would be to rbind the 'id_list' and then use row/column indexing to assign the 1 values. If the other values should be 0, it is better to construct df with 0 instead of NA.
d1 <- do.call(rbind, id_list)
i1 <- cbind(match(d1$id, df$id), match(d1$date, names(df)[-1], nomatch = 0))
df[-1][i1] <- 1
df
#  id 2017-01-01 2017-02-01 2017-03-01
#1  1          1          0          1
#2  2          0          1          1
Data, with df constructed with 0 instead of NA:
df <- data.frame(c(1, 2), 0, 0, 0)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01")
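For reuse, the same row/column indexing idea can be wrapped in a function. This is only a sketch under a few assumptions: fill_presence is a hypothetical name, the indexing is done on a plain numeric matrix rather than on the data frame itself, and (id, date) pairs that match no row or column of df are simply dropped.

```r
# Hypothetical helper, not part of the answer above.
fill_presence <- function(df, id_list) {
  # Stack only the two columns that matter; as.character() guards against factors.
  pairs <- do.call(rbind, lapply(id_list, function(x) {
    data.frame(id = x$id, date = as.character(x$date), stringsAsFactors = FALSE)
  }))

  # Row position of each id in df, column position of each date among the date columns.
  rows <- match(pairs$id, df$id)
  cols <- match(pairs$date, names(df)[-1])
  keep <- !is.na(rows) & !is.na(cols)  # drop ids or dates with no match in df

  # Start from an all-zero matrix and set the matched cells to 1.
  m <- matrix(0, nrow = nrow(df), ncol = ncol(df) - 1,
              dimnames = list(NULL, names(df)[-1]))
  m[cbind(rows[keep], cols[keep])] <- 1

  # check.names = FALSE keeps the date strings as column names.
  data.frame(id = df$id, m, check.names = FALSE)
}

# Data from the question (df may even start out as NA here).
df <- data.frame(c(1, 2), NA, NA, NA)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01")
id_list <- list(data.frame(id = c(1, 1), date = c("2017-03-01", "2017-01-01")),
                data.frame(id = c(2, 2), date = c("2017-02-01", "2017-03-01")))

df_final <- fill_presence(df, id_list)
df_final
```

Because the zero matrix is built inside the function, it does not matter whether df was constructed with NA or 0.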
Upvotes: 1
Reputation: 6685
You could use a for loop over the columns while simultaneously using the column name as input for an sapply call to loop through id_list and check for the occurrence of said name within the data frames:
for(i in names(df)[-1]){
  df[, i] <- as.numeric(sapply(id_list, function(x) i %in% x[, "date"]))
}
> df
  id 2017-01-01 2017-02-01 2017-03-01
1  1          1          0          1
2  2          0          1          1
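Note that the loop pairs the n-th element of id_list with the n-th row of df. If that order is not guaranteed, one possible variation is to align the list to df$id first. The sketch below assumes every data frame in id_list holds exactly one id; list_ids and ordered_list are names made up for illustration.

```r
# Rows of df deliberately in a different order than id_list.
df <- data.frame(c(2, 1), 0, 0, 0)
colnames(df) <- c("id", "2017-01-01", "2017-02-01", "2017-03-01")
id_list <- list(data.frame(id = c(1, 1), date = c("2017-03-01", "2017-01-01"),
                           stringsAsFactors = FALSE),
                data.frame(id = c(2, 2), date = c("2017-02-01", "2017-03-01"),
                           stringsAsFactors = FALSE))

# Assumption: each data frame contains a single id, so its first value identifies it.
list_ids <- vapply(id_list, function(x) x$id[1], numeric(1))

# Reorder the list so that element n belongs to row n of df.
ordered_list <- id_list[match(df$id, list_ids)]

# Same loop as above, run on the aligned list.
for (i in names(df)[-1]) {
  df[[i]] <- as.numeric(vapply(ordered_list, function(x) i %in% x$date, logical(1)))
}
df
```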
Upvotes: 2