Reputation: 607
I recently asked this question about row selection and kindly received a swift solution. However, I neglected to mention that I need to repeat this selection for each level of a factor ("date", see below). There are likely more elegant ways to do this, but I thought I could use a simple for loop. The loop runs, but I run into the old problem of overwriting my result on every iteration. I have looked at this post, but it doesn't solve the issue. This is what I have done:
row.number <- c(1:18)
date <- c(rep("A",5), rep("B", 6), rep("C",7))
ID <- c(1,1,2,2,2,1,1,1,2,2,3,1,1,2,2,2,3,3)
dat <- cbind(row.number,date,ID)
dat <- as.data.frame(dat)
IDU_date <- unique(date)
x <- data.frame(mode = "numeric", length = 0)
for(i in seq_along(IDU_date))
{
tf<-dat$date==IDU_date[i]
sub.dat<-dat[tf,]
x <- setDT(sub.dat)[, if(.N >1) .SD[ceiling(.N/2)] else .SD ,sub.dat$ID]
}
This is what I would like the result to look like:
row.number <- c(1,4,7,9,11,12,15,17)
date <- c("A","A","B","B","B","C","C","C")
ID <- c(1,2,1,2,3,1,2,3)
dat <- cbind(row.number,date,ID)
dat <- as.data.frame(dat)
Again, help would be much appreciated, even if it's not much of a challenge!
Upvotes: 0
Views: 1728
Reputation: 13139
I may be mistaken about the desired output, but using a for loop to split a data frame and convert each piece to a data.table seems a little convoluted.
Why not group by both ID and date? This matches your desired output. And if .N == 1, ceiling(.N/2) is 1, which removes the need for the if statement.
res <- dat[, .SD[ceiling(.N/2)], by=.(ID,date)]
Data used:
dat <- structure(list(row.number = c(1, 4, 7, 9, 11, 12, 15, 17), date = c("A",
"A", "B", "B", "B", "C", "C", "C"), ID = c(1, 2, 1, 2, 3, 1,
2, 3)), .Names = c("row.number", "date", "ID"), row.names = c(NA,
-8L), class = c("data.table", "data.frame"))
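To check the same grouping against the question's original 18-row input, here is a minimal sketch. It assumes the data.table package is installed, and it builds dat with data.table() instead of cbind() so the columns keep their types; that construction is my assumption, not part of the original post.

```r
library(data.table)

# The question's 18-row input, built directly as a data.table
dat <- data.table(
  row.number = 1:18,
  date = c(rep("A", 5), rep("B", 6), rep("C", 7)),
  ID = c(1,1,2,2,2,1,1,1,2,2,3,1,1,2,2,2,3,3)
)

# For each (ID, date) group, keep the row at position ceiling(.N/2)
res <- dat[, .SD[ceiling(.N/2)], by = .(ID, date)]

res$row.number
# 1 4 7 9 11 12 15 17
```

Groups come back in order of first appearance, so the rows line up with the desired output shown in the question.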
Upvotes: 2
Reputation: 1553
The loop overwrites x on every iteration, so you need to bind each iteration's x onto an accumulated result inside the loop:
library(data.table)
row.number <- c(1:18)
date <- c(rep("A",5), rep("B", 6), rep("C",7))
ID <- c(1,1,2,2,2,1,1,1,2,2,3,1,1,2,2,2,3,3)
dat <- cbind(row.number,date,ID)
dat <- as.data.frame(dat)
all.x <- data.frame()  # accumulates the result across dates
IDU_date <- unique(date)
for(i in seq_along(IDU_date))
{
  tf <- dat$date == IDU_date[i]
  sub.dat <- dat[tf, ]
  # middle row (rounded up) for each ID within this date
  x <- setDT(sub.dat)[, if(.N > 1) .SD[ceiling(.N/2)] else .SD, by = ID]
  all.x <- rbind(all.x, x)  # append rather than overwrite
}
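If the loop is kept, one common variant is to store each iteration's result in a list and bind once at the end with data.table's rbindlist(), which avoids growing all.x step by step. This is only a sketch under assumptions: pieces is a hypothetical name, and dat is built with data.table() rather than cbind() so the columns keep their types.

```r
library(data.table)

dat <- data.table(
  row.number = 1:18,
  date = c(rep("A", 5), rep("B", 6), rep("C", 7)),
  ID = c(1,1,2,2,2,1,1,1,2,2,3,1,1,2,2,2,3,3)
)

pieces <- list()  # hypothetical name: one result per date
for (d in unique(dat$date)) {
  # subset to one date, then take the middle row (rounded up) per ID
  pieces[[d]] <- dat[date == d, .SD[ceiling(.N/2)], by = ID]
}

# bind all per-date results into a single data.table
all.x <- rbindlist(pieces)
```

Here all.x holds one row per ID within each date, with ID as the first column because it is the grouping variable.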
Upvotes: 0