Reputation: 143
I have two lists:
list 1:
id name age
1 jake 21
2 ashly 19
45 lana 18
51 james 23
5675 eric 25
list 2 (tv watch):
id hours
1 1.1
1 3
1 2.5
45 5.6
45 3
51 2
51 1
51 2
This is just an example; the real lists are very big: list 1 has 5000 ids, and lists 2, 3, and 4 each have more than 1 million rows (the ids in them are not unique).
For each of lists 2 and up, I need to calculate the average/sum/count for every id and add that value to list 1. Note that the result has to be stored in a different list, one with a different number of rows.
example:
list 1:
id name age tv_average
1 jake 21 2.2
2 ashly 19 n/a
45 lana 18 4.3
51 james 23 1.6667
5675 eric 25 n/a
This is my attempt:
for (i in 1:nrow(list2)) {
  p <- subset(list2, list2$id == i)
  list2$tv_average[i == list2$id] <- sum(p$hours) / nrow(p)
}
Problem: out of 22999 rows, it only works on 21713 rows.
Upvotes: 3
Views: 142
Reputation: 32548
Try this
#Sample Data
data1 = structure(list(id = c(1L, 2L, 45L, 51L, 5675L), name = structure(c(3L,
1L, 5L, 4L, 2L), .Label = c("ashly", "eric", "jake", "james",
"lana"), class = "factor"), age = c(21L, 19L, 18L, 23L, 25L)
), .Names = c("id",
"name", "age"), row.names = c(NA, -5L), class = "data.frame")
data2 = structure(list(id = c(1L, 1L, 1L, 3L, 45L, 45L, 51L, 51L, 51L,
53L), hours = c(1.1, 3, 2.5, 10, 5.6, 3, 2, 1, 2, 6)), .Names = c("id",
"hours"), class = "data.frame", row.names = c(NA, -10L))
# Use aggregate to calculate the average, sum, and count per id, then merge
merge(x = data1,
      y = aggregate(hours ~ id, data2, function(x)
          c(mean = mean(x),
            sum = sum(x),
            count = length(x))),
      by = "id",
      all.x = TRUE)
#     id  name age hours.mean hours.sum hours.count
#1     1  jake  21   2.200000  6.600000    3.000000
#2     2 ashly  19         NA        NA          NA
#3    45  lana  18   4.300000  8.600000    2.000000
#4    51 james  23   1.666667  5.000000    3.000000
#5  5675  eric  25         NA        NA          NA
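A likely reason the loop in the question misses rows: `i` runs over the row numbers `1:nrow(list2)` rather than over the ids that actually occur, so any id larger than the number of rows is never visited. As an alternative sketch (not the only way to do it), the per-id summaries can be computed once with `tapply` and then looked up with `match`, which writes plain columns straight into the first table. The column names `tv_average`, `tv_sum`, and `tv_count` are my own choice, and the data frames below simply repeat the sample values from above.

```r
# Stand-ins for the question's lists (same values as the sample data above)
data1 <- data.frame(id = c(1L, 2L, 45L, 51L, 5675L),
                    name = c("jake", "ashly", "lana", "james", "eric"),
                    age = c(21L, 19L, 18L, 23L, 25L))
data2 <- data.frame(id = c(1L, 1L, 1L, 3L, 45L, 45L, 51L, 51L, 51L, 53L),
                    hours = c(1.1, 3, 2.5, 10, 5.6, 3, 2, 1, 2, 6))

# One summary value per distinct id in data2; the names of each result are the ids
tv_mean  <- tapply(data2$hours, data2$id, mean)
tv_sum   <- tapply(data2$hours, data2$id, sum)
tv_count <- tapply(data2$hours, data2$id, length)

# Position of each data1 id among the summarised ids (NA when the id has no rows)
idx <- match(data1$id, as.integer(names(tv_mean)))

# Indexing with NA yields NA, so ids without any hours stay NA in data1
data1$tv_average <- as.vector(tv_mean)[idx]
data1$tv_sum     <- as.vector(tv_sum)[idx]
data1$tv_count   <- as.vector(tv_count)[idx]

print(data1)
```

Repeating the last block for each further list (with different column names) keeps `data1` at its original number of rows, whatever the size of the other lists.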
Upvotes: 2