Reputation: 2246
As I am wont to do, I'm keeping tabs on my cats using matrices.
catWeights <- cbind(fluffy=c(5.0,5.1,5.2,5.3),misterCuddles=c(1.2,1.3,1.4,1.5),captainMew=c(4.3,4.2,4.1,4.0))
catTypes <- cbind(fluffy=c('cat','cat','cat','cat'),misterCuddles=c('kitten','kitten','kitten','cat'),captainMew=c('cat','cat','cat','cat'))
dates <- c("2013-01-01", "2013-01-02", "2013-01-03","2013-01-04")
row.names(catWeights) <- dates
row.names(catTypes) <- dates
On any date, I know how much each of them weighs:
> catWeights
           fluffy misterCuddles captainMew
2013-01-01    5.0           1.2        4.3
2013-01-02    5.1           1.3        4.2
2013-01-03    5.2           1.4        4.1
2013-01-04    5.3           1.5        4.0
And I know whether they're cats or kittens:
> catTypes
           fluffy misterCuddles captainMew
2013-01-01 "cat"  "kitten"      "cat"
2013-01-02 "cat"  "kitten"      "cat"
2013-01-03 "cat"  "kitten"      "cat"
2013-01-04 "cat"  "cat"         "cat"
How can I tell how much all my cats and all my kittens weigh through time?
I want this:
> totalWeights
            cat kitten
2013-01-01  9.3    1.2
2013-01-02  9.3    1.3
2013-01-03  9.3    1.4
2013-01-04 10.8    0.0
On the fourth of January, Mister Cuddles turned 1, so he was no longer a kitten. His weight moved from the kitten bucket to the cat bucket.
Upvotes: 1
Views: 70
Reputation: 2246
Microbenchmarking alexis_laz's two solutions on a 2500x2500 matrix with 10 groups:
> microbenchmark(cbindLapply(), sapplyOnly(), times=100)
Unit: milliseconds
          expr      min       lq   median       uq      max neval
 cbindLapply() 841.4796 865.2220 879.9099 892.6265 990.5915   100
  sapplyOnly() 846.3675 869.7372 879.0286 901.3314 979.6136   100
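For anyone who wants to reproduce this, here is a rough sketch of how such a benchmark could be set up. The function names cbindLapply and sapplyOnly come from the call above, but their bodies, the random data, and the names n, groups, weights and types are my assumptions, not the exact code that produced the timings. The matrix is also much smaller than 2500x2500 so that the sketch runs quickly.

```r
set.seed(1)
n <- 200                          # assumed size; the timings above used 2500
groups <- paste0("g", 1:10)       # 10 hypothetical group labels

# Random weights and a matching matrix of group labels
weights <- matrix(runif(n * n), nrow = n, ncol = n)
types <- matrix(sample(groups, n * n, replace = TRUE), nrow = n, ncol = n)

# Assumed body: the lapply + do.call(cbind, ...) form of the answer
cbindLapply <- function() {
  do.call(cbind, lapply(groups, function(x) rowSums(weights * (types == x))))
}

# Assumed body: the sapply form of the answer
sapplyOnly <- function() {
  sapply(groups, function(x) rowSums(weights * (types == x)))
}

# The two forms should agree once sapply's column names are dropped
sameResult <- isTRUE(all.equal(cbindLapply(), unname(sapplyOnly())))

# With the microbenchmark package installed, the timing call would be:
# microbenchmark::microbenchmark(cbindLapply(), sapplyOnly(), times = 100)
```

Checking that both functions return the same numbers before timing them guards against benchmarking two things that are not actually equivalent.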
Upvotes: 0
Reputation: 13122
This seems valid using the sample data:
do.call(cbind,
        lapply(c("cat", "kitten"),
               function(x) rowSums(catWeights * (catTypes == x))))
#           [,1] [,2]
#2013-01-01  9.3  1.2
#2013-01-02  9.3  1.3
#2013-01-03  9.3  1.4
#2013-01-04 10.8  0.0
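To unpack the one-liner above: catTypes == x gives a logical matrix, and multiplying by it treats TRUE as 1 and FALSE as 0, so every weight of the other type is zeroed before the row sums are taken. A minimal sketch of the intermediate steps follows; the names isKitten, kittenOnly, types and totals are only illustrative, and the colnames line is an optional extra to label the columns.

```r
catWeights <- cbind(fluffy = c(5.0, 5.1, 5.2, 5.3),
                    misterCuddles = c(1.2, 1.3, 1.4, 1.5),
                    captainMew = c(4.3, 4.2, 4.1, 4.0))
catTypes <- cbind(fluffy = c("cat", "cat", "cat", "cat"),
                  misterCuddles = c("kitten", "kitten", "kitten", "cat"),
                  captainMew = c("cat", "cat", "cat", "cat"))
dates <- c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04")
row.names(catWeights) <- dates
row.names(catTypes) <- dates

# Logical mask: TRUE where the animal is a kitten on that date
isKitten <- catTypes == "kitten"

# TRUE/FALSE are coerced to 1/0, so non-kitten weights become 0
kittenOnly <- catWeights * isKitten

# Same computation for every type, with the columns labelled by type
types <- c("cat", "kitten")
totals <- do.call(cbind,
                  lapply(types,
                         function(x) rowSums(catWeights * (catTypes == x))))
colnames(totals) <- types
```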
EDIT:
As @BlueMagister commented, lapply(unique(as.vector(catTypes)), ...) is the more general form of the answer. I guess, though, you've already found a way to overcome this, since you accepted the answer. The as.vector is needed because unique has a matrix method that is not convenient in this specific case.
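A quick illustration of the difference, using the sample catTypes matrix (the names asMatrix and asVector are just for this sketch): on a matrix, unique drops duplicated rows and returns a matrix, whereas flattening first gives the distinct labels.

```r
catTypes <- cbind(fluffy = c("cat", "cat", "cat", "cat"),
                  misterCuddles = c("kitten", "kitten", "kitten", "cat"),
                  captainMew = c("cat", "cat", "cat", "cat"))

# Matrix method: removes duplicated rows and keeps the matrix shape
asMatrix <- unique(catTypes)

# Flatten to a plain character vector first to get the distinct labels
asVector <- unique(as.vector(catTypes))
```

Here asMatrix is still a 2x3 character matrix (the two distinct rows), while asVector holds the two labels that lapply or sapply needs to iterate over.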
Also, since I'm in editing mode, I'll note that sapply could've been used, but based on some rough benchmarks I've made from time to time, I've found lapply to be faster even if it is accompanied by a do.call(r/cbind, ..) or an unlist. I did not test it on a larger dataset in this specific case, though.
So, another format of the answer could've been:
sapply(unique(as.vector(catTypes)),
       function(x) rowSums(catWeights * (catTypes == x)))
Upvotes: 4
Reputation: 8717
Here is a not very general answer that applies only to the example data set.
# Construct matrices for the cat weights and kitten weights.
# ifelse works element by element, so pass the full catWeights matrix,
# which has the same shape as the test.
catWts <- ifelse(catTypes=="cat", catWeights, 0)
kittenWts <- ifelse(catTypes=="kitten", catWeights, 0)
# Then take the row sums of the two matrices
catSums <- rowSums(catWts)
kittenSums <- rowSums(kittenWts)
# Then combine them into a data frame
totalWeights <- data.frame(cat=catSums, kitten=kittenSums)
# In one line
data.frame(cat=rowSums(ifelse(catTypes=="cat", catWeights, 0)),
           kitten=rowSums(ifelse(catTypes=="kitten", catWeights, 0)))
#            cat kitten
#2013-01-01  9.3    1.2
#2013-01-02  9.3    1.3
#2013-01-03  9.3    1.4
#2013-01-04 10.8    0.0
I would imagine that there is a more general approach to solving this problem.
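One possible more general route, sketched here as an assumption rather than a tested recommendation, is to reshape both matrices into a long table with one row per (date, animal) pair and cross-tabulate the weights by date and type with xtabs. The names long and totals are illustrative only. A type that does not occur on a given date comes out as 0.

```r
catWeights <- cbind(fluffy = c(5.0, 5.1, 5.2, 5.3),
                    misterCuddles = c(1.2, 1.3, 1.4, 1.5),
                    captainMew = c(4.3, 4.2, 4.1, 4.0))
catTypes <- cbind(fluffy = c("cat", "cat", "cat", "cat"),
                  misterCuddles = c("kitten", "kitten", "kitten", "cat"),
                  captainMew = c("cat", "cat", "cat", "cat"))
dates <- c("2013-01-01", "2013-01-02", "2013-01-03", "2013-01-04")
row.names(catWeights) <- dates
row.names(catTypes) <- dates

# One row per matrix cell; as.vector and row() both run in column-major
# order, so date, type and weight stay aligned
long <- data.frame(date = rownames(catWeights)[row(catWeights)],
                   type = as.vector(catTypes),
                   weight = as.vector(catWeights))

# Sum the weights for each date/type combination
totals <- xtabs(weight ~ date + type, data = long)
```

This does not need the list of types to be spelled out, since xtabs picks up whatever labels appear in the type column.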
Upvotes: 0