vincentqu

Reputation: 367

daply: Correct results, but confusing structure

I have a data.frame, mydf, that contains data from 27 subjects. There are two predictors, congruent (2 levels) and offset (5 levels), so there are 10 conditions overall. Each of the 27 subjects was tested 20 times under each condition, giving a total of 10*27*20 = 5400 observations. RT is the response variable. The structure looks like this:

> str(mydf)
'data.frame':   5400 obs. of  4 variables:
 $ subject  : Factor w/ 27 levels "1","2","3","5",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ congruent: logi  TRUE FALSE FALSE TRUE FALSE TRUE ...
 $ offset   : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 5 5 1 2 5 5 2 2 3 5 ...
 $ RT       : int  330 343 457 436 302 311 595 330 338 374 ...

I've used daply() to calculate the mean RT of each subject in each of the 10 conditions:

myarray <- daply(mydf, .(subject, congruent, offset), summarize, mean = mean(RT))

When printed, the result looks just the way I wanted: a 3-d array, i.e. 5 tables (one for each offset level), each showing the mean of every subject in the congruent=FALSE vs. the congruent=TRUE condition.

However, if I check the structure of myarray with str(), I get confusing output:

List of 270
 $ : num 417
 $ : num 393
 $ : num 364
 $ : num 399
 $ : num 374
 ... 
 # and so on
 ...
 [list output truncated]
 - attr(*, "dim")= int [1:3] 27 2 5
 - attr(*, "dimnames")=List of 3
  ..$ subject  : chr [1:27] "1" "2" "3" "5" ...
  ..$ congruent: chr [1:2] "FALSE" "TRUE"
  ..$ offset   : chr [1:5] "1" "2" "3" "4" ...

This looks totally different from the structure of the prototypical ozone array from the plyr package, even though it's a very similar format (3 dimensions, only numerical values).

I want to compute some further summary information on this array with aaply(). Specifically, I want to calculate the difference between the congruent and the incongruent means for each subject and offset.

However, even the most basic application of aaply(), such as aaply(myarray,2,mean), returns nonsense output:

FALSE  TRUE 
   NA    NA 
Warning messages:
1: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(piece, ...) :
  argument is not numeric or logical: returning NA

I have no idea why daply() returns such oddly structured output, which prevents any further use of aaply(). Any help is appreciated; I freely admit that I have hardly any experience with the plyr package.

Upvotes: 1

Views: 94

Answers (1)

alexwhan

Reputation: 16026

Since you haven't included your data it's hard to know for sure, but I made a dummy data set based on your str() output. You can do what you want (I'm guessing) with two calls to ddply: first compute the means, then the difference of the means.

library(plyr)

#Make dummy data (the shorter columns are recycled to 750 rows)
mydf <- data.frame(subject = rep(1:5, each = 150), 
  congruent = rep(c(TRUE, FALSE), each = 75), 
  offset = rep(1:5, each = 15), RT = sample(300:500, 750, replace = TRUE))

#Make means
mydf.mean <- ddply(mydf, .(subject, congruent, offset), summarise, mean.RT = mean(RT))

#Calculate difference between congruent and incongruent
#(ddply sorts FALSE before TRUE, so diff() gives congruent minus incongruent)
mydf.diff <- ddply(mydf.mean, .(subject, offset), summarise, diff.mean = diff(mean.RT))
head(mydf.diff)
#   subject offset  diff.mean
# 1       1      1  39.133333
# 2       1      2   9.200000
# 3       1      3  20.933333
# 4       1      4  -1.533333
# 5       1      5 -34.266667
# 6       2      1  -2.800000
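
If you would rather keep the array approach from the question, here is a minimal sketch. It assumes the list structure arises because summarize returns a data frame for each piece rather than a bare number; passing a function that returns a single numeric value should give an ordinary numeric array instead. The dummy data, the seed, and the names rt_diff and cond_means are made up for illustration.

```r
library(plyr)

# Hypothetical dummy data: 5 subjects, 2 x 5 conditions, 15 trials per cell
set.seed(1)
mydf <- expand.grid(trial = 1:15, offset = 1:5,
                    congruent = c(FALSE, TRUE), subject = 1:5)
mydf$RT <- sample(300:500, nrow(mydf), replace = TRUE)

# Return a bare number per piece so daply() can build a numeric array
myarray <- daply(mydf, .(subject, congruent, offset),
                 function(d) mean(d$RT))

# Congruent minus incongruent, per subject (rows) and offset (columns)
rt_diff <- myarray[, "TRUE", ] - myarray[, "FALSE", ]

# aaply() can now take means over each level of congruent
cond_means <- aaply(myarray, 2, mean)
```

With a numeric array, the difference is plain indexing on the congruent dimension, so no second plyr call is needed for that step.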

Upvotes: 1
