Calculating group means/medians in MATLAB where group ID is in a separate column

I have one column which contains the group ID of each participant. There are three groups so every number in this column is 1, 2 or 3.

Then I have a second column which contains response scores for each participant. I want to calculate the mean/median response score within each group.

I have managed to do this by looping through every row but I sense this is a slow and suboptimal solution. Could someone please suggest a better way of doing things?

Upvotes: 3

Views: 8000

Answers (3)

2dvisio

Reputation: 911

grpstats (from the Statistics Toolbox) is a good function for this; see its documentation.

These are the built-in statistics it accepts by name:

  • 'mean' Mean
  • 'sem' Standard error of the mean
  • 'numel' Count, or number, of non-NaN elements
  • 'gname' Group name
  • 'std' Standard deviation
  • 'var' Variance
  • 'min' Minimum
  • 'max' Maximum
  • 'range' Range
  • 'meanci' 95% confidence interval for the mean
  • 'predci' 95% prediction interval for a new observation

It also accepts function handles (e.g. @mean, @skewness).

>> groups = [1 1 1 2 2 2 3 3 3]';
>> data   = [0 0 1 0 1 1 1 1 1]';
>> grpstats(data, groups, {'mean'})
ans =

    0.3333
    0.6667
    1.0000

>> [mea, med] = grpstats(data, groups, {'mean', @median})

mea =

    0.3333
    0.6667
    1.0000


med =

     0
     1
     1

Upvotes: 4

tmpearce

Reputation: 12693

This is a good place to use accumarray (documentation and blog post):

result = accumarray(groupIDs, data, [], @median);

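You can of course pass columns of a matrix directly instead of separate variables called groupIDs and data (transpose a row first, since accumarray expects the subscripts as a column vector of positive integers). If you'd prefer the mean instead of the median, use @mean as the fourth argument.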
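As a minimal sketch of the call above, here it is applied to a small made-up data set (the values are illustrative, not the asker's data). It assumes the group IDs are positive integers stored in a column vector, which is what accumarray needs for its subscripts.

```matlab
% Illustrative data: three groups of three participants each.
groupIDs = [1 1 1 2 2 2 3 3 3]';   % group ID per participant (column vector)
data     = [0 0 1 0 1 1 1 1 1]';   % response score per participant

% One output row per group ID; the 4th argument is the function
% applied to all scores that share the same ID.
groupMeans   = accumarray(groupIDs, data, [], @mean);    % [0.3333; 0.6667; 1.0000]
groupMedians = accumarray(groupIDs, data, [], @median);  % [0; 1; 1]
```

Row k of each result corresponds to group ID k, so the output can be indexed directly by group number.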


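Note: the documentation says that if the subscripts are not sorted, accumarray may not preserve the order of the data values it passes to the function. That makes no difference for mean or median, but you should sort the inputs if your function depends on the order of its input. I'll leave that exercise for another day though.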
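For a function that does depend on input order, one way to sort the inputs first might look like the sketch below. The data and the "first score per group" function are made up for illustration; the approach relies on sort being stable, so scores within a group keep their original relative order.

```matlab
% Illustrative, deliberately unsorted data.
groupIDs = [2 1 2 1 3]';
data     = [10 20 30 40 50]';

% Sort the IDs and reorder the data the same way.
[sortedIDs, order] = sort(groupIDs);
sortedData = data(order);

% An order-sensitive function: the first score recorded in each group.
firstPerGroup = accumarray(sortedIDs, sortedData, [], @(x) x(1));  % [20; 10; 50]
```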

Upvotes: 2

bla

Reputation: 26069

Use logical indexing. For example, say your data is in a matrix m, where the first column is the group ID and the second column is the response score:

mean(m(m(:,1)==1,2))
median(m(m(:,1)==1,2))

will give you the mean and median response score for group 1; repeat with ==2 and ==3 for the other groups.
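To cover every group without hard-coding the IDs, the same logical indexing can be wrapped in a short loop over the unique group IDs. This is only a sketch on made-up data; the matrix m and the variable names are illustrative. The loop runs once per group rather than once per row, so it stays cheap for a handful of groups.

```matlab
% Illustrative matrix: column 1 is the group ID, column 2 the response score.
m = [1 0; 1 0; 1 1; 2 0; 2 1; 2 1; 3 1; 3 1; 3 1];

ids = unique(m(:,1));                 % the distinct group IDs, sorted
groupMeans   = zeros(numel(ids), 1);
groupMedians = zeros(numel(ids), 1);

for k = 1:numel(ids)
    inGroup = m(:,1) == ids(k);       % logical mask selecting rows of this group
    groupMeans(k)   = mean(m(inGroup, 2));
    groupMedians(k) = median(m(inGroup, 2));
end
```

Unlike accumarray, this also works when the IDs are not positive integers, and it needs only base MATLAB.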

Upvotes: 1
