dede
dede

Reputation: 1171

Calculate the mean of some rows of a column in a table

I have a .csv file with a table, which I imported as following:

mydata = readtable('datafile1.csv');

The table has 2549 rows, and 28 columns. Here is one part of the table, with all columns but some rows, to give an example:

ID          subject A                   B   C                   D       E       F       G       H   I   J    K       L  M   N                   O    P      Q   R       S   T   U   V                   W   X   Y
'sbj05100'  'sbj05' 6.22316646575928    85  -2.31806182861328   339 14  100022  'tf'    48401   100 2   2   'no'    'h' 339 322.507000000000    339 'sbj05' 100 100021  286 1   419 1.95000000000000    2   1   662
'sbj05102'  'sbj05' 7.60787820816040    65  3.00547647476196    405 17  102012  'tf'    59201   102 1   2   'yes'   'h' 405 385.367000000000    405 'sbj05' 102 102011  283 1   283 1.89000000000000    1   1   364
'sbj05104'  'sbj05' -3.71897959709167   81  3.80262303352356    429 19  104012  'tf'    66401   104 1   2   'yes'   'h' 429 408.228000000000    429 'sbj05' 104 104011  266 1   266 2.19000000000000    2   1   244
'sbj09152'  'sbj09' 0.181026369333267   88  -0.0696721449494362 87  4   152042  'tf'    12401   152 4   2   'no'    'l' 87  82.8280000000000    87  'sbj09' 152 152041  297 1   297 1.25000000000000    1   1   354
'sbj09157'  'sbj09' 0.309507131576538   116 0.226024463772774   51  2   157042  'tf'    5201    157 4   2   'no'    'l' 51  48.4870000000000    51  'sbj09' 157 157041  273 1   273 1.45000000000000    1   1   279
'sbj10151'  'sbj10' 6.99367523193359    90  4.86872243881226    345 20  151022  'tf'    70001   151 2   2   'no'    'h' 345 328.224000000000    345 'sbj10' 151 151021  198 1   198 3                   1   1   310
'sbj10167'  'sbj10' 2.25431561470032    152 -0.200379326939583  129 7   167012  'tf'    23201   167 1   2   'yes'   'h' 129 122.675000000000    129 'sbj10' 167 167011  110 1   110 2.32000000000000    2   1   276
'sbj10168'  'sbj10' 3.22731518745422    147 4.72183227539062    93  3   168042  'tf'    8801    168 4   2   'no'    'l' 93  88.3230000000000    93  'sbj10' 168 168041  179 1   179 2.38000000000000    2   1   132

I need to calculate the mean of column B, and separately of column C, for each subject (column subject) and each condition (column I). What I would like to obtain is:

for sbj05 column B --> cond 1 = (65+81)/2
                       cond2 = 85
          column C --> cond 1 = (3.005476475+3.802623034)/2
                       cond2 = -2.3180618
and so on...

I tried to follow this link in matlab, calculate mean in a part of one column where another column satisfies a condition.

[R, I, J] = unique(mydata(:,2));
% count the repeating entries: now we have integer indices!
counts = accumarray(J, 1, size(R));
% sum the 2nd column for all entries
sums   = accumarray(J, mydata(:,4), size(R)); %for column B
% compute means
means  = sums./counts;

but I get this error:

Undefined function 'accumarray' for input arguments of type 'table'.

Any suggestions?

Upvotes: 2

Views: 594

Answers (1)

Jonas
Jonas

Reputation: 74940

Conveniently, Matlab has a function for calculalting statistics on tables. Instead of accumarray, you therefore may want to use grpstats:

meanPerSubjectAndCondition = grpstats(mydata,{'subject','I'},'mean','DataVars',{'B','C'})

Upvotes: 2

Related Questions