Mace
Mace

Reputation: 1269

Apply function to table conditioned on several variables in Matlab

I have the following two tables, Data and Members:

Data = table(sort(repmat(datenum(2001,1:5,1).',4,1)),repmat(('A':'D').',5,1),repmat((201:204).',5,1),'VariableNames',{'Date','ID','Price'});

Data = 

       Date       ID    Price
    __________    __    _____

    7.3085e+05    A     201  
    7.3085e+05    B     202  
    7.3085e+05    C     203  
    7.3085e+05    D     204  
    7.3088e+05    A     201  
    7.3088e+05    B     202  
    7.3088e+05    C     203  
    7.3088e+05    D     204  
    7.3091e+05    A     201  
    7.3091e+05    B     202  
    7.3091e+05    C     203  
    7.3091e+05    D     204  
    7.3094e+05    A     201  
    7.3094e+05    B     202  
    7.3094e+05    C     203  
    7.3094e+05    D     204  
    7.3097e+05    A     201  
    7.3097e+05    B     202  
    7.3097e+05    C     203  
    7.3097e+05    D     204

Members = table(datenum(2001,1:5,1).',{cell2table({'B','C'});table({'A'});cell2table({'B','D'});cell2table({'A','C'});cell2table({'A','C'})},'VariableNames',{'Date','MemberID'});


Members = 

       Date        MemberID  
    __________    ___________

    7.3085e+05    [1x2 table]
    7.3088e+05    [1x1 table]
    7.3091e+05    [1x2 table]
    7.3094e+05    [1x2 table]
    7.3097e+05    [1x2 table]

Now, I would like to apply function, say mean, to the Price variable for each ´Date´ and only for the IDs in MemberId on each date. Thus, for the first date, 7.3085e+05, I would like the mean of ID 2 and 3 - i.e. (202+203)/2 - etc.

I am able to do this with a very slow for loop (my actual tables are much larger). I am thinking this should be possible using varfun or similar, but I can't get it to work. Any ideas?

UPDATE

Changed the IDs to text, since this is how my real data is.

Upvotes: 1

Views: 129

Answers (2)

Mace
Mace

Reputation: 1269

Using ismember I found the following solution myself:

% Defining the linear indexes according to Members for each date
AAA = arrayfun(@(n) ismember(table2array(Data(Data.Date==table2array(Members(n,'Date')),2)),table2array(Members.MemberID{n})),1:size(Members,1),'UniformOutput',false);

% Defining cell of Data for each date
BBB = arrayfun(@(n) Data(Data.Date==table2array(Members(n,'Date')),:),1:size(Members,1),'UniformOutput',false)

% Applying the linear index AAA on each data (BBB) and taking the average
CCC = arrayfun(@(n,x) mean(table2array(x{:}(n{:},3))),AAA,BBB)

CCC =

  202.5000  201.0000  203.0000  202.0000  202.0000

Upvotes: 0

Divakar
Divakar

Reputation: 221554

Approach 1

%// Find offsets to be used for linear indexing into elements for each group
off1 = [0 ; find(diff(Data.ID)<0)]

%// Store prices and member IDs as numeric arrays
prc = Data.Price
mem_id = Members.MemberId

%// Convert each MemberID table into arrays; add the corresponding
%// linear indexing offsets and finally index into prc and get the mean values
out = arrayfun(@(n) mean(prc(off1(n)+ char(table2array(mem_id{n}))-'A'+1 )),...
                                                            1:numel(mem_id))

Approach 2

%// Find offsets to be used for linear indexing into elements for each group
off1 = [0 ; find(diff(Data.ID)<0)]

%// Store prices and member IDs as numeric arrays
prc = Data.Price
mem_id = Members.MemberId

%// Storage for output
out = zeros(1,numel(mem_id))

%// Get the widths(lengths) of each table in MemberID and the unique lengths
lens = arrayfun(@(n) width(mem_id{n}),1:numel(mem_id))
unqlens = unique(lens,'stable')

%// Now, that you have groups of MemberID tables with uniform
%// widths/lengths, you can use table2array on all those tables with single
%// call within each group and thus could be beneficial in performance
for iter = 1:numel(unqlens)
    mask = lens==unqlens(iter); %// Mask for each group
    mem_id_iter = mem_id(mask); %// Member IDs
    grp_ids = reshape(char(table2array(vertcat(mem_id_iter{:}))) - 'A' + 1,[],...
                                                unqlens(iter)); %// group IDs
    lin_idx = bsxfun(@plus,off1(mask),grp_ids); %// linear indices
    out(mask) = mean(prc(lin_idx),2); %// index into prc and get mean values
end

Upvotes: 1

Related Questions