user3557054
user3557054

Reputation: 219

Count repeated combinations in Matlab

Let us consider the following situation, for instance we have cell-matrix A:

A:
 A(:,1)A(:,2) A(:,3)   A(:,4)    A(:,5)        
'CM'  '00118''000151' '19970303''19970729'
'RO'  '01356''043605' '19970212''19970401' 
'CM'  '01464''000151' '19970121''19970218'
'RO'  '01356''043605' '19970119''19970313' 
'CM'  '00118''043605' '19970114''19970219' 
'CM'  '00118''000151' '19970523''19970728' 

In the first three columns, both strings and values are combined and repeated along the matrix.

I would liketo create a new column,

REP = num2cell(REP);
A(:,end+1) = REP;

that will correspond to the number of repetitions of A(:,3) for each matching A(:,1). For instance the output for this example would be:

Output for A:
A(:,1) A(:,2) A(:,3)   A(:,4)    A(:,5)   REP        
'CM'  '00118''000151' '19970303''19970729''3'
'RO'  '01356''043605' '19970212''19970401''2'   
'CM'  '01464''000151' '19970121''19970218''3' 
'RO'  '01356''043605' '19970119''19970313''2' 
'CM'  '00118''043605' '19970114''19970219''1' 
'CM'  '00118''000151' '19970523''19970728''3'

Upvotes: 1

Views: 85

Answers (2)

Luis Mendo
Luis Mendo

Reputation: 112749

For each of the two columns, use unique to transform strings into unique numeric labels. Then count repetitions for each combination of labels with sparse (accumarray could also be used):

[~, ~, uu1 ] = unique(A(:,1)); %// get unique labels for column 1
[~, ~, uu3 ] = unique(A(:,3)); %// get unique labels for column 3
R = full(sparse(uu1, uu3, 1)); %// count repetitions of each combination
REP = R(sub2ind(size(R), uu1, uu3)); %// result as a column vector
A(:,end+1) = mat2cell(num2str(REP), ones(1,numel(REP))); %// attach result to A

In your example, assuming the input is

A = {'CM'    '00118'    '000151'    '19970303'    '19970729'    
     'RO'    '01356'    '043605'    '19970212'    '19970401'    
     'CM'    '01464'    '000151'    '19970121'    '19970218'    
     'RO'    '01356'    '043605'    '19970119'    '19970313'    
     'CM'    '00118'    '043605'    '19970114'    '19970219'    
     'CM'    '00118'    '000151'    '19970523'    '19970728'};

the code produces

REP =
     3
     2
     3
     2
     1
     3

A = 
    'CM'    '00118'    '000151'    '19970303'    '19970729'    '3'
    'RO'    '01356'    '043605'    '19970212'    '19970401'    '2'
    'CM'    '01464'    '000151'    '19970121'    '19970218'    '3'
    'RO'    '01356'    '043605'    '19970119'    '19970313'    '2'
    'CM'    '00118'    '043605'    '19970114'    '19970219'    '1'
    'CM'    '00118'    '000151'    '19970523'    '19970728'    '3'

Upvotes: 1

Benoit_11
Benoit_11

Reputation: 13945

What if you try this:

    clear all
    clc

    A = cell(6,6);
    % I replicate only the relevant part of your cell array.
    A(:,1) = {'CM' 'RO' 'CM' 'RO' 'CM' 'CM'};
    A(:,3) = {'000151' '043605' '000151' '043605' '043605' '000151'}; 


    % Concatenate strings to easily compare them afterwards.
    ConcatString = cell(size(A,1),1);
    for i = 1:size(A,1)    
        ConcatString{i} = strcat(A{i,1},num2str(A{i,3}));    
    end

    REP = zeros(size(A,1),1);

   for i = 1:size(A,1)   

   Match = strcmp(ConcatString{i}, ConcatString) % Get logical array, '1' represents match.   

   NumRep = size(find(Match ==1),1) % Number of repetitions

   REP(Match ==1) = NumRep % Enter # of repetitions in REP

   end
    A(:,6) = num2cell(REP);

    disp(A)

which gives me this:

'CM'    []    '000151'    []    []    [3]
'RO'    []    '043605'    []    []    [2]
'CM'    []    '000151'    []    []    [3]
'RO'    []    '043605'    []    []    [2]
'CM'    []    '043605'    []    []    [1]
'CM'    []    '000151'    []    []    [3]

it's a bit crude but you can fill the blanks with your data I guess.

Upvotes: 1

Related Questions