Reputation: 213
I have a data file of two columns ('date' and 'user'):
date1 user1
date1 user1
date1 user2
date2 user1
date2 user2
...
I need to find how many times each unique user performed the action on each date. I know I could use the unique() function to find all the distinct users and then loop over all the rows, checking for equality and summing up, but I have more than 8 million rows, so a double loop would be too expensive.
Is there any other way to count occurrences of date for each user?
Upvotes: 2
Views: 2598
Reputation: 19880
Please always state which data types you are dealing with, preferably with an example.
I assume both users and dates are strings combined in a cell array.
tbl = { 'date1' 'user1'
'date1' 'user1'
'date1' 'user2'
'date2' 'user1'
'date2' 'user2' };
Combine the two columns into one:
user_date = strcat(tbl(:,2),'@',tbl(:,1));
Then you can count the occurrences:
[gi,g] = grp2idx(user_date);
n = histc(gi,1:numel(g));
g =
'user1@date1'
'user2@date1'
'user1@date2'
'user2@date2'
n =
2
1
1
1
Note that grp2idx requires the MATLAB Statistics Toolbox.
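A toolbox-free variant should be possible with base MATLAB functions. The following is a minimal sketch, not something tested on the 8-million-row file. It assumes the same cell array of strings as above, uses unique in place of grp2idx (so the groups come out sorted rather than in order of first appearance), and uses accumarray in place of histc. The names users, dates, and counts are illustrative.

```matlab
% Toy data, assumed to be a cell array of strings as in the example above
tbl = { 'date1' 'user1'
        'date1' 'user1'
        'date1' 'user2'
        'date2' 'user1'
        'date2' 'user2' };

% Variant 1: one count per combined 'user@date' key
user_date = strcat(tbl(:,2), '@', tbl(:,1));
[g, ~, gi] = unique(user_date);          % g is sorted; gi maps each row to its group
n = accumarray(gi(:), 1);                % n(k) = number of rows in group g{k}

% Variant 2: a users-by-dates count matrix, without building combined strings
[users, ~, ui] = unique(tbl(:,2));
[dates, ~, di] = unique(tbl(:,1));
counts = accumarray([ui(:) di(:)], 1);   % counts(i,j) = rows with users{i} and dates{j}
```

For this toy data, n is [2; 1; 1; 1] for the sorted keys user1@date1, user1@date2, user2@date1, user2@date2, and counts is [2 1; 1 1]. The second variant avoids concatenating millions of strings, at the cost of a matrix with one cell per user and date combination.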
Upvotes: 2
Reputation: 83457
If I understand the question correctly, check out the following code:
% Your data
A = [1 9; 8 5; 5 9; 8 5; 9 9];
date = 8;
user = 5;
% Count how many rows of A match the given date and user
nb_occurrences = sum(ismember(A, [date user], 'rows'));
ismember with the 'rows' option returns a logical vector marking each row of A that equals [date user]; summing it gives the count, even when other rows of A are repeated.
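If the counts are needed for every (date, user) pair rather than for one pair at a time, a single pass with unique and accumarray may be enough. This is a sketch under the same assumption as above, namely numeric data with the date in column 1 and the user in column 2. The names pairs, pair_counts, and result are illustrative.

```matlab
% Same numeric data as above: column 1 is the date, column 2 is the user
A = [1 9; 8 5; 5 9; 8 5; 9 9];

% pairs holds the distinct rows (sorted); idx maps each row of A to a row of pairs
[pairs, ~, idx] = unique(A, 'rows');

% Count how many rows of A fall into each distinct pair
pair_counts = accumarray(idx(:), 1);

% One row per distinct pair: [date user count]
result = [pairs pair_counts];
```

For this A, result is [1 9 1; 5 9 1; 8 5 2; 9 9 1], so the pair date 8, user 5 occurs twice.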
Upvotes: 0