Reputation: 679
I have a matrix with individuals with a household id in the first column and their age in the second column:
1 32
1 36
1 8
2 50
2 55
3 45
3 40
3 4
3 5
4 23
What I want to do is count the persons in each age class for a specific household id. Some example age classes are shown below:
0
1-2
3
4-6
7-10
etc
So if I search for household id 1, I want to find 1 in class 7-10, and when looking for household id 3, I want to find 2 in class 4-6. How can I do this?
Upvotes: 0
Views: 77
Reputation: 35525
histcounts computes the histogram counts, and you can pass edges to it (the edges are the limits of each bin).
Here is a full example that does this in two lines using your data and arbitrary bin edges. The only thing you need to know is that whenever you want a single number to be its own bin, you place the edges of that bin very close to the value on either side. In your case the values are integers, so -0.1 and +0.1 around the value is enough.
M=[1 32
1 36
1 8
2 50
2 55
3 45
3 40
3 4
3 5
4 23];
ranges=[-0.1 0.1 1 4 20 30 40]; % arbitrary example edges, not the classes from the question
% separate the matrix into cells with unique IDs
Mcell = arrayfun(@(x) M(M(:,1) == x, :), unique(M(:,1)), 'uniformoutput', false);
% Count bins in each cell
result=cellfun(@(x)(histcounts(x(:,2),ranges)),Mcell,'uniformoutput',false)
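As a rough sketch of how the same idea maps onto the age classes from the question (0, 1-2, 3, 4-6, 7-10): because the ages are integers, edges placed halfway between neighbouring classes give exactly one bin per class. The edge vector and the names edges, ids and counts are illustrative assumptions, not part of the answer above. Ages above 10 lie outside the last edge, so histcounts simply does not count them.

```matlab
M = [1 32; 1 36; 1 8; 2 50; 2 55; 3 45; 3 40; 3 4; 3 5; 4 23];

% Assumed edges: half-integer boundaries around the classes 0, 1-2, 3, 4-6, 7-10
edges = [-0.5 0.5 2.5 3.5 6.5 10.5];

ids = unique(M(:, 1));
% One row per household, one column per age class
counts = zeros(numel(ids), numel(edges) - 1);
for k = 1:numel(ids)
    % Ages of the current household, binned with the edges above
    counts(k, :) = histcounts(M(M(:, 1) == ids(k), 2), edges);
end

n_hh1 = counts(ids == 1, 5);   % household 1, class 7-10
n_hh3 = counts(ids == 3, 4);   % household 3, class 4-6
```

With the data from the question, n_hh1 is 1 and n_hh3 is 2, which are the two values the question asks for.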
Upvotes: 1
Reputation: 487
What you are looking for is surely a histogram. However, I could not find a built-in function in MATLAB that supports zero-width bins as in your question. Therefore, I wrote a short script that computes the desired histogram manually:
% the age data
% (first column: id
% second column: age)
mat_data = [1 32
1 36
1 8
2 50
2 55
3 45
3 40
3 4
3 5
4 23];
% the age classes
% (one class per row,
% lower bound in first column,
% upper bound in second column)
age_classes = [0, 0; ...
1, 2; ...
3, 3; ...
4, 6; ...
7, 10];
% determine all ids
ids = unique(mat_data(:, 1));
% initialize the output matrix
mat_counts = zeros(size(age_classes, 1), length(ids));
% count the number of entries for each id and age class
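for idx_id = 1 : length(ids)
    % ages of all persons in the current household
    cur_data = mat_data(mat_data(:, 1) == ids(idx_id), 2);
    % size(..., 1) counts the rows; length() would return the larger dimension
    for idx_age_class = 1 : size(age_classes, 1)
        cur_age_class = age_classes(idx_age_class, :);
        % both class bounds are inclusive
        mat_counts(idx_age_class, idx_id) = nnz(cur_data >= cur_age_class(1) & cur_data <= cur_age_class(2));
    end
end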
% plot everything
% create the x-axis labels
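% (named xtick_labels so that the built-in xticks function is not shadowed)
xtick_labels = cell(1, size(age_classes, 1));
for idx_age_class = 1 : size(age_classes, 1)
    if age_classes(idx_age_class, 1) == age_classes(idx_age_class, 2)
        % single-value class, e.g. '3'
        xtick_labels{idx_age_class} = num2str(age_classes(idx_age_class, 1));
    else
        % range class, e.g. '4-6'
        xtick_labels{idx_age_class} = sprintf('%.0f-%.0f', age_classes(idx_age_class, 1), age_classes(idx_age_class, 2));
    end
end
figure(1);
bar(mat_counts);
set(gca, 'xticklabel', xtick_labels);
% one legend entry per household id (also works for ids with several digits)
legend(arrayfun(@num2str, ids, 'UniformOutput', false))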
xlabel('age class')
ylabel('count')
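The counting loops in the script could also be written without explicit loops. The following is only a sketch under the assumption that implicit expansion is available (MATLAB R2016b or newer); the helper names ages, in_class and is_member are made up for illustration.

```matlab
mat_data = [1 32; 1 36; 1 8; 2 50; 2 55; 3 45; 3 40; 3 4; 3 5; 4 23];
age_classes = [0 0; 1 2; 3 3; 4 6; 7 10];

ids = unique(mat_data(:, 1));
ages = mat_data(:, 2);

% in_class(i, c) is true when person i falls into age class c (bounds inclusive)
in_class = ages >= age_classes(:, 1).' & ages <= age_classes(:, 2).';
% is_member(i, h) is true when person i belongs to household ids(h)
is_member = mat_data(:, 1) == ids.';

% mat_counts(c, h): number of persons of household ids(h) in age class c
mat_counts = double(in_class).' * double(is_member);
```

The result has the same layout as in the script, one row per age class and one column per household id, so the plotting code can be used unchanged.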
Does this solve your problem?
Upvotes: 1