Reputation: 339
I (mostly) have a prototype script that does what I want, but I'm not a programmer (yet), and what I wrote is very cumbersome. I could use some help turning this into something that scales beyond 10 bins (see below). While we're at it, I would also love to know how to assign a different color to each series.
Briefly, I've got an (n,2) matrix (where n is 20,000 to 40,000) that holds data for two variables. Typically, I make a scatterplot (or density plot) with each variable on an axis. Now, I want to slice up the data (err, divide the data into bins) along the x axis and plot a histogram of the y values in each bin. I then plot the histograms for all of the bins on the same plot (preferably in different colors) to see more clearly how the distributions change as x changes.
NOTE: 1) the data is set on a log scale, hence logspace bins. 2) for the sake of argument, pretend that logicleHist is just a regular hist function.
EXAMPLE
%DensPlot Slicer
data=[BFP GFP];
dp_bins=10;
dp_bounds=logspace(1,5,dp_bins);
%bins
b1=data(data(:,1) >= dp_bounds(1) & data(:,1) < dp_bounds(2),:);
b2=data(data(:,1) >= dp_bounds(2) & data(:,1) < dp_bounds(3),:);
b3=data(data(:,1) >= dp_bounds(3) & data(:,1) < dp_bounds(4),:);
b4=data(data(:,1) >= dp_bounds(4) & data(:,1) < dp_bounds(5),:);
b5=data(data(:,1) >= dp_bounds(5) & data(:,1) < dp_bounds(6),:);
b6=data(data(:,1) >= dp_bounds(6) & data(:,1) < dp_bounds(7),:);
b7=data(data(:,1) >= dp_bounds(7) & data(:,1) < dp_bounds(8),:);
b8=data(data(:,1) >= dp_bounds(8) & data(:,1) < dp_bounds(9),:);
b9=data(data(:,1) >= dp_bounds(9) & data(:,1) < dp_bounds(10),:);
figure;
hold on
logicleHist(b1(:,2));
logicleHist(b2(:,2));
logicleHist(b3(:,2));
logicleHist(b4(:,2));
logicleHist(b5(:,2));
logicleHist(b6(:,2));
logicleHist(b7(:,2));
logicleHist(b8(:,2));
logicleHist(b9(:,2));
Suggestions? Thanks!
Upvotes: 1
Views: 544
Reputation: 1362
If I understood your question right, you want to histogram the y values (data(:,2)) that correspond to 10 bins of x (data(:,1)). Please see the code below, and refer to the comments in the code and to SO for further explanation.
% The following data are made up so that the code is self-contained; replace them
% with your own data and bounds.
x_exp=rand(100,1); % exponents of the x values
data(:,1)=10.^x_exp; % log-distributed x values between 1 and 10
data(:,2)=rand(100,1);
dp_bounds=logspace(min(x_exp),max(x_exp),10); % bin edges spanning the x data
figure('Position',[10 10 800 750],'Color','w');
bar_color=colormap;
bar_color=bar_color(round(linspace(1,size(bar_color,1),numel(dp_bounds))),:); % Select colors per bar; round keeps the row indices integer
for ii=1:numel(dp_bounds)-1
sel_data=data(data(:,1) >= dp_bounds(ii) & data(:,1) < dp_bounds(ii+1),2);
subplot(numel(dp_bounds)-1,1,ii);
[h,bins_y]=hist(sel_data);
bar(bins_y,h,'FaceColor', bar_color(ii,:)); % Bar plot with y histograms (auto bins for y)
title(['x from ',num2str(dp_bounds(ii)),' to ',num2str(dp_bounds(ii+1))],'FontSize', 12)
end
If you copy and paste the code above to the Matlab prompt, you should see something similar to the following figure.
Update: the code above was tested on Matlab 2010. If you are using R2014b or later, where histogram is available, you can replace:
[h,bins_y]=hist(sel_data);
bar(bins_y,h,'FaceColor', bar_color(ii,:));
with
histogram(sel_data,'FaceColor', bar_color(ii,:))
(note the lack of a semi-colon) as observed in another solution.
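For the overlaid, color-coded plot the question asks for, a minimal sketch might look like the following. It assumes R2014b or later (for histogram), uses made-up data in place of BFP and GFP, takes log10 of y as a stand-in for logicleHist, and draws its colors from jet. All of those are illustrative choices, not part of the original code.

```matlab
% Synthetic stand-in data (assumption): x log-distributed over 1e1..1e5,
% y drifting with x so the per-bin histograms visibly shift.
rng(0);
n = 20000;
x = 10.^(1 + 4*rand(n,1));
y = x .* 10.^(0.3*randn(n,1));
data = [x y];

dp_bins   = 10;
dp_bounds = logspace(1,5,dp_bins+1);   % dp_bins+1 edges give dp_bins bins
colors    = jet(dp_bins);              % one RGB row per x bin
y_edges   = linspace(0,7,50);          % shared edges in log10(y) so bars line up

figure; hold on
counts = zeros(dp_bins,1);
for ii = 1:dp_bins
    % rows whose x value falls in the ii-th bin
    in_bin = data(:,1) >= dp_bounds(ii) & data(:,1) < dp_bounds(ii+1);
    counts(ii) = nnz(in_bin);
    % semi-transparent bars so overlapping histograms stay visible
    histogram(log10(data(in_bin,2)), y_edges, ...
        'FaceColor', colors(ii,:), 'FaceAlpha', 0.4, 'EdgeColor', 'none');
end
hold off
xlabel('log10(y)'); ylabel('count');
```

Passing the same y_edges to every call keeps the bars aligned across bins, which makes the shift between distributions easier to compare than with per-bin automatic edges.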
Upvotes: 1
Reputation: 1182
The first step might be to use a for loop. Replace everything in your code after
%bins
with
figure
hold on
for i = 1:(dp_bins-1)
b = data(data(:,1)>=dp_bounds(i) & data(:,1)<dp_bounds(i+1),:);
hist(b(:,2))
end
where b is playing the role of your b1, b2, ... in turn. Note that histogram is the currently used function in the latest release of Matlab. I only have hist myself.
Note that you can select the second column directly when assigning b, in a single statement. I would normally write
b = data(data(:,1)>=dp_bounds(i) & data(:,1)<dp_bounds(i+1),2);
histogram(b)
If you want to overlay so many histograms, I think the plot will get very hard to read no matter what you do with the colors. It's also quite difficult to control the histogram colors with hist. I'd suggest using stem plots, rather than histograms, for each of the b's. This would require another manual binning step over each b, which you could accomplish with a nested for loop.
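The stem-plot idea with a nested for loop could be sketched roughly as follows. The data are made up, the y values are binned in log10 space, and the names y_edges, y_centers and colors are hypothetical, chosen here only for illustration.

```matlab
% Synthetic stand-in data (assumption), replace with your own [BFP GFP].
rng(0);
n = 5000;
x = 10.^(1 + 4*rand(n,1));
y = x .* 10.^(0.3*randn(n,1));
data = [x y];

dp_bins   = 5;
dp_bounds = logspace(1,5,dp_bins+1);          % edges of the x bins
y_edges   = linspace(0,7,29);                 % edges of the y bins, in log10(y)
y_centers = (y_edges(1:end-1) + y_edges(2:end))/2;
colors    = jet(dp_bins);                     % one color per x bin
counts    = zeros(dp_bins, numel(y_centers));

figure; hold on
for ii = 1:dp_bins
    % y values (log10) of the rows whose x falls in the ii-th bin
    b = log10(data(data(:,1) >= dp_bounds(ii) & data(:,1) < dp_bounds(ii+1), 2));
    % manual binning of b over the y edges
    for jj = 1:numel(y_centers)
        counts(ii,jj) = sum(b >= y_edges(jj) & b < y_edges(jj+1));
    end
    stem(y_centers, counts(ii,:), 'Color', colors(ii,:), 'Marker', '.');
end
hold off
xlabel('log10(y)'); ylabel('count');
```

Because each series is drawn as thin stems rather than filled bars, several series can share one axes without hiding each other, and the color of each is set directly through the 'Color' property.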
Upvotes: 1