Star

Reputation: 2299

Memory-speed issues when doing a scatter plot in Matlab

I have the following memory-speed problem in Matlab and I would like your help to understand whether there may be a solution.

Consider the following 4 big row vectors X1, X2, Y1, Y2.

clear 
rng default
P=10^8;
X1=rand(1,P)*5;
X2=rand(1,P)*5;
Y1=rand(1,P)*5;
Y2=rand(1,P)*5;

What I would like to do is a scatter plot where the x-coordinate is the sum of every possible pair of elements X1(i), X2(j) and the y-coordinate is the sum of the corresponding pair Y1(i), Y2(j).

I post here three options I thought about that do not work mainly because of memory and speed issues.

Option 1 (issues: too slow when doing the loop, out of memory when doing vertcat)

Xtemp=cell(P,1);
Ytemp=cell(P,1);
for i=1:P
    tic
    Xtemp{i}=X1(i)+X2(:);
    Ytemp{i}=Y1(i)+Y2(:);
    toc
end
X=vertcat(Xtemp{:}); 
Y=vertcat(Ytemp{:});
scatter(X,Y)

Option 2 (issues: too slow when doing the loop, time increasing as the loop proceeds, Matlab going crazy and unable to produce the scatter even if I stop the loop after 5 iterations)

for i=1:P
    tic
    scatter(X1(i)+X2(:), Y1(i)+Y2(:))
    hold on 
    toc
end

Option 3 (sort of giving up) (issues: as I increase T, the scatter gets closer and closer to a square, which is correct. I am wondering, though, whether this happens only because I generated the data with rand and draw the subsample in option 3 with randi; with my real data the scatter may not "converge" to the true plot as I increase T. Also, what are the "optimal" T and R?)

T=20;
R=500;
for t=1:T
    tic
    %select R points at random from X1,X2,Y1,Y2
    %(indices drawn from 1..P; randi(R,R,1) would only ever pick among the first R elements)
    X1sel=(X1(randi(P,R,1)));
    X2sel=(X2(randi(P,R,1)));
    Y1sel=(Y1(randi(P,R,1)));
    Y2sel=(Y2(randi(P,R,1)));
    %do option 1 among those points and plot
    Xtempsel=cell(R,1);
    Ytempsel=cell(R,1);
    for r=1:R
        Xtempsel{r}=X1sel(r)+X2sel(:);
        Ytempsel{r}=Y1sel(r)+Y2sel(:);
    end
    Xsel=vertcat(Xtempsel{:}); 
    Ysel=vertcat(Ytempsel{:});
    scatter(Xsel,Ysel, 'b', 'filled')
    hold on
    toc
end

Is there a way to do what I want, or is it simply impossible?

Upvotes: 1

Views: 62

Answers (1)

Brice

Reputation: 1580

You are trying to build a vector with P^2 elements, i.e. 10^16. This is many orders of magnitude more than what would fit into the memory of a standard computer (10 GB is 10^10 bytes, or about 1.25 billion double-precision floats).
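As a rough check of the arithmetic above, the following sketch computes the storage that one such vector would need, assuming 8 bytes per double. The variable names are illustrative only.

P = 1e8;                               % length of each input vector, as in the question
nPairs = P^2;                          % number of (i,j) pairs: 1e16
bytesPerDouble = 8;                    % a MATLAB double occupies 8 bytes
bytesNeeded = nPairs*bytesPerDouble;   % 8e16 bytes for X alone, the same again for Y
fprintf('One vector of pair sums needs about %.0f PB\n', bytesNeeded/1e15);
doublesIn10GB = 10e9/bytesPerDouble;   % how many doubles fit in 10 GB: 1.25e9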

For smaller vectors (i.e. P<1e4), try:

Xsum=bsxfun(@plus,X1,X2.'); %Matrix with the sum of any two elements from X1 and X2
X=Xsum(:);                  %Reshape to vector
Ysum=bsxfun(@plus,Y1,Y2.'); %Same for Y1 and Y2
Y=Ysum(:);
plot(X,Y,'.') %Plot as small dots, likely to take forever if there are too many points
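A small sanity check, with a deliberately tiny P, that the broadcast sum pairs the points the same way as the loop in Option 1 of the question. This is only a sketch: Psmall, Xloop, Yloop and idx are illustrative names, and on MATLAB R2016b or later X1 + X2.' should give the same matrix as the bsxfun call.

rng default
Psmall = 50;                          % tiny size so the Psmall^2 pairs fit easily
X1 = rand(1,Psmall)*5;  X2 = rand(1,Psmall)*5;
Y1 = rand(1,Psmall)*5;  Y2 = rand(1,Psmall)*5;

% Broadcast version: entry (j,i) holds X1(i)+X2(j)
Xsum = bsxfun(@plus, X1, X2.');
Ysum = bsxfun(@plus, Y1, Y2.');
X = Xsum(:);  Y = Ysum(:);

% Loop version of Option 1, preallocated instead of using cells
Xloop = zeros(Psmall^2,1);  Yloop = zeros(Psmall^2,1);
for i = 1:Psmall
    idx = (i-1)*Psmall + (1:Psmall);  % block of entries belonging to X1(i)
    Xloop(idx) = X1(i) + X2(:);
    Yloop(idx) = Y1(i) + Y2(:);
end
assert(isequal(X, Xloop) && isequal(Y, Yloop))  % same points, same order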

To build a figure with a more reasonable number of pairs picked randomly from these large vectors:

Npick=1e4;
sel1=randi(P,[Npick,1]);
sel2=randi(P,[Npick,1]);
Xsel=X1(sel1)+X2(sel2);
Ysel=Y1(sel1)+Y2(sel2);
plot(Xsel,Ysel,'.');     %Plot as small dots
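If many more random pairs are wanted than can comfortably be drawn as individual dots, one possible variation (not part of the approach above, just a sketch) is to bin the sampled pairs into a 2-D histogram and display the counts as an image. It assumes histcounts2 is available (R2015b or later) and uses a smaller P than the question so that it runs quickly; the bin count and the names edges, centers and N are arbitrary choices.

rng default
P = 1e6;                              % smaller than the question's 1e8, for a quick run
X1 = rand(1,P)*5;  X2 = rand(1,P)*5;
Y1 = rand(1,P)*5;  Y2 = rand(1,P)*5;

Npick = 1e6;                          % number of random (i,j) pairs to draw
sel1 = randi(P,[Npick,1]);            % index i, shared by X1 and Y1
sel2 = randi(P,[Npick,1]);            % index j, shared by X2 and Y2
Xsel = X1(sel1) + X2(sel2);
Ysel = Y1(sel1) + Y2(sel2);

edges = linspace(0,10,201);           % 200 bins per axis; the sums lie between 0 and 10
N = histcounts2(Xsel, Ysel, edges, edges);    % N(a,b): count in x-bin a and y-bin b
centers = (edges(1:end-1) + edges(2:end))/2;  % bin centres for the axis labels
imagesc(centers, centers, N.');       % transpose so that x runs along the horizontal axis
axis xy; colorbar

With real data the edges would have to be chosen from the actual range of the sums rather than the fixed 0 to 10 used here.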

Upvotes: 2
