shamalaia

Reputation: 2351

MATLAB code slowing down while loading NetCDF data

The code:

I have to make some plots of the data contained in a big NetCDF file with size 1080*171*52*120 (longitude,latitude,depth,time). My strategy is to load the data in 12 chunks along the first dimension and 36 along the fourth (see the script at the end).

So, I have a for loop on the 12 spatial chunks that contains a for loop on the 36 time chunks. At each iteration on the time chunks, I store the information that I need in a matrix V and then do the plot.

The problem:

The code slows down after a few iterations. The iteration at which this happens is not constant; typically it is the 5th or the 8th. The slowdown does not seem to be a memory problem, because the memory allocated is constant after the first iteration. The problem seems to come from loading the data (via ncread). However, the size of the data that I load is the same for all iterations...

Here is the output of memory before the slowdown, together with the elapsed time for loading the files:

Maximum possible array: 23186 MB (2.431e+10 bytes) *
Memory available for all arrays: 23186 MB (2.431e+10 bytes) *
Memory used by MATLAB: 1226 MB (1.286e+09 bytes)
Physical Memory (RAM): 16296 MB (1.709e+10 bytes)

* Limited by System Memory (physical + swap file) available.
Elapsed time is 0.384304 seconds.

And here is the same output after the slowdown:

Maximum possible array: 23186 MB (2.431e+10 bytes) *
Memory available for all arrays: 23186 MB (2.431e+10 bytes) *
Memory used by MATLAB: 1226 MB (1.286e+09 bytes)
Physical Memory (RAM): 16296 MB (1.709e+10 bytes)

* Limited by System Memory (physical + swap file) available.
Elapsed time is 26.155781 seconds.

To confuse things further...

Every time chunk is made of 5 time points. The indices of these time points are stored 5 by 5 in each cell of a cell array (new_inds) that has 120/5 elements. However, if I limit the length of this cell array to 8 (or 8*5 time points) the code does not slow down.

In fact, the loading time in this last case is actually constant throughout the execution:

Maximum possible array: 23068 MB (2.419e+10 bytes) *
Memory available for all arrays: 23068 MB (2.419e+10 bytes) *
Memory used by MATLAB: 1231 MB (1.291e+09 bytes)
Physical Memory (RAM): 16296 MB (1.709e+10 bytes)

* Limited by System Memory (physical + swap file) available.
Elapsed time is 0.390843 seconds.

Please note that in this last case the size of the chunks to be loaded is the same as before. So, even though I obviously expected the code to run faster overall (as there are fewer time points in total), I do not understand why the loading works well in this last case.
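For reference, the time-index cell array described above could be built along the following lines. This is only a sketch: the actual construction of new_inds is not shown in this post, so the names nT, chunkLen, new_inds_short, tStart and tCount are illustrative assumptions.

```matlab
% Hypothetical reconstruction: the original construction of new_inds is not shown.
nT = 120;        % total number of time points (fourth dimension of the file)
chunkLen = 5;    % time points per chunk

% Split 1:nT into consecutive groups of chunkLen indices, one group per cell.
new_inds = mat2cell(1:nT, 1, repmat(chunkLen, 1, nT/chunkLen));

% The shortened case: keep only the first 8 chunks (8*5 = 40 time points).
new_inds_short = new_inds(1:8);

% Start index and count along time, as passed to ncread for chunk k.
k = 3;
tStart = new_inds{k}(1);
tCount = new_inds{k}(end) - new_inds{k}(1) + 1;
```

In both the full and the shortened case the count along time is 5, which is why the amount of data requested per ncread call is identical.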

The question:

The only thing that changes between the two cases is the length of the cell array containing the time indices. But I really doubt that this can be the reason for the slowdown, especially since in the first case MATLAB gets slow after a variable number of iterations.

Can anyone explain why this happens?

Script:

TT=length(new_inds); %number of time chunks to loop over

%%%plotting by chunk
figure;hold on

for kkk=1:numel(chunks)-1   %iteration on space
    V=0;   %matrix to update after each iteration on time

    for ttt=1:TT   %iteration on time

        fprintf('LOADING -%d/%d- STARTED!\r',ttt,TT)

        s=chunks(kkk+1)-chunks(kkk) %size in space of the current chunk (printed)
        t=new_inds{ttt}(end)-new_inds{ttt}(1)+1  %size in time of the current chunk (printed)
        memory
        tic
        %load the velocities on the current chunk
        Uvel=ncread(uvelnc,'Uvel',[chunks(kkk) 1 1 new_inds{ttt}(1)],...
            [chunks(kkk+1)-chunks(kkk)+2 y2 Inf new_inds{ttt}(end)-new_inds{ttt}(1)+1]);

        Vvel=ncread(vvelnc,'Vvel',[chunks(kkk) 1 1 new_inds{ttt}(1)],...
            [chunks(kkk+1)-chunks(kkk)+2 y2 Inf new_inds{ttt}(end)-new_inds{ttt}(1)+1]);

        %replace the missing-value flag with NaN
        Uvel(Uvel==-9999)=NaN;
        Vvel(Vvel==-9999)=NaN;
        toc

        fprintf('LOADING -%d/%d- DONE!\r',ttt,TT)

        %%%velocity magnitude
        %sum over the time points of the current chunk, ignoring NaNs
        Uvel= nansum(Uvel,4);
        Vvel= nansum(Vvel,4);

        %interpolate Vvel from the v-grid onto the u-grid
        [Xv,Yv,Zv]=ndgrid(lonMv(chunks(kkk):chunks(kkk+1)-1+2),latMv(1,:),uv_levs);
        [Xu,Yu,Zu]=ndgrid(lonMu(chunks(kkk):chunks(kkk+1)-1+2),latMu(1,:),uv_levs);
        Vvel=interpn(Xv,Yv,Zv,Vvel,Xu,Yu,Zu,'linear',NaN);

        V = V + sqrt(Uvel.^2+Vvel.^2); %update the matrix V
    end %of the iteration on time

    V_t=V/sum(cellfun('length',new_inds)); %average V on time

    %%%plotting
    V_t=V_t(:,:,1); %surface values

    lonP=lonMu(chunks(kkk):chunks(kkk+1)-1+2,:);
    lonP(V_t==0)=NaN;
    latP=latMu(chunks(kkk):chunks(kkk+1)-1+2,:);
    m_pcolor(lonP,latP,V_t);

end %iteration on space

Upvotes: 2

Views: 328

Answers (1)

shun_3568

Reputation: 11

Using the low-level function netcdf.getVar might help.
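A minimal sketch of what that could look like, assuming the idea is to open the file once and reuse the handle instead of calling ncread on every iteration. The small temporary file, its sizes, and the names fname, data, start and count are made up for illustration so that the example runs on its own. Whether this actually removes the slowdown in the question has not been tested.

```matlab
% Build a small throwaway NetCDF file so the sketch is self-contained.
% (Hypothetical stand-in for the real file; sizes are deliberately tiny.)
fname = [tempname, '.nc'];
data  = reshape(1:(6*4*3*10), [6 4 3 10]);
nccreate(fname, 'Uvel', 'Dimensions', {'lon', 6, 'lat', 4, 'depth', 3, 'time', 10});
ncwrite(fname, 'Uvel', data);

% Open the file once, outside the loops, and reuse the handles.
ncid  = netcdf.open(fname, 'NOWRITE');
varid = netcdf.inqVarID(ncid, 'Uvel');

for t0 = 1:5:10                  % one-based start of each 5-point time chunk
    start = [0 0 0 t0-1];        % netcdf.getVar expects ZERO-based start indices
    count = [6 4 3 5];           % counts along lon, lat, depth, time
    U = netcdf.getVar(ncid, varid, start, count, 'double');
    U(U == -9999) = NaN;         % missing-value flag used in the question's script
    % ... accumulate / process U here ...
end

netcdf.close(ncid);
delete(fname);                   % remove the temporary file
```

Two differences from ncread are worth keeping in mind: the start indices are zero-based, and netcdf.getVar returns the raw stored values, so any _FillValue, scale_factor or add_offset attributes would have to be handled by hand.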

Upvotes: 1
