Reputation: 21449
I am new to MATLAB. It wasn't in the job description, and I've been forced to take over from the person who wrote and maintained the code my company uses. Life's tough.
The guy from whom I'm taking over told me that he declared all the big data vectors as global, to save memory. More specifically, the idea is that when one function calls another, no copy of the data is created when it is passed over.
Is this true? I read Strategies for Efficient Use of Memory, and it says:
When working with large data sets, be aware that MATLAB makes a temporary copy of an input variable if the called function modifies its value. This temporarily doubles the memory required to store the array, which causes MATLAB to generate an error if sufficient memory is not available.
It says something very similar in Memory Allocation For Array #Function Arguments:
When you pass a variable to a function, you are actually passing a reference to the data that the variable represents. As long as the input data is not modified by the function being called, the variable in the calling function and the variable in the called function point to the same location in memory. If the called function modifies the value of the input data, then MATLAB makes a copy of the original array in a new location in memory, updates that copy with the modified value, and points the input variable in the called function to this new array.
So is it true that using global can be better? It seems a little sloppy to blithely declare all the large data as global, instead of making sure that none of the code modifies its input argument. Am I wrong? Does this really improve RAM usage?
Upvotes: 10
Views: 4287
Reputation: 125854
This answer may be somewhat tangential, but an additional topic that bears mention here is the use of nested functions to manage memory.
As has already been established in other answers, there is no need for global variables if the data you are passing to the function is not modified (since it will be passed by reference). If it is modified (and is thus passed by value), using a global variable instead will save you memory. However, global variables can be somewhat "uncouth" for the following reasons:
- You have to put a declaration like global varName everywhere you need them.
- A user can wipe them out with clear global, which clears all global variables.
An alternative to global variables was mentioned in the first set of documentation you cited: nested functions. Immediately following the quote you cited is a code example (which I've formatted slightly differently here):
function myfun
    A = magic(500);
    setrowval(400, 0);
    disp('The new value of A(399:401,1:10) is')
    A(399:401,1:10)

    function setrowval(row, value)
        A(row,:) = value;
    end
end
In this example, the function setrowval is nested inside the function myfun. The variable A in the workspace of myfun is accessible within setrowval (as if it had been declared global in each). The nested function modifies this shared variable, thus avoiding any additional memory allocation. You don't have to worry about the user inadvertently clearing anything and (in my opinion) it's a bit cleaner and easier to follow than declaring global variables.
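The same pattern can be sketched for a data-processing function. This is only an illustration under assumptions: process_data, scale, clip and the vector size are made-up names and values, not anything from the documentation or from your code base.

```matlab
function out = process_data
    % Hypothetical pipeline; names and sizes are illustrative assumptions.
    data = (1:1e6)';              % stands in for a large data vector

    scale(2);                     % each nested call works on the shared variable
    clip(1e6);

    out = [min(data), max(data)]; % summary of the shared vector after both steps

    function scale(factor)
        % data belongs to the workspace of process_data; it is not an argument
        data = data * factor;
    end

    function clip(limit)
        % indexed assignment into the shared variable
        data(data > limit) = limit;
    end
end
```

Calling process_data returns [2 1000000]. Nothing is passed between the functions, so there is no input argument to copy, and nothing lives in the global workspace.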
Upvotes: 4
Reputation: 5714
I think you pretty much answered your own question, but a couple more references would be good here:
I made a video on this:
http://blogs.mathworks.com/videos/2008/09/16/new-location-and-memory-allocation/
Similar to what Loren spoke of here:
http://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/
-Dogu
Upvotes: 3
Reputation: 2006
The solution seems a bit strange to me. As you found out already, it shouldn't have a significant impact on memory usage if the called function does not modify the data array. However, if the called function does modify the data array, there is a functional difference. In one case (making the data array global), the change affects the rest of the code. In the other case (passing it as an argument), the modification is local to the called function and is discarded unless the function returns the array.
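A minimal sketch of that functional difference, with made-up names (scope_demo, bumpGlobal, bumpLocal, G, L are all hypothetical):

```matlab
function [g, l] = scope_demo
    % Hypothetical names throughout; a sketch of the behavioural difference.
    global G
    G = zeros(1, 5);
    L = zeros(1, 5);

    bumpGlobal();     % writes to the shared global
    bumpLocal(L);     % writes to the function's own copy, which is then discarded

    g = G(1);         % 1: the change is visible in the caller
    l = L(1);         % 0: the caller's array is unchanged
end

function bumpGlobal()
    global G
    G(1) = 1;
end

function bumpLocal(x)
    x(1) = 1;         % only the local x changes; nothing is returned
end
```

So swapping an argument for a global is not just a memory tweak: any function that writes to the data changes what every other function sees afterwards.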
Upvotes: 3
Reputation: 4532
In my experience, provided that none of the code modifies the large data, memory usage is the same regardless of whether you use a global variable or an input argument, just as the MATLAB docs say. Further information is in this blog post by a MathWorks employee.
There is quite a bit of folklore about performance issues in MATLAB, and not all of it is right. The internals of MATLAB have changed quite a bit over time, so it may be that in a previous version it was better to use a global variable.
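If you would rather check on your own version than trust folklore, a rough sketch like the one below may help. The function names and the array size are my own assumptions, and the timings are only a hint: a modifying call is typically slower because of the copy, but the numbers vary between machines and releases.

```matlab
function [tRead, tWrite, unchanged] = copy_check
    % Hypothetical check; function names and the array size are assumptions.
    data = rand(1, 1e7);                    % about 80 MB of doubles
    first = data(1);

    tic; readOnly(data);   tRead  = toc;    % input is only read
    tic; writeFirst(data); tWrite = toc;    % input is written to inside the function

    unchanged = (data(1) == first);         % the caller's array is never altered
    fprintf('read-only: %.4f s, modifying: %.4f s\n', tRead, tWrite);
end

function s = readOnly(x)
    s = sum(x);                             % reads x, never assigns to it
end

function x = writeFirst(x)
    x(1) = -1;                              % first write to the input
end
```

Whatever the timings show, unchanged comes back true, which is the part the documentation guarantees: a called function never alters the caller's copy of an input argument.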
Upvotes: 6