anatolyg

Reputation: 28278

How to make a "reference" to a struct in Matlab?

In my program, I have a large (e.g. 100x100) array of structs, each struct having a fair amount of data (e.g. 1000 numbers, and some other fields). For example:

for x = 100 : -1 : 1
    for y = 100 : -1 : 1
        database(y,x).data = rand(30);
        database(y,x).name = sprintf('my %d %d', x, y);
    end
end

I would like to do a computation of 10-20 lines of code with my data; for example:

result = 0;
for x = 10 : 90
    for y = 10 : 90
        for dx = -9 : 9
            for dy = -9 : 9
                result = result + database(y + dy, x + dx).data(1, 1);
                result = result + 2 * database(y + dy, x + dx).data(1, 2) * database(y + dy, x + dx).data(2, 2);
                ... % more stuff here
            end
        end
    end
end

My code refers to the current element of the database as database(y + dy, x + dx). To make it shorter, I give it a name (C++ would call it a "reference"):

temp = database(y + dy, x + dx);
result = result + temp.data(1, 1);
result = result + 2 * temp.data(1, 2) * temp.data(2, 2);

This makes my code much shorter and clearer. However, this is also much slower, and profiling shows that the assignment temp = ... takes 70% of my execution time.

So my assumption is that Matlab copies the contents of the largish database element, eating my time. I think Matlab should be smart enough to do "copy-on-write", that is, copy the stuff only when it is changed later. However, this is not what happens in my case - my code only reads from the database, and doesn't change it.

So, how can I make an efficient read-only reference to a struct?

Upvotes: 7

Views: 377

Answers (1)

Dennis Jaheruddin

Reputation: 21563

Well, there is definitely copying going on when you do:

temp = database(y + dy, x + dx) 

This could be reduced perhaps by using:

temp = database(y + dy, x + dx).data

But obviously that would only work if you were just interested in the data in this part of the code.

That said, I am not sure you can work around it without restructuring your data in some inconvenient way. First, benchmark your code after replacing every temp with database(y + dy, x + dx), to confirm that avoiding the copy really helps. If it does, you could try passing database(y + dy, x + dx) to a subfunction, since arguments that a function only reads are typically not copied. However, I am not sure whether this also applies to parts of variables.
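The benchmark suggested above could look roughly like the sketch below. It is only an illustration: the database is shrunk to 40x40 to keep the run short, the names n, resultA, resultB, resultC, tA, tB and tC are made up for this example, and tic/toc gives coarse timings that will differ between machines and Matlab versions.

```matlab
% Build a small test database (40x40 here; the size is an assumption
% chosen only to keep the benchmark quick).
n = 40;
for x = n : -1 : 1
    for y = n : -1 : 1
        database(y,x).data = rand(30);
        database(y,x).name = sprintf('my %d %d', x, y);
    end
end

% Variant A: name the whole struct element first.
tic;
resultA = 0;
for x = 1 : n
    for y = 1 : n
        temp = database(y, x);
        resultA = resultA + temp.data(1, 1);
        resultA = resultA + 2 * temp.data(1, 2) * temp.data(2, 2);
    end
end
tA = toc;

% Variant B: index into the database directly every time.
tic;
resultB = 0;
for x = 1 : n
    for y = 1 : n
        resultB = resultB + database(y, x).data(1, 1);
        resultB = resultB + 2 * database(y, x).data(1, 2) * database(y, x).data(2, 2);
    end
end
tB = toc;

% Variant C: name only the numeric field that is actually needed.
tic;
resultC = 0;
for x = 1 : n
    for y = 1 : n
        d = database(y, x).data;
        resultC = resultC + d(1, 1);
        resultC = resultC + 2 * d(1, 2) * d(2, 2);
    end
end
tC = toc;

fprintf('struct temp: %.4f s, direct indexing: %.4f s, data temp: %.4f s\n', tA, tB, tC);
```

All three variants compute the same number, so any difference in the timings comes from how the element is accessed, not from the arithmetic.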

If none of the above helps, consider some of the oldest advice in the book:

For efficient calculations on big chunks of data, consider using matrices.
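As a rough sketch of what that could mean for the example in the question: pull the few numbers the loop actually uses out of the struct array into plain 100x100 matrices once, then do the windowed sums on those. The names d11, d12, d22, term, r and window are made up for this illustration, and it assumes that only data(1,1), data(1,2) and data(2,2) are needed, which may not hold for the real "more stuff" in the loop.

```matlab
% Rebuild the example database from the question (only the data field).
n = 100;
for x = n : -1 : 1
    for y = n : -1 : 1
        database(y,x).data = rand(30);
    end
end

% Extract the needed entries once; d11(y,x) equals database(y,x).data(1,1).
d11 = arrayfun(@(s) s.data(1, 1), database);
d12 = arrayfun(@(s) s.data(1, 2), database);
d22 = arrayfun(@(s) s.data(2, 2), database);

% Per-element contribution to the result, as a plain numeric matrix.
term = d11 + 2 * d12 .* d22;

% Same loop bounds as in the question, but the two inner loops over
% dx and dy are replaced by a sum over a 19x19 block of the matrix.
r = 9;
result = 0;
for x = 10 : 90
    for y = 10 : 90
        window = term(y - r : y + r, x - r : x + r);
        result = result + sum(window(:));
    end
end
```

The struct array is touched only during the extraction step; everything after that works on ordinary matrices, so no struct elements are copied inside the hot loop.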

Upvotes: 1
