sge

Reputation: 7640

Accumulate Array in Matlab

I have the following code snippet, which works but is incredibly slow:

bps = [5; 10; 15; 20]
src = ['10.0.0.1'; '10.0.0.2'; '10.0.0.1'; '10.0.0.2']

uniqueSrc = unique(src);

sumBps = [];
for i=1:length(uniqueSrc)
    indy=find(ismember(src,uniqueSrc(i)));
    sumBps = [sumBps; sum(bps(indy))];
end

uniqueSrc = ['10.0.0.1'; '10.0.0.2']
sumBps = [20; 30]

src is a cell array containing IP addresses, where one IP can occur multiple times. It is read from a file with textscan and %s. bps contains integers.

I need to sum up all integers in bps that belong to the same IP address in src. The matching is by index, so src(1) is the IP of bps(1) and so on.

The result should map each IP to the sum of its corresponding bps values, so uniqueSrc(1) is the IP whose total is sumBps(1).

While my code certainly works, it seems very inefficient, and I wonder what the famous MATLAB one-liner for this problem would be.

Thanks in advance!

Edit: Added example input and output.

Upvotes: 1

Views: 269

Answers (2)

rayryeng

Reputation: 104503

Classic use of accumarray:

%// Your inputs
bps = [5; 10; 15; 20];
src = ['10.0.0.1'; '10.0.0.2'; '10.0.0.1'; '10.0.0.2'];

%// Relevant code
[uniqueSrc,~,id] = unique(cellstr(src));
sumBps = accumarray(id, bps);

The above code probably deserves some explanation. accumarray is a function that bins or groups data together based on IDs. I first converted src to a cell array with cellstr, then used unique to get the list of unique IP addresses and, at the same time, an ID for each entry in src. The unique IP addresses are stored in uniqueSrc, the first output of unique, and the ID for each IP address in src is stored in id, the third output. An intricacy that not many people think about is that unique not only finds the unique entries in an array but also returns them sorted, and the IDs in id follow that sorted order. Since it looks like you want the IP addresses returned in order anyway, we don't need to worry about this part. Also, I needed to convert your IP addresses into a cell array for this to work.

Once we have these, we call accumarray, where the first argument is the ID for each IP address and the second argument is the value that belongs to it. By default, accumarray groups the values that share the same ID and sums them. That is exactly what you want to do.

The output I get for your desired variables is:

>> uniqueSrc

uniqueSrc = 

    '10.0.0.1'
    '10.0.0.2'

>> sumBps

sumBps =

    20
    30
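
To make the grouping described above concrete, here is a minimal sketch on the same example data. The names loopSum and maxBps are only illustrative and are not part of the code above. The loop shows what accumarray computes by default, and the last line assumes you might want a reduction other than a sum.

```matlab
% Same example data as in the question.
bps = [5; 10; 15; 20];
src = ['10.0.0.1'; '10.0.0.2'; '10.0.0.1'; '10.0.0.2'];

% id(k) is the position in uniqueSrc of the k-th entry of src.
[uniqueSrc, ~, id] = unique(cellstr(src));

% Default behaviour: sum all bps values that share the same id.
sumBps = accumarray(id, bps);

% Equivalent explicit loop over the groups (illustration only).
loopSum = zeros(numel(uniqueSrc), 1);
for k = 1:numel(uniqueSrc)
    loopSum(k) = sum(bps(id == k));
end

% A different reduction can be passed as the fourth argument.
maxBps = accumarray(id, bps, [], @max);
```

For this data, id comes out as [1; 2; 1; 2], so loopSum equals sumBps, and maxBps holds the largest bps value per IP.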

Upvotes: 4

Buck Thorn

Reputation: 5073

Adding my answer because I wanted to show how to concatenate the IP addresses and the summed bps. Also, you don't need to convert to a cell array if you use the 'rows' option with unique:

bps = [5; 10; 15; 20]
src = ['10.0.0.1'; '10.0.0.2'; '10.0.0.1'; '10.0.0.2'];

[strng, ii, jj] = unique(src, 'rows');  % jj maps each row of src to its row in strng


strcat( {strng},{char(32)*ones(size(strng,1),1)},{num2str(accumarray(jj,bps))})
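
As a hedged alternative for pairing each IP with its total, the sketch below builds one string per unique address with sprintf. The names ips, totals and pairs are hypothetical. It assumes src is a char matrix, which means every address has the same number of characters; for a cell array read with textscan, the 'rows' option would not apply as written.

```matlab
% Same example data as above.
bps = [5; 10; 15; 20];
src = ['10.0.0.1'; '10.0.0.2'; '10.0.0.1'; '10.0.0.2'];

% With 'rows', each row of the char matrix is treated as one entry.
[ips, ~, jj] = unique(src, 'rows');
totals = accumarray(jj, bps);

% Build one 'IP total' string per unique address.
pairs = cell(size(ips, 1), 1);
for k = 1:size(ips, 1)
    pairs{k} = sprintf('%s %d', ips(k, :), totals(k));
end
```

Keeping the result as a cell array of strings avoids padding issues when the totals have different numbers of digits.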

Upvotes: 1
