Reputation: 661
I have a data table with the following format representing relations strength between countries for many categories of relations:
Country1 Country2 Value Category
A A 4 1
A B 2 1
A C 9 1
B A 3 2
B D 4 1
C A 2 2
D C 7 2
...
Now I would like to sum all mutual relations (e.g. A-B and B-A; D-C and C-D, etc...) for each category (A-B and B-A needs to be "merged").
What could be a concise and "very R" solution to do it? Is there any existing function that could do this?
For now, I've set a key with both "Country1" and "Country2" columns but didn't find what I could do next for matching corresponding rows..
Thanks for any clue.
Upvotes: 1
Views: 1242
Reputation: 3656
# x = your data as data.table
x[, rel :=paste(fromCountry, toCountry, sep = "-")]
x[, .(sums = sum(Value)), by = rel]
# if fromCountry - toCountry is considered to be bidirectional, then make sets:
library(Kmisc)
x[, sets := Kmisc::str_sort(rel)]
x[, .(sum = sum(Value)), by = sets]
Upvotes: 0
Reputation: 118779
Use pmin
and pmax
..
require(data.table) # v1.9.6
dt = fread("Country1 Country2 Value Category
A A 4 1
A B 2 1
A C 9 1
B A 3 2
B D 4 1
C A 2 2
D C 7 2")
dt[, .(total = sum(Value)),
by=.(Country1 = pmin(Country1, Country2),
Country2 = pmax(Country1, Country2))]
# Country1 Country2 total
# 1: A A 4
# 2: A B 5
# 3: A C 11
# 4: B D 4
# 5: C D 7
If you want this within Category
, just add it as well to by
.
Upvotes: 7
Reputation: 14842
How you "sum all mutual bipartite relations" depends on what you want to do.
To count the number of relations in each category:
x[, .N, by=Category]
To sum up the value
for all relationships in each category:
x[, sum(Value), by=Category]
or this for prettier output:
x[, list(TotalValue = sum(Value)), by=Category]
Upvotes: 1