Reputation: 129
I have this fata frame Xi, I need to make a new one which reflects the following operation: the trade_value_usd
in Xi (grouped by reporter and commodity, as in Xi) / the sum of trade_value_usd
of all commodities, grouped by reporter
> head(Xi)
reporter commodity trade_value_usd
1 Algeria Live animals 11075
2 Andorra Live animals 120732
3 Argentina Live animals 8057242
4 Australia Live animals 634849604
5 Austria Live animals 60349499
6 Bahrain Live animals 280601
I tried
SH <- Xi %>% group_by(reporter, commodity) %>% summarize(SH = trade_value_usd) / sum(trade_value_usd))
Final result should look something like this:
> SH
# A tibble: 168 x 2
reporter SH
<chr> <dbl>
1 Albania 0.6
2 Algeria 0.2
3 Andorra 0.5
4 Anguilla 0.9
5 Argentina 0.1
6 Armenia 0.8
Upvotes: 1
Views: 70
Reputation: 363
library(tidyverse)
library(magrittr)
reporter <- rep(c("Algeria", "Andorra","Argentina","Australia","Austria","Bahrain"), 2)
commodity <- c(rep("Live animals", 6), rep("Maize",6))
trade_value_usd <- c(seq(from=5000, to=60000, by=5000))
Xi <- cbind.data.frame(reporter, commodity, trade_value_usd)
Xi
reporter commodity trade_value_usd
1 Algeria Live animals 5000
2 Andorra Live animals 10000
3 Argentina Live animals 15000
4 Australia Live animals 20000
5 Austria Live animals 25000
6 Bahrain Live animals 30000
7 Algeria Maize 35000
8 Andorra Maize 40000
9 Argentina Maize 45000
10 Australia Maize 50000
11 Austria Maize 55000
12 Bahrain Maize 60000
Xi %>% group_by(commodity) %>%
mutate(sh = format(round(trade_value_usd/sum(trade_value_usd),3),
nsmall = 3), trade_value_usd=NULL)
reporter commodity sh
<chr> <chr> <chr>
1 Algeria Live animals 0.048
2 Andorra Live animals 0.095
3 Argentina Live animals 0.143
4 Australia Live animals 0.190
5 Austria Live animals 0.238
6 Bahrain Live animals 0.286
7 Algeria Maize 0.123
8 Andorra Maize 0.140
9 Argentina Maize 0.158
10 Australia Maize 0.175
11 Austria Maize 0.193
12 Bahrain Maize 0.211
Upvotes: 0
Reputation: 306
If you need sum of usd per reporter
# artificial data generation
set.seed(4422)
Xi <- tibble("reporter" = rep(LETTERS[1:10], 2),
"commodity" = sort(rep(letters[1:2], 10)),
"trade_value_usd" = ceiling(runif(20, 6000, 450000))
)
countrySum <- aggregate(trade_value_usd ~ reporter, Xi, sum)
countrySum
reporter trade_value_usd
1 A 679399
2 B 794578
3 C 176858
4 D 697695
5 E 528665
6 F 496006
7 G 635249
8 H 824737
9 I 619079
10 J 640500
If you need usd fraction of each commodity per reporter (sum of fractions for each reporter is 1) could be
# Fraction of usd by reporter
SH <- NULL
# need as.factor because Xi is a tibble
countries <- levels(as.factor(Xi$reporter))
for (i in seq_along(countries)) {
uu <-as.vector(
Xi[Xi$reporter == countries[[i]], 3] / sum(Xi[Xi$reporter == countries[[i]], 3])
)
names(uu) <- "fraction"
SH <- rbind(SH, cbind(Xi[Xi$reporter == countries[[i]], ], uu))
}
> SH
reporter commodity trade_value_usd fraction
1 A a 426532 0.6278078
2 A b 252867 0.3721922
3 B a 345296 0.4345653
4 B b 449282 0.5654347
5 C a 100431 0.5678624
6 C b 76427 0.4321376
7 D a 335791 0.4812862
8 D b 361904 0.5187138
9 E a 141603 0.2678502
10 E b 387062 0.7321498
11 F a 212146 0.4277085
12 F b 283860 0.5722915
13 G a 409227 0.6441994
14 G b 226022 0.3558006
15 H a 397377 0.4818227
16 H b 427360 0.5181773
17 I a 230093 0.3716699
18 I b 388986 0.6283301
19 J a 327676 0.5115941
20 J b 312824 0.4884059
Upvotes: 1
Reputation: 3876
Not sure if this is what you need:
library(dplyr)
df %>%
group_by(reporter) %>%
mutate(trade_volume_per_reporter = sum(trade_value_usd)) %>%
group_by(reporter, commodity) %>%
summarise(trade_value_usd / trade_volume_per_reporter)
Upvotes: 1