Reputation: 147
I have two data frames df1 and df2:
group=c("Group 1", "Group 2", "Group3","Group 1", "Group 2", "Group3")
year=c("2000","2000","2000", "2015", "2015", "2015")
items=c("12", "10", "15", "5", "10", "7")
df1=data.frame(group, year, items)
year=c("2000", "2015")
items=c("37", "22")
df2=data.frame(year,items)
df1 contains the number of items per year, broken down by group, and df2 contains the total number of items per year.
I'm trying to write a for loop that calculates each group's proportion of the yearly total. I'm trying to do something like:
df1$Prop="" #create empty column called Prop in df1
for(i in 1:nrow(df1)){
df1$Prop[i]=df1$items/df2$items[df2$year==df1$year[i]]
}
The loop is supposed to compute the proportion for each group (taking the value from df1 and dividing it by the matching yearly total in df2) and store it in a new column, but this code isn't working.
Upvotes: 5
Views: 2720
Reputation: 4472
A dplyr equivalent to David's data.table solution:
library(dplyr)
df1$items = as.integer(as.vector(df1$items))
df1 %>% group_by(year) %>% mutate(Prop = items / sum(items))
#Source: local data frame [6 x 4]
#Groups: year
# group year items Prop
#1 Group 1 2000 12 0.3243243
#2 Group 2 2000 10 0.2702703
#3 Group3 2000 15 0.4054054
#4 Group 1 2015 5 0.2272727
#5 Group 2 2015 10 0.4545455
#6 Group3 2015 7 0.3181818
A plyr alternative:
library(plyr)  # ddply() and .() come from plyr
ddply(df1, .(year), mutate, prop = items / sum(items))
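An lapply alternative:
# split by year, add the proportion within each piece, then stack the pieces back together
do.call(rbind, lapply(split(df1, df1$year),
                      function(x) { x$prop <- x$items / sum(x$items); x }))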
Upvotes: 2
Reputation: 92292
You don't really need df2. Here's a simple solution using data.table and only df1 (I'm assuming items is a numeric column; if not, you'll need to convert it first with setDT(df1)[, items := as.numeric(as.character(items))]):
library(data.table)
setDT(df1)[, Prop := items/sum(items), by = year]
df1
# group year items Prop
# 1: Group 1 2000 12 0.3243243
# 2: Group 2 2000 10 0.2702703
# 3: Group3 2000 15 0.4054054
# 4: Group 1 2015 5 0.2272727
# 5: Group 2 2015 10 0.4545455
# 6: Group3 2015 7 0.3181818
Another way, if you already have df2, is to join the two and calculate Prop while doing so (again, I'm assuming items is numeric in the real data):
setkey(setDT(df1), year)[df2, Prop := items/i.items]
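For readers not using data.table, the same join idea can be sketched in base R with merge(). This is only an illustration under the assumption that items is numeric in both data frames; the names joined and items.total are made up for the example and do not appear in the question.

```r
# Example data rebuilt with numeric items (an assumption; the question stores them as strings)
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7)
)
df2 <- data.frame(year = c("2000", "2015"), items = c(37, 22))

# Join on year; the suffixes keep df1's column as "items" and
# rename df2's column to "items.total" (hypothetical name)
joined <- merge(df1, df2, by = "year", suffixes = c("", ".total"))

# Each group's share of its yearly total
joined$Prop <- joined$items / joined$items.total
```

Note that merge() sorts the result by the key column by default, so the row order may differ from df1.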
A base R alternative
with(df1, ave(items, year, FUN = function(x) x/sum(x)))
## [1] 0.3243243 0.2702703 0.4054054 0.2272727 0.4545455 0.3181818
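As for why the loop in the question fails: df1$items is missing the [i] index, so six values are assigned to a single cell, and both items columns are stored as strings, so the division is not defined. A minimal sketch of a repaired loop, plus a vectorised equivalent using match(), assuming items has been converted to numeric in both data frames (prop_vec is a name invented for this example):

```r
# Example data rebuilt with numeric items (an assumption; the question stores them as strings)
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c(12, 10, 15, 5, 10, 7)
)
df2 <- data.frame(year = c("2000", "2015"), items = c(37, 22))

# Loop version: start from a numeric column and index items with [i]
df1$Prop <- NA_real_
for (i in seq_len(nrow(df1))) {
  df1$Prop[i] <- df1$items[i] / df2$items[df2$year == df1$year[i]]
}

# Vectorised version: match() finds each row's year in df2, no loop needed
prop_vec <- df1$items / df2$items[match(df1$year, df2$year)]
```

Both versions rely on df2 having exactly one row per year; the grouped solutions above avoid that dependency by computing the totals from df1 itself.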
Upvotes: 4