shrimp32

Reputation: 147

Create new column in data frame using a for loop to calculate value in R?

I have two data frames df1 and df2:

group=c("Group 1", "Group 2", "Group3","Group 1", "Group 2", "Group3")
year=c("2000","2000","2000", "2015", "2015", "2015")
items=c("12", "10", "15", "5", "10", "7")
df1=data.frame(group, year, items)

year=c("2000", "2015")
items=c("37", "22")
df2=data.frame(year,items)

df1 contains the number of items per year, broken down by group, and df2 contains the total number of items per year.

I'm trying to create a for loop that will calculate the proportion of items for each group type. I'm trying to do something like:

df1$Prop="" #create empty column called Prop in df1
for(i in 1:nrow(df1)){
  df1$Prop[i]=df1$items/df2$items[df2$year==df1$year[i]]
} 

The loop is supposed to get the proportion for each group (by taking the value from df1 and dividing it by the matching year's total in df2) and store it in a new column, but this code isn't working.

Upvotes: 5

Views: 2720

Answers (2)

Veerendra Gadekar

Reputation: 4472

dplyr equivalent to David's data.table solution

library(dplyr)

df1$items = as.integer(as.vector(df1$items))
df1 %>% group_by(year) %>% mutate(Prop = items / sum(items))

#Source: local data frame [6 x 4]
#Groups: year

#    group year items      Prop
#1 Group 1 2000    12 0.3243243
#2 Group 2 2000    10 0.2702703
#3  Group3 2000    15 0.4054054
#4 Group 1 2015     5 0.2272727
#5 Group 2 2015    10 0.4545455
#6  Group3 2015     7 0.3181818

plyr alternative

# plyr:: prefixes avoid attaching plyr, which would mask dplyr's mutate()
plyr::ddply(df1, "year", plyr::mutate, prop = items / sum(items))

lapply alternative

# split by year, add the per-year proportion to each piece, then stack the pieces
do.call(rbind, lapply(split(df1, df1$year),
        function(x) { x$prop = x$items / sum(x$items); x }))

Upvotes: 2

David Arenburg

Reputation: 92292

You don't really need df2. Here's a simple solution using data.table and only df1. (I'm assuming items is a numeric column; if not, you'll need to convert it first with setDT(df1)[, items := as.numeric(as.character(items))].)

library(data.table)
setDT(df1)[, Prop := items/sum(items), by = year]
df1
#      group year items      Prop
# 1: Group 1 2000    12 0.3243243
# 2: Group 2 2000    10 0.2702703
# 3:  Group3 2000    15 0.4054054
# 4: Group 1 2015     5 0.2272727
# 5: Group 2 2015    10 0.4545455
# 6:  Group3 2015     7 0.3181818

Alternatively, if you already have df2, you can join the two and calculate Prop while doing so (again, I'm assuming items is numeric in the real data):

setkey(setDT(df1), year)[df2, Prop := items/i.items]
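
The same lookup-and-divide idea can be sketched in base R with match(), without any package. This is only an illustration under the assumption that items is converted to numeric in both data frames first; the helper name idx is not from the original post.

```r
# Rebuild the question's data; items arrive as character strings
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c("12", "10", "15", "5", "10", "7")
)
df2 <- data.frame(year = c("2000", "2015"), items = c("37", "22"))

# as.character() first, so the conversion is also safe if the columns are factors
df1$items <- as.numeric(as.character(df1$items))
df2$items <- as.numeric(as.character(df2$items))

# For each row of df1, find the row of df2 that has the same year
idx <- match(as.character(df1$year), as.character(df2$year))

# Divide each group's count by the matching yearly total
df1$Prop <- df1$items / df2$items[idx]
df1
```

Because match() returns one df2 row index per df1 row, the division is vectorised and no loop is needed.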

A base R alternative

with(df1, ave(items, year, FUN = function(x) x/sum(x)))
## [1] 0.3243243 0.2702703 0.4054054 0.2272727 0.4545455 0.3181818
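
For reference, the loop from the question can also be made to work, although the vectorised versions above are usually preferable. As far as I can tell, the original loop fails for two reasons: df1$items is not indexed with [i], so the whole column is divided on every iteration, and items is stored as text rather than as numbers. A minimal sketch with both points addressed (the variable total is just an illustrative name):

```r
# Rebuild the question's data; items arrive as character strings
df1 <- data.frame(
  group = c("Group 1", "Group 2", "Group3", "Group 1", "Group 2", "Group3"),
  year  = c("2000", "2000", "2000", "2015", "2015", "2015"),
  items = c("12", "10", "15", "5", "10", "7")
)
df2 <- data.frame(year = c("2000", "2015"), items = c("37", "22"))

# Convert items to numeric (as.character() first in case they are factors)
df1$items <- as.numeric(as.character(df1$items))
df2$items <- as.numeric(as.character(df2$items))

# Use a numeric placeholder instead of "" so the column stays numeric
df1$Prop <- NA_real_

for (i in seq_len(nrow(df1))) {
  # Yearly total for the year of row i
  total <- df2$items[as.character(df2$year) == as.character(df1$year[i])]
  # Index items with [i] so only the current row's value is divided
  df1$Prop[i] <- df1$items[i] / total
}
df1
```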

Upvotes: 4