For rows with duplicates, create new column with conditional value

Question

Some sample data:

x <- data.frame(
  year = c(1992, 1992, 1992, 1994, 1994, 1995, 1992, 1992, 1993),
  actor = c("Taliban", "Taliban", "Taliban", "Taliban", "Taliban", "Taliban", "Afghanistan", "Afghanistan", "Afghanistan"),
  deaths = c(300, 300, 300, 100, 100, 250, 25, 25, 60)
)
x$year <- as.integer(x$year) # this is to match the class of my actual data

My goal is to create and populate a new column, "even_deaths", based on the following condition: if more than one row has the same year and actor, then "even_deaths" should be "deaths" divided by the number of rows sharing that year and actor. Rows with no duplicates keep their original "deaths" value.

In short, I want the new dataframe to look like this:

year          actor          deaths          even_deaths
1992          Taliban        300             100
1992          Taliban        300             100
1992          Taliban        300             100
1994          Taliban        100             50
1994          Taliban        100             50
1995          Taliban        250             250
1992          Afghanistan    25              12.5
1992          Afghanistan    25              12.5
1993          Afghanistan    60              60

The dataset is large, with over 1k actors, so I'm hoping I won't need to specify each one individually. Also, ideally I could perform the operation on just the rows that have duplicates (as opposed to on both duplicate and unique rows). Any help is very much appreciated, and I apologize if the wording is vague.

Cheers,

Ardeshir

Answers (1)
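
One possible approach, offered as a minimal sketch rather than a definitive solution: count how many rows share each year/actor pair with base R's ave(), then divide "deaths" by that count. The sketch assumes the column names from the question; the helper vector n_rows is a name introduced here for illustration. Because a unique row has a count of 1, its value is left unchanged, so there is no need to list actors or to treat duplicated rows separately.

```r
# Sample data from the question
x <- data.frame(
  year   = as.integer(c(1992, 1992, 1992, 1994, 1994, 1995, 1992, 1992, 1993)),
  actor  = c(rep("Taliban", 6), rep("Afghanistan", 3)),
  deaths = c(300, 300, 300, 100, 100, 250, 25, 25, 60)
)

# For every row, the number of rows that share its (year, actor) pair.
# ave() applies FUN within each group and returns a vector aligned with the rows.
n_rows <- ave(x$deaths, x$year, x$actor, FUN = length)

# Split the deaths evenly across the rows of each group;
# rows without duplicates are divided by 1 and so keep their value.
x$even_deaths <- x$deaths / n_rows

print(x)
```

If only the duplicated rows are needed afterwards, they could be selected with x[n_rows > 1, ].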