trminh89
trminh89

Reputation: 897

Divide one column of data frame by condition from another column

I have a data frame with 2 columns like this:

cond  val
1      5
2      18
2      18
2      18
3      30
3      30

I want to change values in val in this way:

   cond  val
    1      5   # 5 = 5/1  (only "1" in cond column)
    2      6   # 6 = 18/3 (there are three "2" in cond column)
    2      6
    2      6
    3      15  # 15 = 30/2  
    3      15

How to achieve this?

Upvotes: 2

Views: 2345

Answers (5)

Steven Beaupré
Steven Beaupré

Reputation: 21641

Here's the dplyr way:

library(dplyr)
df %>%
  group_by(cond) %>%
  mutate(val = val / n())

Which gives:

#Source: local data frame [6 x 2]
#Groups: cond [3]
#
#   cond   val
#  (int) (dbl)
#1     1     5
#2     2     6
#3     2     6
#4     2     6
#5     3    15
#6     3    15

The idea is to divide val by the number of observations in the current group (cond) using n()

Upvotes: 3

Gregor Thomas
Gregor Thomas

Reputation: 146164

In base R

df$result = df$val / ave(df$cond, df$cond, FUN = length)

The ave() divides up the cond column by its unique values and takes the length of each subvector, i.e., the denominator you ask for.

Upvotes: 2

Jaap
Jaap

Reputation: 83275

A base R solution:

# method 1:
mydf$val <- ave(mydf$val, mydf$cond, FUN = function(x) x = x/length(x))
# method 2:
mydf <- transform(mydf, val = ave(val, cond, FUN = function(x) x = x/length(x)))

which gives:

  cond val
1    1   5
2    2   6
3    2   6
4    2   6
5    3  15
6    3  15

Upvotes: 3

lmo
lmo

Reputation: 38520

Here is a base R answer that will work if cond is an ID variable:

# get length of repeats
temp <- rle(df$cond)
temp <- data.frame(cond=temp$values, lengths=temp$lengths)

# merge onto data.frame
df <- merge(df, temp, by="cond")
df$valNew <- df$val / df$lengths

Upvotes: 0

nrussell
nrussell

Reputation: 18612

This seems like an appropriate situation for data.table:

library(data.table)
(dt <- data.table(df)[,val := val / .N, by = cond][])
#    cond val
# 1:    1   5
# 2:    2   6
# 3:    2   6
# 4:    2   6
# 5:    3  15
# 6:    3  15 

df <- read.table(
    text = "cond  val
    1      5
    2      18
    2      18
    2      18
    3      30
    3      30",
    header = TRUE,
    colClasses = "numeric"
)

Upvotes: 2

Related Questions