Roberto

Reputation: 96

Best way to add a new column with formula in R

I have a data frame with one column of categories and one column of values (call it "v"). I need to create a new column with the following value: (v - min(v)) / min(v), where min(v) is the minimum within each category. For example:

Cat 1  |  Value
A      |   1
A      |   3
B      |   4
B      |   2

The result should be:

Cat 1  |  Value   | NewCol
A      |   1      | (1-1)/1 = 0
A      |   3      | (3-1)/1 = 2
B      |   4      | (4-2)/2 = 1
B      |   2      | (2-2)/2 = 0

I'm using the following code:

for (i in unique(fullDataset$Cat)) {
    fullDataset[which(fullDataset$Cat==i),"NewCol"] = min(fullDataset[which(fullDataset$Cat==i),"Value"])
}
fullDataset$NewCol <- (fullDataset$Value - fullDataset$NewCol) / fullDataset$NewCol

But it's taking hours to run. Is there a faster way to do this?

Thank you!

Upvotes: 0

Views: 691

Answers (2)

gented

Reputation: 1687

You can use the data.table package, which lets you define the new column per group in place:

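library(data.table)
# DT is assumed to be a data.table, e.g. DT <- as.data.table(fullDataset)
# := adds the column by reference, so no reassignment is needed;
# column names follow the code in the question
DT[, new := (Value - min(Value)) / min(Value), by = Cat]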
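
As a quick check, here is a minimal sketch of the same idea run on the four sample rows from the question. The column names Cat, Value and NewCol are assumptions taken from the question's code, not something data.table requires.

```r
library(data.table)

# Sample rows from the question's expected-output table (column names assumed)
DT <- data.table(Cat = c("A", "A", "B", "B"), Value = c(1, 3, 4, 2))

# min(Value) is evaluated once per group of Cat; := writes NewCol in place
DT[, NewCol := (Value - min(Value)) / min(Value), by = Cat]

print(DT)
```

Because the minimum is computed once per group rather than once per loop iteration over repeated subsets, this should scale much better than the loop in the question.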

Upvotes: 1

jeremycg

Reputation: 24945

You can use dplyr:

library(dplyr)
fullDataset %>% group_by(Cat) %>%
                mutate(newcol = (Value - min(Value))/min(Value))


Source: local data frame [4 x 3]
Groups: Cat [2]
     Cat Value newcol
  (fctr) (int)  (int)
1      A     1      0
2      A     3      2
3      B     4      1
4      B     2      0

First we group by Cat, then mutate adds a new column, newcol, computed as Value minus the group's minimum Value, divided by that minimum.
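
If you would rather not add a package, the same group-then-compute step can probably be done in base R with ave(). This is only a sketch: the sample data frame below is rebuilt from the question's example, and group_min is a helper name made up for illustration.

```r
# Sample rows from the question (column names assumed from its code)
fullDataset <- data.frame(Cat = c("A", "A", "B", "B"),
                          Value = c(1, 3, 4, 2))

# ave() returns a vector as long as Value, holding each row's group minimum
group_min <- ave(fullDataset$Value, fullDataset$Cat, FUN = min)

# Vectorised version of (v - min(v)) / min(v) per category
fullDataset$NewCol <- (fullDataset$Value - group_min) / group_min

print(fullDataset)
```

This replaces the loop's repeated which() subsetting with a single grouped pass, which is where most of the time in the original code is likely going.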

Upvotes: 1
