Benoit B.

Reputation: 12054

Add a group column to a dataframe

I have a data frame df like this:

      V1 V2
1  13219  0
2   6358  1
3   4384  2
4   3359  3
5   2820  4
6   2466  5
7   2144  6
8   1941  7
9   1778  8
10  1550  9

and I would like to add a "group" column whose value depends on the value of df$V2:

When df$V2 == 0, the group is A
When df$V2 > 0 and df$V2 <= 5, the group is B
When df$V2 >= 6, the group is C

The idea would be to obtain something like this:

      V1 V2 Grp
1  13219  0 A
2   6358  1 B
3   4384  2 B
4   3359  3 B
5   2820  4 B
6   2466  5 B
7   2144  6 C
8   1941  7 C
9   1778  8 C
10  1550  9 C

This seems straightforward at first, but googling around hasn't helped much. Any advice is much appreciated.

Upvotes: 3

Views: 197

Answers (2)

Dominic Comtois

Reputation: 10411

Using indexing, this can be done quite easily:

df$group <- NA
df$group[df$V2 == 0] <- "A"
df$group[df$V2 > 0]  <- "B"
df$group[df$V2 >= 6] <- "C"

Note that the third and fourth statements must be run in that order: the "B" assignment covers every value above 0, and the "C" assignment then overwrites the rows where V2 >= 6. If you don't want the result to depend on running the "C" assignment after the "B" assignment, define the "B" condition more precisely:

df$group[df$V2 > 0 & df$V2 < 6] <- "B"
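
If the order-dependence is a concern, one possible single-pass variant is nested ifelse rather than repeated indexed assignment. This is a different technique from the one above, shown only as a minimal sketch; it rebuilds the question's data inline and leaves any value matching none of the three conditions as NA.

```r
# Same data as in the question (V2 is assumed to hold whole numbers)
df <- data.frame(
  V1 = c(13219, 6358, 4384, 3359, 2820, 2466, 2144, 1941, 1778, 1550),
  V2 = 0:9
)

# The three conditions are mutually exclusive, so their order does not matter.
# Rows that match none of them (e.g. negative values) end up as NA.
df$group <- ifelse(df$V2 == 0, "A",
            ifelse(df$V2 > 0 & df$V2 < 6, "B",
            ifelse(df$V2 >= 6, "C", NA)))

df
```

The trade-off is readability: nested ifelse calls get hard to follow beyond three or four groups, at which point an interval-based function is probably a better fit.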

Results

      V1 V2 group
1  13219  0     A
2   6358  1     B
3   4384  2     B
4   3359  3     B
5   2820  4     B
6   2466  5     B
7   2144  6     C
8   1941  7     C
9   1778  8     C
10  1550  9     C

Data

df <- read.csv(text="V1,V2
13219,0
6358,1
4384,2
3359,3
2820,4
2466,5
2144,6
1941,7
1778,8
1550,9")

Upvotes: 3

akrun

Reputation: 887168

You could use cut or findInterval:

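# cut(): right-closed intervals (-Inf,0], (0,5], (5,Inf] give indices 1, 2, 3
df$Grp <- with(df, LETTERS[1:3][cut(V2, breaks = c(-Inf, 0, 5, Inf),
                                    labels = FALSE)])

# findInterval(): intervals are left-closed, so the breaks are shifted by 1
# (this relies on V2 holding whole numbers)
df$Grp <- with(df, LETTERS[1:3][findInterval(V2, c(-Inf, 0, 5, Inf) + 1)])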

df
#      V1 V2 Grp
#1  13219  0   A
#2   6358  1   B
#3   4384  2   B
#4   3359  3   B
#5   2820  4   B
#6   2466  5   B
#7   2144  6   C
#8   1941  7   C
#9   1778  8   C
#10  1550  9   C
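
As a variation on the cut approach, cut can also attach the labels itself through its labels argument, in which case it returns a factor rather than a character vector. The sketch below reuses the same breaks; the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same data as in the question
df <- data.frame(
  V1 = c(13219, 6358, 4384, 3359, 2820, 2466, 2144, 1941, 1778, 1550),
  V2 = 0:9
)

# Intervals are right-closed by default: (-Inf, 0], (0, 5], (5, Inf]
df$Grp <- cut(df$V2, breaks = c(-Inf, 0, 5, Inf), labels = c("A", "B", "C"))

# cut() returns a factor; convert it if a plain character column is preferred
df$Grp <- as.character(df$Grp)

df
```

Keeping the factor can be convenient when the groups are later used for plotting or modelling, since the level order A, B, C is preserved.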

Or

 with(df, LETTERS[c(2,1,3)][1+(V2==0) + 2*(V2 >=6)])
 #[1] "A" "B" "B" "B" "B" "B" "C" "C" "C" "C"
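
To see why this last expression works, it may help to split the index into its parts. Each logical comparison is coerced to 0 or 1, so the sum is 2 when V2 == 0, 3 when V2 >= 6, and 1 otherwise; reordering the letters with c(2, 1, 3) then maps those indices to "A", "C" and "B". The intermediate names idx and labels below are introduced only for illustration.

```r
V2 <- 0:9

# TRUE/FALSE are treated as 1/0 in arithmetic:
#   V2 == 0          -> 1 + 1 + 0 = 2
#   V2 >= 6          -> 1 + 0 + 2 = 3
#   everything else  -> 1 + 0 + 0 = 1
idx <- 1 + (V2 == 0) + 2 * (V2 >= 6)

# Reordered so that position 1 is "B", position 2 is "A", position 3 is "C"
labels <- LETTERS[c(2, 1, 3)]

grp <- labels[idx]
grp
```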

Upvotes: 4
