RL_Pug
RL_Pug

Reputation: 857

How to use mutate for a new column based the data in multiple existing columns

Good Morning,

this is my data set containing data of different client's races.

White  Asian  Black  Native  Islander  Other
1       0       0      0       0         0
0       1       0      0       0         0
0       0       0      1       0         0
0       0       1      0       0         0
1       0       0      0       1         0
0       0       0      0       0         1

The data is stored with a Boolean where 0 = No and 1 = Yes

So if a client has 1 for the column white, then they are white.

But if a client has a 1 for white and islander then they are multi racial.

So this would be my desired output

White  Asian  Black  Native  Islander  Other   Race
1       0       0      0       0         0     White
0       1       0      0       0         0     Asian
0       0       0      1       0         0     Native
0       0       1      0       0         0     Black
1       0       0      0       1         0     Multi-Racial 
0       0       0      0       0         1     Other

I'm familiar with mutate() but I've only used mutate based off one column.

Can anyone provide a code that can help with my desired output?

Upvotes: 0

Views: 59

Answers (1)

Andrew
Andrew

Reputation: 5138

Using ifelse() with max.col() should get you what you want. For rows that only have one value you index the name the value was in, otherwise it is "Multi-Racial"

df1$Race <- ifelse(rowSums(df1) == 1, names(df1)[max.col(df1)], "Multi-Racial")
df1
  White Asian Black Native Islander Other         Race
1     1     0     0      0        0     0        White
2     0     1     0      0        0     0        Asian
3     0     0     0      1        0     0       Native
4     0     0     1      0        0     0        Black
5     1     0     0      0        1     0 Multi-Racial
6     0     0     0      0        0     1        Other

Or, using mutate():

df1 %>%
  mutate(Race = ifelse(rowSums(.) == 1, names(.)[max.col(.)], "Multi-Racial"))

Data:

df1 <- read.table(header = T, text = "White  Asian  Black  Native  Islander  Other
1       0       0      0       0         0
0       1       0      0       0         0
0       0       0      1       0         0
0       0       1      0       0         0
1       0       0      0       1         0
0       0       0      0       0         1")

Upvotes: 2

Related Questions