Meli
Meli

Reputation: 33

Merging multiple columns based on conditions in R

I have a data frame with several columns I want to merge only 5 of those.

ID result___1 result___2 result___3 result___4 result___5
1 First 0 0 0 0
3 Second 0 First 0 0
4 First 0 First 0 0
5 Third 0 First Second 0
6 Second 0 First 0 Other
7 Second 0 First 0 Other

The merge of those will be in a new columne, seperated by "," if needed. For example for the first raw I want to keep only "First", for the second raw "Second, First" etc. My last column has text where it is different each time (because it was filled by interviewees). So the results it could be "Second, First, the interviewees text".

Following <https://stackoverflow.com/questions/50874955/merging-multiple-columns-in-a-dataframe-based-on-condition-in-r >

This is my code until now. But I don't get what I want

df %>% mutate_at(vars(starts_with("result___")), funs(na_if(.,"0"))) %>%
  mutate(demo_race = coalesce(df$result___1,df$result___2,df$result___3,df$result___4,df$result___5)) %>%
  select(ID, df$resultsnew)

Any hint how can I solve my problem?

Thank you in advance

Upvotes: 1

Views: 136

Answers (1)

NovaEthos
NovaEthos

Reputation: 500

First, I recreated your data:

df <- data.frame(
  ID = c(1,3,4,5,6,7),
  result__1 = c("First", "Second", "First", "Third","Second", "Second"),
  result__2 = c(0,0,0,0,0,0),
  result__3 = c(0, "First","First","First","First","First"),
  result__4 = c(0,0,0,"Second",0,0),
  result__5 = c(0,0,0,0,"Other","Other"),
  text = c(rep("sample text",6))
)

You can accomplish this using a combination of unite from tidyr and str_remove_all from stringr:

tidyr::unite(df, text, sep = ",", remove = FALSE) %>% # pasting the columns together
  mutate(new_text = stringr::str_remove_all(text, ",0|[0-9]+,")) %>%  # getting rid of the IDs and the 0s
  select(-text) # we don't need this column anymore

  ID result__1 result__2 result__3 result__4 result__5                       new_text
1  1     First         0         0         0         0              First,sample text
2  3    Second         0     First         0         0       Second,First,sample text
3  4     First         0     First         0         0        First,First,sample text
4  5     Third         0     First    Second         0 Third,First,Second,sample text
5  6    Second         0     First         0     Other Second,First,Other,sample text
6  7    Second         0     First         0     Other Second,First,Other,sample text

Where the new_text column is your desired output. sample text is what I used to represent the interviewees text column you talked about.

Upvotes: 1

Related Questions