Reputation: 33
I have a data frame with several columns I want to merge only 5 of those.
ID | result___1 | result___2 | result___3 | result___4 | result___5 |
---|---|---|---|---|---|
1 | First | 0 | 0 | 0 | 0 |
3 | Second | 0 | First | 0 | 0 |
4 | First | 0 | First | 0 | 0 |
5 | Third | 0 | First | Second | 0 |
6 | Second | 0 | First | 0 | Other |
7 | Second | 0 | First | 0 | Other |
The merge of those will be in a new columne, seperated by "," if needed. For example for the first raw I want to keep only "First", for the second raw "Second, First" etc. My last column has text where it is different each time (because it was filled by interviewees). So the results it could be "Second, First, the interviewees text".
Following <https://stackoverflow.com/questions/50874955/merging-multiple-columns-in-a-dataframe-based-on-condition-in-r >
This is my code until now. But I don't get what I want
df %>% mutate_at(vars(starts_with("result___")), funs(na_if(.,"0"))) %>%
mutate(demo_race = coalesce(df$result___1,df$result___2,df$result___3,df$result___4,df$result___5)) %>%
select(ID, df$resultsnew)
Any hint how can I solve my problem?
Thank you in advance
Upvotes: 1
Views: 136
Reputation: 500
First, I recreated your data:
df <- data.frame(
ID = c(1,3,4,5,6,7),
result__1 = c("First", "Second", "First", "Third","Second", "Second"),
result__2 = c(0,0,0,0,0,0),
result__3 = c(0, "First","First","First","First","First"),
result__4 = c(0,0,0,"Second",0,0),
result__5 = c(0,0,0,0,"Other","Other"),
text = c(rep("sample text",6))
)
You can accomplish this using a combination of unite
from tidyr
and str_remove_all
from stringr
:
tidyr::unite(df, text, sep = ",", remove = FALSE) %>% # pasting the columns together
mutate(new_text = stringr::str_remove_all(text, ",0|[0-9]+,")) %>% # getting rid of the IDs and the 0s
select(-text) # we don't need this column anymore
ID result__1 result__2 result__3 result__4 result__5 new_text
1 1 First 0 0 0 0 First,sample text
2 3 Second 0 First 0 0 Second,First,sample text
3 4 First 0 First 0 0 First,First,sample text
4 5 Third 0 First Second 0 Third,First,Second,sample text
5 6 Second 0 First 0 Other Second,First,Other,sample text
6 7 Second 0 First 0 Other Second,First,Other,sample text
Where the new_text
column is your desired output. sample text
is what I used to represent the interviewees text column you talked about.
Upvotes: 1