user3683477

Reputation: 29

Creating a column by checking NA cells in multiple columns in R

I have a dataset with many missing values:

              Q1            Q2            Q3            Q4
1                                                         
2                                                         
3                                                         
4                                                         
5                                                         
6                                                         
7                                                         
8                                             previous job
9                                                         
10                 current job                            
11                                                        
12                                                        
13                              previous job              
14                                                        
15                                                        
16                                                        
17                                                        
18                 current job                            
19  previous job                                          
20                                                        
21  previous job                                          
22                               current job              
23   current job                                          
24                               current job              
25                              previous job              
26                                                        
27   current job                                          
28                                                        
29                 current job                            
30  previous job                                        

I would like to create a new column that checks, row by row, whether Q2, Q3, or Q4 is empty (it doesn't matter what is written in them). If at least one of them is not empty, I would like to write "yes"; otherwise, "no". How should I do that?

Upvotes: 0

Views: 24

Answers (1)

akrun

Reputation: 887148

We can use rowSums on a logical matrix that is TRUE where a cell is neither blank ("") nor NA (is.na). If the row-wise sum is greater than 0, at least one cell in that row is filled, so we return 'yes'; otherwise 'no'. Across all columns:

df1$flag <- ifelse(rowSums(df1 != "" & !is.na(df1)) > 0, "yes", "no")

To check only certain columns, such as Q2 to Q4, use either position indexing (2:4 for columns 2 to 4) or the column names

df1$flag <- ifelse(rowSums(df1[2:4] != "" & !is.na(df1[2:4])) > 0, "yes", "no")

Another option is to loop over the columns with lapply, apply the logical condition to each, and add up the results with Reduce

df1$flag <- c("no", "yes")[1 + (Reduce(`+`, lapply(df1[2:4], function(x) x != "" & !is.na(x))) > 0)]
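
As a quick check, here is a minimal sketch on made-up data. The values in this df1 are assumptions rather than the asker's dataset, and cols, filled and flag2 are illustrative names. It selects Q2 to Q4 by name, flags a row 'yes' when at least one of those cells is neither blank nor NA, and confirms that the rowSums and Reduce approaches give the same result.

```r
# Hypothetical sample data: both "" and NA are treated as empty
df1 <- data.frame(
  Q1 = c("", "previous job", "current job", "", NA),
  Q2 = c("", "", "", "current job", NA),
  Q3 = c("", "", "", "", "previous job"),
  Q4 = c("previous job", "", "", "", ""),
  stringsAsFactors = FALSE
)

# Select the columns to check by name instead of by position
cols <- c("Q2", "Q3", "Q4")

# Logical matrix: TRUE where a cell is neither blank nor NA
filled <- df1[cols] != "" & !is.na(df1[cols])

# 'yes' if at least one of the selected cells in the row is filled
df1$flag <- ifelse(rowSums(filled) > 0, "yes", "no")

# Same condition, column by column, summed with Reduce
flag2 <- c("no", "yes")[1 + (Reduce(`+`, lapply(df1[cols], function(x) x != "" & !is.na(x))) > 0)]

print(df1)
print(identical(df1$flag, flag2))
```

Rows 2 and 3 only have a value in Q1, which is not among the checked columns, so they get 'no'. Row 5 has an NA in Q2 but a value in Q3, so it still gets 'yes'.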

Upvotes: 1
