Reputation: 29
I have a dataset with many missing values:
Q1 Q2 Q3 Q4
1
2
3
4
5
6
7
8 previous job
9
10 current job
11
12
13 previous job
14
15
16
17
18 current job
19 previous job
20
21 previous job
22 current job
23 current job
24 current job
25 previous job
26
27 current job
28
29 current job
30 previous job
I would like to create a column and check, row by row, whether Q2, Q3, or Q4 is empty (it doesn't matter what is written). If at least one of them is not empty, I would like to write "yes"; otherwise "no". How should I do that?
Upvotes: 0
Views: 24
Reputation: 887148
We can use rowSums to create a logical vector based on the occurrence of blank ("") or NA (is.na) values, check whether the row-wise sum is greater than 0, and return 'yes' if so or 'no' otherwise
df1$flag <- ifelse(rowSums(df1 == ""|is.na(df1)) > 0, "yes", "no")
If we want to select certain columns, use either position indexing (2:4 for columns 2 to 4) or the column names
df1$flag <- ifelse(rowSums(df1[2:4] == ""|is.na(df1[2:4])) > 0, "yes", "no")
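Note that the question asks for "yes" when at least one of Q2, Q3, Q4 is filled in, whereas the expressions above return "yes" when a row contains a blank or NA. A minimal sketch of the negated check, assuming unanswered cells are stored as "" or NA; the toy data frame and the names cols and is_empty are made up for illustration:

```r
# Hypothetical toy data: Q1 is a row id; Q2..Q4 are "" or NA when unanswered
df1 <- data.frame(
  Q1 = 1:5,
  Q2 = c("", "previous job", NA, "", ""),
  Q3 = c("", "", NA, "current job", ""),
  Q4 = c(NA, "", NA, "", ""),
  stringsAsFactors = FALSE
)

cols <- c("Q2", "Q3", "Q4")

# TRUE where a cell is blank or NA (is.na() covers the NA produced by == "")
is_empty <- is.na(df1[cols]) | df1[cols] == ""

# "yes" when at least one of the selected cells is filled in
df1$flag <- ifelse(rowSums(!is_empty) > 0, "yes", "no")
```

Selecting by name rather than by position (2:4) keeps the check correct if the column order changes.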
Or another option is to loop over the columns and apply the logical condition
c("no", "yes")[1 + (Reduce(`+`, lapply(df1, function(x) x == ""| is.na(x))) > 0)]
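The expression above loops over every column, including Q1, and does not store its result. A possible variant, sketched under the same assumption that unanswered cells are "" or NA, restricts the loop to Q2 to Q4, combines the per-column results with a logical OR, and assigns the flag; the toy data and the names filled and any_filled are hypothetical:

```r
# Hypothetical toy data for illustration
df1 <- data.frame(
  Q1 = 1:4,
  Q2 = c("", "previous job", NA, ""),
  Q3 = c("", "", NA, "current job"),
  Q4 = c(NA, "", NA, ""),
  stringsAsFactors = FALSE
)

# One logical vector per column: TRUE where the cell is filled in
filled <- lapply(df1[c("Q2", "Q3", "Q4")], function(x) !is.na(x) & x != "")

# Element-wise OR across the columns: TRUE when any of them is filled in
any_filled <- Reduce(`|`, filled)

# FALSE -> index 1 ("no"), TRUE -> index 2 ("yes")
df1$flag <- c("no", "yes")[1 + any_filled]
```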
Upvotes: 1