Reputation: 67
I am relatively new to R and want to get all the rows that have ALL columns with non-zero values so basically a row that has even a single column with zero or NA value; I dont want that row at all.
Here is how my dataframe looks like; Any help is really appreciated.
EDITED :
After executing dput(head(mydata,20))
; it looks something like below
structure(list(Q3_1 = c("1", "1", "1", "5", "6", "3", "2", "",
"", "", "", "", "", "", ""), Q6_1 = c("2", "2", "1", "2", "6",
"3", "2", "", "", "", "", "", "", "", ""), Q12_1 = c("1", "1",
"8", "7", "1", "4", "7", "", "", "", "", "", "", "", ""), Q15_1 = c("1",
"1", "4", "5", "1", "9", "8", "", "", "", "", "", "", "", ""),
Q18_1 = c("2", "2", "1", "6", "4", "9", "3", "", "", "6",
"", "", "", "", ""), Q21_1 = c("1", "1", "2", "3", "3", "4",
"8", "", "", "", "", "", "", "", ""), Q24_1 = c("1", "1",
"1", "2", "2", "8", "8", "", "", "", "", "", "", "", ""),
Q27_1 = c("1", "2", "3", "4", "5", "2", "9", "", "", "",
"", "", "", "", ""), Q30_1 = c("2", "2", "2", "2", "2", "2",
"2", "", "", "", "", "", "", "", ""), Q36_1 = c("6", "2",
"2", "2", "2", "6", "6", "", "", "4", "", "", "", "9", "7"
), Q39_1 = c("10", "2", "2", "2", "2", "2", "2", "4", "2",
"", "", "3", "7", "10", "9"), Q42_1 = c("1", "2", "3", "4",
"5", "6", "7", "9", "5", "4", "", "", "7", "", "5"), Q45_1 = c("1",
"2", "2", "2", "2", "2", "2", "", "", "5", "", "", "8", "4",
""), Q48_1 = c("7", "2", "3", "4", "5", "6", "8", "", "6",
"", "5", "", "1", "8", ""), Q51_1 = c("1", "2", "3", "8",
"3", "4", "4", "9", "8", "6", "4", "8", "1", "7", "7"), Q54_1 = c("1",
"2", "3", "7", "4", "4", "4", "1", "4", "6", "", "1", "",
"", "2"), Q60_1 = c("1", "2", "3", "3", "3", "3", "3", "3",
"6", "", "", "10", "", "", "9"), Q63_1 = c("1", "1", "1",
"1", "1", "1", "1", "1", "1", "", "", "1", "5", "", ""),
Q66_1 = c("4", "4", "4", "7", "6", "4", "4", "7", "3", "",
"7", "", "4", "", "1"), Q69_1 = c("3", "3", "5", "4", "7",
"4", "5", "5", "5", "", "", "", "7", "", "1"), Q72_1 = c("3",
"4", "5", "4", "5", "6", "6", "3", "8", "", "8", "1", "",
"7", "6"), Q74_1 = c("4", "5", "5", "5", "2", "4", "5", "4",
"4", "5", "4", "3", "3", "5", "2"), Q75_1 = c("5", "2", "3",
"5", "4", "3", "4", "4", "3", "4", "1", "2", "2", "4", "5"
), Q76_1 = c("1", "2", "1", "4", "5", "2", "5", "1", "2",
"5", "5", "2", "2", "2", "5")), row.names = c(5L, 7L, 9L,
11L, 12L, 16L, 17L, 18L, 21L, 24L, 26L, 32L, 34L, 35L, 40L), class = "data.frame")
Upvotes: 1
Views: 4089
Reputation:
If I were you, I would create new column which has the value of product of all columns (stg like this: https://stackoverflow.com/a/27534012/12591329) then create a new dataframe with all rows where the value of the new column is greater than zero.
Upvotes: 0
Reputation: 1364
Base on @Gregor Thomas solution, if your data has any 0 values, this solution will filter those out.
library(tidyverse)
df[df==''] <- NA
df2 <- df %>%
filter(!is.na(df)) %>%
filter_at(vars(starts_with("Q")), all_vars(. != 0))
Upvotes: 0
Reputation: 146224
I would first change your blank ""
values to NA
missing values, then the functions na.omit
or complete.cases
can be used to do what you want:
mydata[mydata == ""] <- NA
# then either of these should work
result1 <- na.omit(mydata)
result2 <- mydata[complete.cases(mydata), ]
You may want to address the ""
earlier in your workstream - the function you use to read the data into R (maybe read.csv()
or similar) probably has a way to specify that blanks should be treated as missing values. For read.csv
, you would use na.strings = ""
. Right now, all your columns are character class, and you probably want to change them to numeric after getting rid of the blanks. If you use the na.strings = ""
, your columns will automatically be loaded as numeric columns.
Upvotes: 1