Tim Wilcox
Tim Wilcox

Reputation: 1331

How to remove rows that have NULL values in R

Below is the sample data and one manipulation. In the larger picture, I am reading in a bunch of excel files which are delineated by year and then taking only select columns (14 of 1000) and putting them into new data frames (df1,df2 for example). From there, I am combining these new data into one final data frame. My question is how I remove the rows in the final data frame that are populated with null values. I could filter but hoping to simply remove them in R and be done with them.

 testyear <-c(2010,2010,2010,2010,2011,2011,2011,2010)
 teststate<-c("CA", "Co", "NV", "NE", "CA", "CO","NV","NE")
 totalhousehold<-c(251,252,253,"NULL",301,302,303,"NULL")
 marriedhousehold <-c(85,86,87,"NULL",158,159,245,"NULL")


 test1<-data.frame(testyear,teststate,totalhousehold,marriedhousehold)


 testyear<-c(2012,2012,2012,2012)
 teststate<-c("WA","OR","WY","UT")
 totalhousehold<-c(654,650,646,641)
 marriedhousehold<-c(400,399,398,395)

 test2<-data.frame(testyear,teststate,totalhousehold,marriedhousehold)

 test3<-rbind(test1,test2)

Upvotes: 0

Views: 6510

Answers (2)

Jason Mathews
Jason Mathews

Reputation: 805

Why dplyr when it can be done simple?

test3[test3 == "NULL"] = NA
test3 <- na.omit(test3)

Upvotes: 2

akrun
akrun

Reputation: 887118

As those are character columns, we could filter across only character columns to return rows that have no "NULL" elements and change the type of the columns with type.convert

library(dplyr)
test4 <- test3 %>% 
      filter(across(where(is.character), ~ . != "NULL")) %>%
       type.convert(as.is = TRUE)

-output

> test4
   testyear teststate totalhousehold marriedhousehold
1      2010        CA            251               85
2      2010        Co            252               86
3      2010        NV            253               87
4      2011        CA            301              158
5      2011        CO            302              159
6      2011        NV            303              245
7      2012        WA            654              400
8      2012        OR            650              399
9      2012        WY            646              398
10     2012        UT            641              395
> str(test4)
'data.frame':   10 obs. of  4 variables:
 $ testyear        : int  2010 2010 2010 2011 2011 2011 2012 2012 2012 2012
 $ teststate       : chr  "CA" "Co" "NV" "CA" ...
 $ totalhousehold  : int  251 252 253 301 302 303 654 650 646 641
 $ marriedhousehold: int  85 86 87 158 159 245 400 399 398 395

or in base R, use subset with rowSums to create a logical expression

type.convert(subset(test3, !rowSums(test3 == "NULL")), as.is = TRUE)

Upvotes: 1

Related Questions