Reputation: 558
I have two data frames, one containing the predictors and one containing the different categories I want to predict. Both data frames contain a column named geoid. Some rows of predictors contain NA values, and I need to remove these.
After extracting the geoid values of the rows containing NA and removing those rows from the predictors data frame, I need to remove the corresponding rows from the categories data frame as well.
It seems like a rather basic operation but the code won't work.
categories <- as.data.frame(read.csv("files/cat_df.csv"))
predictors <- as.data.frame(read.csv("files/radius_100.csv"))
NA_rows <- predictors[!complete.cases(predictors),]
geoids <- NA_rows['geoid']
clean_categories <- categories[!(categories$geoid %in% geoids),]
None of the rows in categories/clean_categories are removed.
A typical geoid value is US06140231. typeof(categories$geoid) returns integer.
Upvotes: 0
Views: 80
Reputation: 1731
I can't say for certain that this is the cause, but a very basic typo will stop that line from doing what you want. Try this correction:
clean_categories <- categories[!(categories$geoid %in% geoids),]
Almost certainly this is what you meant to happen in that line: you want to negate the result of the %in% operator. You didn't include a reproducible example, so I can't say whether the whole thing will do what you want.
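One more thing that may be worth checking, offered as a sketch rather than a diagnosis: NA_rows['geoid'] returns a one-column data frame, not a vector, and %in% does not compare against a data frame element by element, so the test can come back FALSE for every row. Pulling the column out with $ (or [[ ]]) gives a plain vector instead. Separately, typeof() returning integer for values like US06140231 suggests the column was read in as a factor, which is how older versions of R import strings by default. The toy data frames below are made up to stand in for your CSV files, and the names na_rows and bad_geoids are only for illustration.

```r
# Hypothetical toy data standing in for files/radius_100.csv and files/cat_df.csv.
predictors <- data.frame(
  geoid = c("US01", "US02", "US03", "US04"),
  x     = c(1.2, NA, 3.4, NA),
  stringsAsFactors = FALSE
)
categories <- data.frame(
  geoid = c("US01", "US02", "US03", "US04"),
  cat   = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Logical index of predictor rows that contain at least one NA.
na_rows <- !complete.cases(predictors)

# `$` returns a plain vector of ids; predictors[na_rows, ]['geoid'] would
# return a data frame instead.
bad_geoids <- predictors$geoid[na_rows]

# Drop the incomplete rows from predictors and the matching ids from categories.
clean_predictors <- predictors[!na_rows, ]
clean_categories <- categories[!(categories$geoid %in% bad_geoids), ]

print(clean_categories)
```

If this behaves as expected on your real data, both cleaned data frames should end up with the same set of geoid values.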
Upvotes: 1