Reputation: 27
I am subsetting multiple variables in a dataset to remove data points that are not useful. When I enter the subset command for the first variable and check the dataset, the variable has been properly subset. However, after doing the same with the second variable, the first is no longer subset in the dataset. It seems as though the second subset command is overriding the first. In the example below, the first variable (Height) is no longer subset once I subset the second variable (Weight). Any thoughts on how to resolve this?
rTestDataSet = TestDataSet
rTestDataSet = subset(TestDataSet, TestDataSet$Height < 4)
rTestDataSet = subset(TestDataSet, TestDataSet$Weight < 3)
Upvotes: 2
Views: 81
Reputation: 1702
Why not use the tidyverse? Chain the operations together to create your own logic. Instead of subset() you can use filter() to get the rows you want conditionally:
library(tidyverse)
TestDataSet %>%
  filter(Height < 4) %>%
  filter(Weight < 3)
or
TestDataSet %>%
  filter(Height < 4 & Weight < 3)
Upvotes: 1
Reputation: 858
You are applying both subsets to the original data, so the second call throws away the result of the first. What you need to do is apply one subset, save the result to a variable, and then apply the second subset to that variable. Also, as already pointed out, you don't need the $ when using subset.
Try this, starting with some reproducible data:
set.seed(50)
TestDataSet <- data.frame(Height = sample(1:10, 30, replace = TRUE), Weight = sample(1:10, 30, replace = TRUE))
rTestDataSet = TestDataSet
rTestDataSet = subset(rTestDataSet, Height < 4)
rTestDataSet
   Height Weight
3       3      5
6       1      7
9       1      4
10      2      5
12      3      9
14      1      1
15      3      1
19      1      8
20      2      9
22      2      8
28      3      6
rTestDataSet = subset(rTestDataSet, Weight < 3)
rTestDataSet
   Height Weight
14      1      1
15      3      1
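If you don't need the intermediate result, the two steps can also be folded into a single subset call by joining the conditions with &. A minimal sketch in base R follows; the toy data frame and the names chained and combined are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data, small enough to check by eye.
toy <- data.frame(
  Height = c(1, 5, 3, 2, 8),
  Weight = c(1, 2, 6, 2, 1)
)

# Chained: the second call filters the result of the first,
# not the original data frame.
chained <- subset(toy, Height < 4)
chained <- subset(chained, Weight < 3)

# Combined: both conditions in one call, joined with &.
combined <- subset(toy, Height < 4 & Weight < 3)

print(combined)
```

Both approaches should keep only the rows that satisfy both conditions; the chained form is handy when you want to inspect the data after each step.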
Upvotes: 1