JL709

Reputation: 27

Subset removal when new variables are completed

I am working on subsetting multiple variables in a dataset to remove data points that are not useful. When I enter the subset command for the first variable and check the dataset, the variable has been properly subset. However, after doing the same with the second variable, the first is no longer subset in the dataset. It seems as though the second subset command is overriding the first. In the example below, the first variable (Height) is no longer subset once I subset the second variable (Weight). Any thoughts on how to resolve this?

rTestDataSet = TestDataSet
rTestDataSet = subset(TestDataSet, TestDataSet$Height < 4)
rTestDataSet = subset(TestDataSet, TestDataSet$Weight < 3)

Upvotes: 2

Views: 81

Answers (2)

DeduciveR

Reputation: 1702

Why not use tidyverse? Chain the operations together to create your own logic. Instead of subset you can use filter to get the rows you want conditionally:

library(tidyverse)
TestDataSet %>%
  filter(Height < 4) %>%
  filter(Weight < 3)

or

TestDataSet %>%
  filter(Height < 4 & Weight < 3)
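Note that the pipelines above only print the filtered rows. To keep the result, as the question does with rTestDataSet, it has to be assigned. A minimal sketch, assuming dplyr is installed and using made-up toy data (not the asker's dataset); conditions separated by commas inside filter are combined with a logical AND:

```r
library(dplyr)  # provides filter() and %>%; library(tidyverse) attaches it too

# Hypothetical toy data, for illustration only
TestDataSet <- data.frame(
  Height = c(1, 5, 3, 2, 8),
  Weight = c(1, 2, 6, 2, 1)
)

# Assign the filtered result so it is kept; the original stays untouched
rTestDataSet <- TestDataSet %>%
  filter(Height < 4, Weight < 3)

print(rTestDataSet)
```

Because both conditions are applied in one step, there is no intermediate result that a later call could accidentally overwrite.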

Upvotes: 1

user1658170

Reputation: 858

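You are applying both subsets to the original data: each call reads from TestDataSet, so the second assignment overwrites the result of the first. What you need to do is apply one subset, save it to a variable, and then apply the second subset to this new variable. Also, as already pointed out, you don't need the $ when using subset; column names can be used directly.

Try this. First, make some reproducible data: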

set.seed(50)
TestDataSet <- data.frame(Height = sample(1:10, 30, replace = TRUE),
                          Weight = sample(1:10, 30, replace = TRUE))

# Start from a copy, then subset the copy rather than the original
rTestDataSet <- TestDataSet
rTestDataSet <- subset(rTestDataSet, Height < 4)

rTestDataSet 
   Height Weight
3       3      5
6       1      7
9       1      4
10      2      5
12      3      9
14      1      1
15      3      1
19      1      8
20      2      9
22      2      8
28      3      6

rTestDataSet <- subset(rTestDataSet, Weight < 3)

rTestDataSet
   Height Weight
14      1      1
15      3      1
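The two steps above can also be collapsed into a single call by joining the conditions with &. A minimal base R sketch, using made-up toy data rather than the seeded sample above (the values here are hypothetical and chosen only to make the result easy to check):

```r
# Hypothetical toy data, for illustration only
TestDataSet <- data.frame(
  Height = c(1, 5, 3, 2, 8),
  Weight = c(1, 2, 6, 2, 1)
)

# Both conditions in one subset() call; a row is kept only if both are TRUE
rTestDataSet <- subset(TestDataSet, Height < 4 & Weight < 3)

# The same selection with logical indexing; here the $ is required
rTestDataSet2 <- TestDataSet[TestDataSet$Height < 4 & TestDataSet$Weight < 3, ]

print(rTestDataSet)
```

One difference to keep in mind: subset drops rows where the condition evaluates to NA, while logical indexing returns such rows filled with NA.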

Upvotes: 1
