Reputation: 27
I have this dataframe of 4 columns and I want to delete those rows which are not less than equal to difference of 100 to col_A
User Col_A Col_B Col_C
a 1450 1450 1250
b 1450 1450 1440
c 1320 1280 1225
d 1300 1300 1190
e 1295 1250 1200
f 1295 1295 1250
g 1295 1240 1000
h 1295 1295 1260
j 1295 1295 1260
k 1295 1295 1260
l 1295 1295 1190
m 1295 1295 1260
n 1250 1250 1220
o 1200 1200 1160
p 1180 1180 1070
q 1030 1030 990
I want to retain only those rows based on data in Column_C in my dataframe which are less then equal to difference of 100 to Column_A
User Col_A Col_B Col_C
b 1450 1450 1440
c 1320 1280 1225
e 1295 1250 1200
f 1295 1295 1250
h 1295 1295 1260
j 1295 1295 1260
k 1295 1295 1260
m 1295 1295 1260
n 1250 1250 1220
o 1200 1200 1160
q 1030 1030 990
I am trying to work with following code to filter the rows but unable to get the desired output.
df<- filter(df, Col_C<=Col_A & Col_C>=Col_A-100)
Thanks in advance.
Upvotes: 0
Views: 37
Reputation: 886948
Using data.table
library(data.table)
setDT(df1)[(Col_A - Col_C) <= 100]
df1 <- structure(list(User = structure(1:16, .Label = c("a", "b", "c",
"d", "e", "f", "g", "h", "j", "k", "l", "m", "n", "o", "p", "q"
), class = "factor"), Col_A = c(1450L, 1450L, 1320L, 1300L, 1295L,
1295L, 1295L, 1295L, 1295L, 1295L, 1295L, 1295L, 1250L, 1200L,
1180L, 1030L), Col_B = c(1450L, 1450L, 1280L, 1300L, 1250L, 1295L,
1240L, 1295L, 1295L, 1295L, 1295L, 1295L, 1250L, 1200L, 1180L,
1030L), Col_C = c(1250L, 1440L, 1225L, 1190L, 1200L, 1250L, 1000L,
1260L, 1260L, 1260L, 1190L, 1260L, 1220L, 1160L, 1070L, 990L)),
class = "data.frame", row.names = c(NA, -16L))
Upvotes: 1
Reputation: 388817
You could use subset
subset(df, (Col_A - Col_C) <= 100)
# User Col_A Col_B Col_C
#2 b 1450 1450 1440
#3 c 1320 1280 1225
#5 e 1295 1250 1200
#6 f 1295 1295 1250
#8 h 1295 1295 1260
#9 j 1295 1295 1260
#10 k 1295 1295 1260
#12 m 1295 1295 1260
#13 n 1250 1250 1220
#14 o 1200 1200 1160
#16 q 1030 1030 990
Or with filter
dplyr::filter(df, (Col_A - Col_C) <= 100)
data
df <- structure(list(User = structure(1:16, .Label = c("a", "b", "c",
"d", "e", "f", "g", "h", "j", "k", "l", "m", "n", "o", "p", "q"
), class = "factor"), Col_A = c(1450L, 1450L, 1320L, 1300L, 1295L,
1295L, 1295L, 1295L, 1295L, 1295L, 1295L, 1295L, 1250L, 1200L,
1180L, 1030L), Col_B = c(1450L, 1450L, 1280L, 1300L, 1250L, 1295L,
1240L, 1295L, 1295L, 1295L, 1295L, 1295L, 1250L, 1200L, 1180L,
1030L), Col_C = c(1250L, 1440L, 1225L, 1190L, 1200L, 1250L, 1000L,
1260L, 1260L, 1260L, 1190L, 1260L, 1220L, 1160L, 1070L, 990L)),
class = "data.frame", row.names = c(NA, -16L))
Upvotes: 1