Reputation: 15

Deleting observations in group categories in R using condition

While learning R, I ran into a new question. I have goods grouped into categories, and each good has a price. Is it possible to write a line of R so that an observation is removed from the analysis if its value exceeds the mean of its commodity category by more than 500,000? In other words, within every commodity category (the grouping variable), I need to remove the observations whose value is more than 500,000 above the group mean.

data = read.table(textConnection("
cat price
1   100000
1   200000
1   300000
1   400000
1   1000000
2   100000
2   200000
2   50000
2   100000
2   1000000
2   2000000
"), header = TRUE)

Upvotes: 0

Answers (3)

ulfelder

Reputation: 5335

Using dplyr:

library(dplyr)

data %>%
  group_by(cat) %>%
  filter(price - mean(price) <= 500000)

Result:

Source: local data frame [9 x 2]
Groups: cat [2]

    cat   price
  <int>   <int>
1     1  100000
2     1  200000
3     1  300000
4     1  400000
5     2  100000
6     2  200000
7     2   50000
8     2  100000
9     2 1000000

Upvotes: 1

akrun

Reputation: 887048

With data.table, we convert the 'data.frame' to a 'data.table' with setDT(data). Then, grouped by 'cat', we subset the rows of the Subset of Data.table (.SD) using the logical condition:

library(data.table)
setDT(data)[,  .SD[(price - mean(price)) <= 500000], cat]

Or we can use the row index (.I)

setDT(data)[data[,  .I[(price - mean(price)) <= 500000], cat]$V1]
#    cat   price
#1:   1  100000
#2:   1  200000
#3:   1  300000
#4:   1  400000
#5:   2  100000
#6:   2  200000
#7:   2   50000
#8:   2  100000
#9:   2 1000000
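
The `$V1` in the second form can look opaque. Here is a minimal sketch of what the inner call returns, assuming the data.table package is installed and rebuilding the question's sample data by hand. The names `dt`, `idx` and `kept` are only illustrative.

```r
library(data.table)

# Sample data from the question, rebuilt as a data.table
dt <- data.table(
  cat   = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  price = c(100000, 200000, 300000, 400000, 1000000,
            100000, 200000, 50000, 100000, 1000000, 2000000)
)

# .I holds the row numbers in the full table; subsetting it inside each
# group keeps only the row numbers that pass the condition. The unnamed
# result column is automatically called V1.
idx <- dt[, .I[(price - mean(price)) <= 500000], by = cat]
print(idx)

# Use those row numbers to pull the surviving rows from the full table
kept <- dt[idx$V1]
print(kept)
```

Indexing by `.I` avoids materialising `.SD` for each group, which tends to matter only on larger tables.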

Upvotes: 1

Erdem Akkas

Reputation: 2070

With base:

subset(data, !price > (ave(price, cat) + 500000))

Result:

   cat   price
1    1  100000
2    1  200000
3    1  300000
4    1  400000
6    2  100000
7    2  200000
8    2   50000
9    2  100000
10   2 1000000
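
To see why rows 5 and 11 are the ones dropped, it can help to look at the intermediate values. This is a base R sketch on the question's sample data; the names `group_mean`, `limit`, `check` and `result` are illustrative and not part of the answer above.

```r
# Sample data from the question
data <- read.table(text = "
cat price
1   100000
1   200000
1   300000
1   400000
1   1000000
2   100000
2   200000
2   50000
2   100000
2   1000000
2   2000000
", header = TRUE)

# ave() returns the group mean repeated for every row of that group
group_mean <- ave(data$price, data$cat)

# Upper limit per row: group mean plus the 500000 margin from the question
limit <- group_mean + 500000

# Put the intermediate values side by side for inspection
check <- cbind(data, group_mean, limit, keep = data$price <= limit)
print(check)

# Keep only the rows at or below their group's limit
result <- data[data$price <= limit, ]
print(result)
```

The group means are 400000 for category 1 and 575000 for category 2, so the limits are 900000 and 1075000. That is why the 1000000 row is removed in category 1 but kept in category 2.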

Upvotes: 4
