Sebastian Zeki

Reputation: 6874

Remove rows from a dataframe based on a value in one column

I have a dataframe (imported from a csv file) as follows

moose     loose     hoose
   2        3         8
   1        3         4
   5        4         2
   10       1         4

The R code should add a column with the mean of each row, and then I would like to remove all rows where that mean is < 4. After adding the mean column, the data should look like this:

 moose     loose     hoose     mean 
   2        3         8        4.3
   1        3         4        2.7
   5        4         2        3.7
   10       1         4         5

which should then end up as:

  moose     loose     hoose    mean 
    2        3         8       4.3
    10       1         4        5

How can I do this in R?

Upvotes: 0

Views: 957

Answers (3)

Rich Scriven

Reputation: 99321

You could also use within(), which lets you add or remove columns and returns the transformed data. Starting with df:

> df
#  moose loose hoose
#1     2     3     8
#2     1     3     4
#3     5     4     2
#4    10     1     4

> within(d <- df[rowMeans(df) >= 4, ], { means <- round(rowMeans(d), 1) })
#  moose loose hoose means
#1     2     3     8   4.3
#4    10     1     4   5.0

Upvotes: 0

akrun

Reputation: 886938

 dat2 <- subset(transform(dat1, Mean=round(rowMeans(dat1),1)), Mean >=4)
 dat2
  # moose loose hoose Mean
 #1     2     3     8  4.3
 #4    10     1     4  5.0

Using data.table

 setDT(dat1)[, Mean:=rowMeans(.SD)][Mean>=4]
 #  moose loose hoose     Mean
 #1:     2     3     8 4.333333
 #2:    10     1     4 5.000000

Upvotes: 2

ilir

Reputation: 3224

I will assume your data is called d. Then you run:

d$mean <- rowMeans(d)  ## create a new column with the mean of each row
d <- d[d$mean >= 4, ]  ## filter the data using this column in the condition

I suggest you read about creating variables in a data.frame and about filtering data. These are very common operations that come up in many contexts.
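For reference, here is a minimal, self-contained sketch of the same two steps using the numbers from the question. The names value_cols and filtered are only illustrative and not part of the original answer. Naming the value columns explicitly is a small precaution: if the script is run a second time on a data frame that already has a mean column, rowMeans(d) would average that column in as well.

```r
# Rebuild the example data from the question (values copied from its table).
d <- data.frame(
  moose = c(2, 1, 5, 10),
  loose = c(3, 3, 4, 1),
  hoose = c(8, 4, 2, 4)
)

# Illustrative name: the columns that should go into the row mean.
value_cols <- c("moose", "loose", "hoose")

# Step 1: add a column holding the mean of each row.
d$mean <- rowMeans(d[, value_cols])

# Step 2: keep only the rows whose mean is at least 4
# (that is, drop the rows where the mean is < 4).
filtered <- d[d$mean >= 4, ]

print(filtered)
```

The result is kept in a separate object here so the unfiltered d stays available for checking; assigning back to d, as above, works just as well. Wrap the mean in round(..., 1) if you want it displayed to one decimal place.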

Upvotes: 1
