Emmanuel

Reputation: 19

Filtering data based on a condition

I have the following data frame:

mydata <- data.frame(Farmer = c(1,2,3,4,5,6,7,8),
              Farmer_Year = seq(2009,2016,1),
              Total_Output = c(560,290,458,612,450,380,500,290),
              Vegetable_Out = c(354,120,330,260,380,2020,357,95))

I want to select only the farmers whose vegetable output exceeds 60% of their total output. How do I do this in R?

Upvotes: 1

Views: 150

Answers (4)

OverflowStacker

Reputation: 1338

You can try this array formula (enter it with CTRL + SHIFT + ENTER):

=IFERROR(INDEX($A$1:$D$11,SMALL(IF($D$2:$D$11/$C$2:$C$11>0.6,ROW($A$2:$A$11)-1),ROW(A2)),COLUMN(A1)),"")

UPDATE:

This question was previously tagged excel-formula, hence the Excel answer.

Upvotes: 1

Choc_waffles

Reputation: 537

Please provide code that reproduces your example. Here is a base R solution that does not load any packages:

Farmer <- c(1, 2, 3, 4, 5, 6, 7, 8)
year <- c(2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016)
`Total output` <- c(560, 290, 458, 612, 445, 380, 500, 290)
`Vegetable Out` <- c(334, 120, 330, 260, 380, 200, 357, 95)

# data.frame() converts the names to Total.output and Vegetable.Out
df <- data.frame(Farmer, year, `Total output`, `Vegetable Out`)

# Keep the rows where vegetable output exceeds 60% of total output
df[df$Vegetable.Out / df$Total.output > 0.6, ]

Results

  Farmer year Total.output Vegetable.Out
3      3 2011          458           330
5      5 2013          445           380
7      7 2015          500           357
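
For the data frame exactly as posted in the question (columns Total_Output and Vegetable_Out), the same filter could be sketched with subset(). This is only an illustration, and the name high_veg is made up here. One difference worth knowing: subset() drops rows where the condition is NA, while indexing with [ returns them as all-NA rows.

```r
# Data frame exactly as posted in the question
mydata <- data.frame(Farmer = c(1, 2, 3, 4, 5, 6, 7, 8),
                     Farmer_Year = seq(2009, 2016, 1),
                     Total_Output = c(560, 290, 458, 612, 450, 380, 500, 290),
                     Vegetable_Out = c(354, 120, 330, 260, 380, 2020, 357, 95))

# Keep farmers whose vegetable output exceeds 60% of total output.
# high_veg is an illustrative name, not something from the question.
high_veg <- subset(mydata, Vegetable_Out / Total_Output > 0.6)
high_veg
```

With the values as posted, this keeps farmers 1, 3, 5, 6 and 7.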

Upvotes: 0

B. Christian Kamgang

Reputation: 6529

Using the data.table package:

library(data.table)
dt <- as.data.table(mydata)  # data.table copy of the question's data frame
dt[Vegetable_Out / Total_Output > 0.6]
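
If it helps to see the share itself, one possible variation adds it as a column by reference before filtering. This is a sketch on the question's mydata; the column name Veg_Share and the result name high_veg are assumptions for illustration.

```r
library(data.table)

# Data frame exactly as posted in the question
mydata <- data.frame(Farmer = c(1, 2, 3, 4, 5, 6, 7, 8),
                     Farmer_Year = seq(2009, 2016, 1),
                     Total_Output = c(560, 290, 458, 612, 450, 380, 500, 290),
                     Vegetable_Out = c(354, 120, 330, 260, 380, 2020, 357, 95))

# Work on a data.table copy so mydata itself is left unchanged
dt <- as.data.table(mydata)

# Add the vegetable share as a new column by reference (Veg_Share is a made-up name)
dt[, Veg_Share := Vegetable_Out / Total_Output]

# Keep the rows where the share exceeds 60%
high_veg <- dt[Veg_Share > 0.6]
high_veg
```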

Upvotes: 0

Biblot

Reputation: 705

I believe this would work, using the dplyr package.

library(dplyr)

mydata %>%
  filter(Vegetable_Out / Total_Output > 0.6)

In the future, please read how to create a minimal reproducible example and share your data in a form that is directly usable in R, so that it is easier to help you.

Also, it would be useful to read the dplyr documentation, since subsets are a very basic operation on data frames.
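
If the threshold is likely to change, the filter could be wrapped in a small helper. This is a sketch only: filter_by_share and its threshold argument are hypothetical names, and the column names are those of the question's mydata.

```r
library(dplyr)

# Data frame exactly as posted in the question
mydata <- data.frame(Farmer = c(1, 2, 3, 4, 5, 6, 7, 8),
                     Farmer_Year = seq(2009, 2016, 1),
                     Total_Output = c(560, 290, 458, 612, 450, 380, 500, 290),
                     Vegetable_Out = c(354, 120, 330, 260, 380, 2020, 357, 95))

# Hypothetical helper: keep rows whose vegetable share exceeds `threshold`
filter_by_share <- function(data, threshold = 0.6) {
  data %>%
    filter(Vegetable_Out / Total_Output > threshold)
}

filter_by_share(mydata)        # default 60% cut-off
filter_by_share(mydata, 0.8)   # stricter 80% cut-off
```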

Upvotes: 1
