bnassler
bnassler

Reputation: 651

Find Row where two columns are minimum

I want to find the rows where for the first time value < 0.
I tried with grouping, but then of course the columns are independent

Data:

Date          Value      Name
2017-01-01    0          BlueCar
2017-01-02    4          BlueCar
2017-01-03    4          BlueCar
2017-01-04    0          BlueCar
2017-01-01    4          RedCar
2017-01-02    0          RedCar
2017-01-03    1          RedCar

Desired result:

Date          Value      Name
2017-01-01    0          BlueCar
2017-01-02    0          RedCar

Note that BlueCar has two rows with Value: 0. In my frame that happens a couple times. I want to find the row where that happens for the first time for every name

Update: Has to be robust against unordered Date. Of course you could order beforehand but - you know ;-) were lazy

Upvotes: 2

Views: 64

Answers (3)

akrun
akrun

Reputation: 887118

We convert the 'data.frame' to 'data.table' (setDT(df1)), order by 'Name', 'Date, grouped by 'Name' and specifying the logical condition (Value <=0 in 'i', get the first row (head(.SD, 1)) and finally the set the column order as that of original dataset with setcolorder

library(data.table)
setcolorder(setDT(df1)[order(Name, Date)][Value <=0, head(.SD, 1), Name], names(df1))[]
#         Date Value    Name
#1: 2017-01-01     0 BlueCar
#2: 2017-01-02     0  RedCar

Or a base R only solution

df1[with(df1, tapply(seq_along(Value) *(Value <= 0), Name,
          FUN = function(x) x[x!=0][1]) ),]
#        Date Value    Name
#1 2017-01-01     0 BlueCar
#6 2017-01-02     0  RedCar

Upvotes: 0

Paul Hiemstra
Paul Hiemstra

Reputation: 60934

The following solution uses dplyr:

mtcars %>% group_by(Name) %>% filter(Value <= 0) %>% top_n(1, Date)

where group_by obviously splits up the data by groups, and top_n get's back the first element. In your case you need this to be Date probably.

Upvotes: 1

Spacedman
Spacedman

Reputation: 94182

Find all the zero values, then take the first of each name by using !duplicated:

> d0 = data[data$Value<=0,]
> d0[!duplicated(d0$Name),]
        Date Value    Name
1 2017-01-01     0 BlueCar
6 2017-01-02     0  RedCar

Uses only base packages and not dplyr, data table or anything else.

Upvotes: 1

Related Questions