Reputation: 817
I am having trouble with a data-cleaning task that should be straightforward, and I would really appreciate some help.
I have a df with dates in the first column and different categories (Red, Yellow, Orange) as column names. The rest of the df is numeric. I would like to search through all elements of the df (other than the first column); if an element is greater than a threshold (> 0.2, for instance), I would like to return the date and the column name. So ideally, my output would be an n x 2 df with one row per match, holding the date and the column name.
Is there a clear best method for this? I have read about which(), subset() and %in% and have come away without a clear answer.
Thanks again.
Upvotes: 0
Views: 75
Reputation: 887901
Maybe this, using reshape2:
library(reshape2)
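# melt to long format (date, variable, value), keep rows above the threshold,
# and return only the first two columns (date and variable)
subset(melt(df, id.vars = 'date'), value > 0.2, select = 1:2)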
Or using dplyr/tidyr
library(dplyr)
library(tidyr)
gather(df, Var, Val, -date) %>%
  filter(Val > 0.2) %>%
  select(-Val)
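For readers on a recent tidyr (1.0.0 or later), where gather() is documented as superseded, the same reshape-then-filter idea can be sketched with pivot_longer(). This is only a sketch under that version assumption; the names `threshold` and `result` are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example data as in this answer, rebuilt so the snippet is self-contained
df <- data.frame(
  date   = c("November, 2003", "October, 1997", "September, 2005"),
  Red    = c(0.1, 0.2, 0.3),
  Yellow = c(0.3, 0.4, 0.2),
  Orange = c(0.1, 0.2, 0.4),
  stringsAsFactors = FALSE
)

threshold <- 0.2  # illustrative name, not from the question

result <- df %>%
  # one row per (date, category) pair
  pivot_longer(-date, names_to = "Var", values_to = "Val") %>%
  # keep only the values strictly above the threshold
  filter(Val > threshold) %>%
  # drop the values, leaving date and column name
  select(-Val)

result
```

The result has two columns, date and Var, with one row for each cell above the threshold.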
# Example data used in the code above
df <- structure(list(date = c("November, 2003", "October, 1997",
    "September, 2005"), Red = c(0.1, 0.2, 0.3), Yellow = c(0.3, 0.4, 0.2),
    Orange = c(0.1, 0.2, 0.4)), .Names = c("date", "Red", "Yellow", "Orange"),
    row.names = c(NA, -3L), class = "data.frame")
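Since the question mentions which(), here is a possible base R sketch that needs no extra packages. It relies on which(..., arr.ind = TRUE) returning the row and column positions of every matching cell. The names `threshold`, `vals`, `idx` and `result` are made up for illustration, and the example data is rebuilt so the snippet runs on its own.

```r
# Same example data, rebuilt so the snippet is self-contained
df <- data.frame(
  date   = c("November, 2003", "October, 1997", "September, 2005"),
  Red    = c(0.1, 0.2, 0.3),
  Yellow = c(0.3, 0.4, 0.2),
  Orange = c(0.1, 0.2, 0.4),
  stringsAsFactors = FALSE
)

threshold <- 0.2  # illustrative name, not from the question

# Numeric part of the data frame (everything except the date column)
vals <- as.matrix(df[-1])

# Two-column matrix ("row", "col") with the position of each matching cell
idx <- which(vals > threshold, arr.ind = TRUE)

# Look up the date for each matching row and the name of each matching column
result <- data.frame(
  date     = df$date[idx[, "row"]],
  variable = colnames(vals)[idx[, "col"]],
  stringsAsFactors = FALSE
)

result
```

Matches come back in column order (all of Red, then Yellow, then Orange) rather than in date order, so sort the result afterwards if the order matters.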
Upvotes: 1