Reputation: 1683
I have a data frame I would like to aggregate, first removing the rows that have NA (or keeping only the unique rows) in the column I want to aggregate by.
For example, in the following I want to remove every row of the data frame in which week is NA, and keep the others unmodified:
  OTHER_REV month quarter year       week       date       days daysinmonth
1   2785013     1 2009 Q1 2009 2009-01-05 2009-01-05 2009-01-05          31
2   2785013     1 2009 Q1 2009 2009-01-12 2009-01-05 2009-01-05          31
3   2785013     1 2009 Q1 2009 2009-01-19 2009-01-05 2009-01-05          31
4   2785013     1 2009 Q1 2009 2009-01-26 2009-01-05 2009-01-05          31
5   2785013     1  NA QNA 2009       <NA> 2009-01-16 2009-01-16          31
6   2785013     1  NA QNA 2009       <NA> 2009-01-17 2009-01-17          31
Producing:
  OTHER_REV month quarter year       week       date       days daysinmonth
1   2785013     1 2009 Q1 2009 2009-01-05 2009-01-05 2009-01-05          31
2   2785013     1 2009 Q1 2009 2009-01-12 2009-01-05 2009-01-05          31
3   2785013     1 2009 Q1 2009 2009-01-19 2009-01-05 2009-01-05          31
4   2785013     1 2009 Q1 2009 2009-01-26 2009-01-05 2009-01-05          31
I have tried a combination of grep and unique(data$stuff), and I have also tried aggregate, but neither approach worked.
The following is str of the data:
'data.frame': 1896 obs. of 34 variables:
$ OTHER_REV : num 2785013 2785013 2785013 2785013 2785013 ...
$ month : num 1 1 1 1 1 1 1 1 1 1 ...
$ quarter :Class 'yearqtr' num [1:1896] 2009 2009 2009 2009 NA ...
$ year : num 2009 2009 2009 2009 2009 ...
$ week : Date, format: "2009-01-05" "2009-01-12" "2009-01-19" "2009-01-26" ...
$ date : Date, format: "2009-01-05" "2009-01-05" "2009-01-05" "2009-01-05" ...
$ days : Date, format: "2009-01-05" "2009-01-05" "2009-01-05" "2009-01-05" ...
$ daysinmonth : int 31 31 31 31 31 31 31 31 31 31 ...
Calling unique on df$week produces:
[1] "2009-01-05" "2009-01-12" "2009-01-19" "2009-01-26" NA "2009-02-02"......
Upvotes: 0
Views: 164
Reputation: 2361
Try this:
data[ ! is.na(data$week), ]
If data is a data.table rather than a plain data frame, the equivalent is a little simpler, because column names can be used directly inside the brackets:
data[ ! is.na(week) ]
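To see the base R version working end to end, here is a minimal sketch on a toy data frame. The name df and its values are made up for illustration, loosely modeled on the sample rows in the question, and only three of the columns are included. It also shows subset() as an equivalent alternative, which should give the same rows.

```r
# Toy data frame (hypothetical values modeled on the question's sample)
df <- data.frame(
  OTHER_REV = rep(2785013, 6),
  week = as.Date(c("2009-01-05", "2009-01-12", "2009-01-19",
                   "2009-01-26", NA, NA)),
  date = as.Date(c("2009-01-05", "2009-01-05", "2009-01-05",
                   "2009-01-05", "2009-01-16", "2009-01-17"))
)

# is.na(df$week) is TRUE for the rows to drop; negate it to keep the rest.
# The trailing comma means "all columns".
kept <- df[!is.na(df$week), ]

# Equivalent alternative: subset() evaluates the condition inside df
kept2 <- subset(df, !is.na(week))

nrow(kept)  # 4
```

Note that the trailing comma matters for a plain data frame: without it, df[!is.na(df$week)] would try to select columns rather than rows.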
Upvotes: 6