Reputation: 383
I have a dataframe that looks something like
      date        id      params
123   2016-03-02  0A122B  23.7
124   2016-03-03  0A122B  25.5
125   2016-03-04  0A122B  29.7
126   2016-03-07  0A122B  26.4
...
456   2016-03-02  3B778C  1050
457   2016-03-03  3B778C  1350
458   2016-03-04  3B778C  2900
...
1255  2016-03-02  5D898F  135.88
1256  2016-03-03  5D898F  189.55
1257  2016-03-04  5D898F  205.22
1258  2016-03-07  5D898F  278.35
1259  2016-03-08  5D898F  145.64
Each unique id has several rows, each with a date and a params value. Note that the number of rows per id can differ. For example, 0A122B may have only 48 rows, while 5D898F may have 1255.
I'd like to know a way to remove all rows for any id (e.g. 0A122B) whose total number of rows is less than some number, say 50, and to apply this check to every id.
Upvotes: 0
Views: 49
Reputation: 23146
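Try groupby with transform, which gives each row the row count of its id so it can be used as a boolean mask. Use >= so that an id with exactly 50 rows is kept, since only ids with fewer than 50 should be removed:
output = df[df.groupby("id")["date"].transform("count") >= 50]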
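For anyone who wants to try this on a small frame first, here is a minimal sketch of the same groupby/transform idea. The toy data, the ids "A" and "B", and the name min_rows are made up for illustration; min_rows stands in for the 50 from the question. It uses "size" rather than "count" on the assumption that rows with a missing date should still be counted, and it uses >= so that an id with exactly min_rows rows is kept (switch to > if a strict bound is intended).

```python
import pandas as pd

# Toy data (hypothetical): id "A" has 3 rows, id "B" has only 1 row.
df = pd.DataFrame({
    "date": ["2016-03-02", "2016-03-03", "2016-03-04", "2016-03-02"],
    "id": ["A", "A", "A", "B"],
    "params": [23.7, 25.5, 29.7, 1050.0],
})

min_rows = 2  # stand-in for the threshold of 50 in the question

# transform("size") returns, for every row, the number of rows in its id group,
# aligned with df's index, so it can be used directly as a boolean mask.
group_sizes = df.groupby("id")["date"].transform("size")
output = df[group_sizes >= min_rows]

# Alternative with the same result: filter() keeps or drops whole groups.
# It calls a Python function per group, so it may be slower with many ids.
output_alt = df.groupby("id").filter(lambda g: len(g) >= min_rows)

print(output)
```

Both versions keep the original index and row order, so the result can be compared against the input directly.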
Upvotes: 1