sums22

Reputation: 2033

Calculating the range of values in a Pandas DataFrame using groupby function

I have a dataset with a feature 'abdomcirc' that has multiple values per ChildID, like so:

    ChildID     abdomcirc
0   1           273
1   1           267
2   1           294
3   2           136
4   2           248

I want to calculate the range (max minus min) of the abdomcirc values for each ChildID. So I want to get these results:

    ChildID     range
0   1           27
1   2           112

So I first tried this:

df["range"] = df.groupby('ChildID')["abdomcirc"].transform('range')

But I got this error: ValueError: 'range' is not a valid function name for transform(name)

So, as suggested in the answer to this question, I tried the following line:

df["range"] = df.groupby('ChildID').apply(lambda x: x.High.max() - x.Low.min())

But I got this error: AttributeError: 'DataFrame' object has no attribute 'High'

I'm not sure why I am getting this error. Any suggestions on how to calculate the range of a group of values in a DataFrame?

Upvotes: 0

Views: 732

Answers (2)

Akhilesh_IN

Reputation: 1317

Your df has no column named High (or Low); those names come from the other question's data. Replace them with your own column:

df.groupby("ChildID").apply(lambda x: x['abdomcirc'].max() - x['abdomcirc'].min())
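To get the exact two-column layout from the question, something like the following should work. This is only a sketch: the sample data is rebuilt from the question, and the names `ranges` and `result` are just illustrative. The groupby result is a Series indexed by ChildID, so `reset_index(name="range")` turns it into a DataFrame with a named column.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "ChildID": [1, 1, 1, 2, 2],
    "abdomcirc": [273, 267, 294, 136, 248],
})

# Select the column first so the lambda only receives the abdomcirc values
ranges = df.groupby("ChildID")["abdomcirc"].apply(lambda s: s.max() - s.min())

# The Series is indexed by ChildID; move it back into a column named "range"
result = ranges.reset_index(name="range")
print(result)
```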

Upvotes: 1

BENY

Reputation: 323226

NumPy has a function for this, numpy.ptp ("peak to peak", i.e. max minus min):

import numpy as np

s = df.groupby('ChildID')['abdomcirc'].apply(np.ptp).to_frame('range').reset_index()
s
Out[75]: 
   ChildID  range
0        1     27
1        2    112
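As a quick sanity check on what np.ptp computes, here is a small sketch on plain arrays (the values are the two groups from the question; the variable names are just illustrative):

```python
import numpy as np

child_1 = np.array([273, 267, 294])
child_2 = np.array([136, 248])

# ptp = "peak to peak": the maximum minus the minimum of the array
range_1 = np.ptp(child_1)
range_2 = np.ptp(child_2)

print(range_1, child_1.max() - child_1.min())
print(range_2, child_2.max() - child_2.min())
```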

To fix your code, use your own column name instead of High and Low:

df.groupby('ChildID').apply(lambda x: x.abdomcirc.max() - x.abdomcirc.min())
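If the goal is a "range" column on the original df (one value per row, as in the first attempt in the question), a lambda passed to transform should do it. This is a sketch on data rebuilt from the question. 'range' is not one of the function names transform accepts as a string, which matches the ValueError above, so the max minus min is spelled out. Note also that, as far as I can tell, assigning the apply result straight to df["range"] aligns on the index (ChildID values against row labels), so most rows would end up NaN.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "ChildID": [1, 1, 1, 2, 2],
    "abdomcirc": [273, 267, 294, 136, 248],
})

# transform returns a result with the same index as df,
# so each row receives the range of its own ChildID group
df["range"] = (
    df.groupby("ChildID")["abdomcirc"]
      .transform(lambda s: s.max() - s.min())
)
print(df)
```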

Upvotes: 2
