Unet Model

Reputation: 11

How to get the first value of every column for consecutive groups, and the max within different bins of each group, in a pandas DataFrame?

I have a dataframe as shown here: pandas dataframe

I'm grouping by consecutive runs of the 'Name' column to get the total count and the consecutive count, and applying min and max to the 'Age' column, which gives a dataframe like this: groupby & min, max, total count, cons count
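
(To make the grouping explicit, here is a minimal sketch of just the consecutive-group key, on the first few rows of the data; the full code is further below.)

import pandas as pd

# just the first few rows of the data used in the full code below
df = pd.DataFrame({'Name': ['tom', 'tom', 'nick', 'juli', 'tom'],
                   'Age':  [10, 5, 15, 14, 20]})

# a new group starts whenever 'Name' differs from the previous row
group = df['Name'].ne(df['Name'].shift()).cumsum()
print(group.tolist())  # [1, 1, 2, 3, 4]

# per-run aggregates, broadcast back onto every row of the run
df['min'] = df.groupby(group)['Age'].transform('min')
df['max'] = df.groupby(group)['Age'].transform('max')
df['cons'] = df.groupby(group)['Name'].transform('size')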

Then I'm taking only the first value of every column for each consecutive group: first values

Then I'm trying to get, from each consecutive group, the row holding the max 'Age' within the 5-20 bin, and to concat that dataframe with the dataframe of first values. But the output I get is: concat df
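
(A minimal sketch of the per-group check I'm attempting, continuing the small example above; grp and in_bin are throwaway names just for illustration.)

# within one consecutive run: index of the max 'Age' that falls in the 5-20 bin
grp = df[group == 1]
in_bin = grp['Age'][(grp['Age'] >= 5) & (grp['Age'] < 20)]
idx = in_bin.idxmax() if not in_bin.empty else None  # None when nothing falls in the bin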

But the expected output is:

expected output

Also, this is for a single bin (5-20). How can I extend it to more than one bin? For example, if one bin is 5-20 and the next is 25-40, the expected output is: more bins output

This is the code I have written for the above outputs:

import numpy as np
import pandas as pd

# initialize list of lists
data = [['tom', 10], ['tom', 5], ['nick', 15], ['juli', 14], ['tom', 20],['tom', 10], ['tom', 10], ['juli', 17], ['tom', 30], ['nick', 19], ['juli', 24], ['juli', 29],['tom', 0], ['juli', 76]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Age'])


# print dataframe.
print("df = ",df)
print("")

# acquire min, max, count, consecutive same names
df['min'] = df.groupby(df['Name'].ne(df['Name'].shift()).cumsum())['Age'].transform('min')
df['max'] = df.groupby(df['Name'].ne(df['Name'].shift()).cumsum())['Age'].transform('max')
df['count'] = df.groupby("Name",sort=False)['Name'].transform('count')
df['cons'] = df.groupby(df['Name'].ne(df['Name'].shift()).cumsum())['Name'].transform('size')

print(df)

# take the first row's values of every consecutive group
df_t = df
temp_df=df.groupby(df['Name'].ne(df['Name'].shift()).cumsum(),as_index=False)[df.columns].agg('first')

print("")
print("temp_df = ",temp_df)

df_t = df_t.reset_index()
df_t = df_t.drop(['index'], axis=1)
print("df_t = ", df_t)

# index of the max 'Age' within the 5-20 bin for every consecutive group
df_t1 = df_t.groupby(df_t['Name'].ne(df_t['Name'].shift()).cumsum(), as_index=False).apply(lambda x: x['Age'][(x['Age'] >= 5) & (x['Age'] < 20)].agg(lambda y: y.idxmax()))
print("")
print("df_t1 = ", df_t1)

# keep only valid (integer) indices; groups with no value in the bin yield NaN
a = df_t1.tolist()
b = []
c = np.array([2]).astype('int64')
for i in a:
    if type(i) == type(c[0]):
        b.append(i)

df_t1 = df_t.iloc[b]
print("")
print("output df_t1 = ", df_t1)

# concat the bin max and first value df
concatdf = pd.concat([temp_df, df_t1],axis=1)
print("")
print("concatdf = ", concatdf)

Thank you in advance :)

Upvotes: 1

Views: 183

Answers (1)

mozway

Reputation: 260790

You can greatly simplify your code by using a single groupby for almost all indicators, except the total count.

Then just mask your data according to your criterion and concatenate.

I believe this is doing what you want:

group = df['Name'].ne(df['Name'].shift()).cumsum()

df2 = (df
   .groupby(group, as_index=False)
   .agg(**{'Name': ('Name', 'first'),
           'Age': ('Age', 'first'),
           'min': ('Age', 'min'),
           'max': ('Age', 'max'),
           'cons': ('Age', 'count')
            })
   .assign(count=lambda d: d.groupby('Name')['cons'].transform('sum'))
)

out = pd.concat([df2, df2.where(df2['max'].between(5,20))], axis=1)

output:

   Name  Age  min  max  cons  count  Name   Age   min   max  cons  count
0   tom   10    5   10     2      7   tom  10.0   5.0  10.0   2.0    7.0
1  nick   15   15   15     1      2  nick  15.0  15.0  15.0   1.0    2.0
2  juli   14   14   14     1      5  juli  14.0  14.0  14.0   1.0    5.0
3   tom   20   10   20     3      7   tom  20.0  10.0  20.0   3.0    7.0
4  juli   17   17   17     1      5  juli  17.0  17.0  17.0   1.0    5.0
5   tom   30   30   30     1      7   NaN   NaN   NaN   NaN   NaN    NaN
6  nick   19   19   19     1      2  nick  19.0  19.0  19.0   1.0    2.0
7  juli   24   24   29     2      5   NaN   NaN   NaN   NaN   NaN    NaN
8   tom    0    0    0     1      7   NaN   NaN   NaN   NaN   NaN    NaN
9  juli   76   76   76     1      5   NaN   NaN   NaN   NaN   NaN    NaN
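
Note that where keeps the shape and index of df2 and just masks the non-matching rows with NaN, which is what keeps the two halves row-aligned in the concat. A tiny illustration of that behaviour:

s = pd.Series([10, 30, 15])
print(s.where(s.between(5, 20)))
# 0    10.0
# 1     NaN
# 2    15.0
# dtype: float64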

For more bins:

bins = [(5,20), (25,40)]
out = pd.concat([df2]+[df2.where(df2['max'].between(a,b)) for a,b in bins], axis=1)

output:

   Name  Age  min  max  cons  count  Name   Age   min   max  cons  count  Name   Age   min   max  cons  count
0   tom   10    5   10     2      7   tom  10.0   5.0  10.0   2.0    7.0   NaN   NaN   NaN   NaN   NaN    NaN
1  nick   15   15   15     1      2  nick  15.0  15.0  15.0   1.0    2.0   NaN   NaN   NaN   NaN   NaN    NaN
2  juli   14   14   14     1      5  juli  14.0  14.0  14.0   1.0    5.0   NaN   NaN   NaN   NaN   NaN    NaN
3   tom   20   10   20     3      7   tom  20.0  10.0  20.0   3.0    7.0   NaN   NaN   NaN   NaN   NaN    NaN
4  juli   17   17   17     1      5  juli  17.0  17.0  17.0   1.0    5.0   NaN   NaN   NaN   NaN   NaN    NaN
5   tom   30   30   30     1      7   NaN   NaN   NaN   NaN   NaN    NaN   tom  30.0  30.0  30.0   1.0    7.0
6  nick   19   19   19     1      2  nick  19.0  19.0  19.0   1.0    2.0   NaN   NaN   NaN   NaN   NaN    NaN
7  juli   24   24   29     2      5   NaN   NaN   NaN   NaN   NaN    NaN  juli  24.0  24.0  29.0   2.0    5.0
8   tom    0    0    0     1      7   NaN   NaN   NaN   NaN   NaN    NaN   NaN   NaN   NaN   NaN   NaN    NaN
9  juli   76   76   76     1      5   NaN   NaN   NaN   NaN   NaN    NaN   NaN   NaN   NaN   NaN   NaN    NaN
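
If you want to tell the bin blocks apart, one option (just a sketch building on the code above, with arbitrary labels) is to pass keys to pd.concat so each block gets a top-level column label:

bins = [(5, 20), (25, 40)]
out = pd.concat(
    [df2] + [df2.where(df2['max'].between(a, b)) for a, b in bins],
    axis=1,
    keys=['all'] + [f'{a}-{b}' for a, b in bins],  # top-level label for each block of columns
)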

Upvotes: 0
