How to create new dataframe from intervals of the dataframe and map column values to it?

Question

I have a dataframe as follows:

data = {
    'Sample': ["036C", "023C", "001C1", "022C", "042C", "007C", "009C", "004C", "030C", "003C", "010C", "034C", "044C", "013C"],
    'Aware': [75,49,59,75,59,55,82,52,55,54,81,81,49,55],
    'Social': [76,65,88,72,78,76,81,65,72,77,81,81,76,77],
    'SRS': [81,81,83,80,80,81,85,88,86,80,81,81,90,90],
    'Moti': [87,70,82,88,80,80,89,70,88,90,90,90,90,90],
    'Manner': [53,58,76,78,80,80,88,90,90,90,90,90,90,90],
    'Total': [79,73,84,82,82,80,80,82,83,78,90,90,90,90]
}
df = pd.DataFrame(data)

df

    Sample  Aware   Social  SRS Moti    Manner  Total
0   036C    75      76      81  87      53      79
1   023C    49      65      81  70      58      73
2   001C1   59      88      83  82      76      84
3   022C    75      72      80  88      78      82
4   042C    59      78      80  80      80      82
5   007C    55      76      81  80      80      80
6   009C    82      81      85  89      88      80
7   004C    52      65      88  70      90      82
8   030C    55      72      86  88      90      83
9   003C    54      77      80  90      90      78
10  010C    81      81      81  90      90      90
11  034C    81      81      81  90      90      90
12  044C    49      76      90  90      90      90
13  013C    55      77      90  90      90      90

I want to create a new dataframe with specific interval <50 50-59 60-69 70-79 80-89 >90 and map df['Sample'] to that: the expected output is:

            <50         50-59                                 60-69   70-79        80-89            >=90
Awareness   023C,044C   004C,003C,007C,030C,013C,042C,001C1   0       022C,036C    010C,034C,009C   0

for all column values.

Shubham Sharma · Accepted Answer

Melt the dataframe, then categorize the value column into discrete intervals using pd.cut, then pivot using aggfunc join to reshape and aggregate the values in Sample

s = df.melt('Sample')
s['value'] = pd.cut(s['value'], bins=np.r_[-np.inf, 50:100:10, np.inf], right=False)
s = s.pivot_table('Sample', 'variable', 'value', aggfunc=', '.join).fillna(0)

Result

variable	[-inf, 50.0)	[50.0, 60.0)	[60.0, 70.0)	[70.0, 80.0)	[80.0, 90.0)	[90.0, inf)
Aware	023C, 044C	001C1, 042C, 007C, 004C, 030C, 003C, 013C	0	036C, 022C	009C, 010C, 034C	0
Manner	0	036C, 023C	0	001C1, 022C	042C, 007C, 009C	004C, 030C, 003C, 010C, 034C, 044C, 013C
Moti	0	0	0	023C, 004C	036C, 001C1, 022C, 042C, 007C, 009C, 030C	003C, 010C, 034C, 044C, 013C
SRS	0	0	0	0	036C, 023C, 001C1, 022C, 042C, 007C, 009C, 004C, 030C, 003C, 010C, 034C	044C, 013C
Social	0	0	023C, 004C	036C, 022C, 042C, 007C, 030C, 003C, 044C, 013C	001C1, 009C, 010C, 034C	0
Total	0	0	0	036C, 023C, 003C	001C1, 022C, 042C, 007C, 009C, 004C, 030C	010C, 034C, 044C, 013C

How to create new dataframe from intervals of the dataframe and map column values to it?

Answers (1)

Related Questions