user2110417
user2110417

Reputation:

How to create new dataframe from intervals of the dataframe and map column values to it?

I have a dataframe as follows:

data = {
    'Sample': ["036C", "023C", "001C1", "022C", "042C", "007C", "009C", "004C", "030C", "003C", "010C", "034C", "044C", "013C"],
    'Aware': [75,49,59,75,59,55,82,52,55,54,81,81,49,55],
    'Social': [76,65,88,72,78,76,81,65,72,77,81,81,76,77],
    'SRS': [81,81,83,80,80,81,85,88,86,80,81,81,90,90],
    'Moti': [87,70,82,88,80,80,89,70,88,90,90,90,90,90],
    'Manner': [53,58,76,78,80,80,88,90,90,90,90,90,90,90],
    'Total': [79,73,84,82,82,80,80,82,83,78,90,90,90,90]
}
df = pd.DataFrame(data)

df

    Sample  Aware   Social  SRS Moti    Manner  Total
0   036C    75      76      81  87      53      79
1   023C    49      65      81  70      58      73
2   001C1   59      88      83  82      76      84
3   022C    75      72      80  88      78      82
4   042C    59      78      80  80      80      82
5   007C    55      76      81  80      80      80
6   009C    82      81      85  89      88      80
7   004C    52      65      88  70      90      82
8   030C    55      72      86  88      90      83
9   003C    54      77      80  90      90      78
10  010C    81      81      81  90      90      90
11  034C    81      81      81  90      90      90
12  044C    49      76      90  90      90      90
13  013C    55      77      90  90      90      90

I want to create a new dataframe with specific interval <50 50-59 60-69 70-79 80-89 >90 and map df['Sample'] to that: the expected output is:

            <50         50-59                                 60-69   70-79        80-89            >=90
Awareness   023C,044C   004C,003C,007C,030C,013C,042C,001C1   0       022C,036C    010C,034C,009C   0

for all column values.

Upvotes: 0

Views: 309

Answers (1)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Melt the dataframe, then categorize the value column into discrete intervals using pd.cut, then pivot using aggfunc join to reshape and aggregate the values in Sample

s = df.melt('Sample')
s['value'] = pd.cut(s['value'], bins=np.r_[-np.inf, 50:100:10, np.inf], right=False)
s = s.pivot_table('Sample', 'variable', 'value', aggfunc=', '.join).fillna(0)

Result

variable [-inf, 50.0) [50.0, 60.0) [60.0, 70.0) [70.0, 80.0) [80.0, 90.0) [90.0, inf)
Aware 023C, 044C 001C1, 042C, 007C, 004C, 030C, 003C, 013C 0 036C, 022C 009C, 010C, 034C 0
Manner 0 036C, 023C 0 001C1, 022C 042C, 007C, 009C 004C, 030C, 003C, 010C, 034C, 044C, 013C
Moti 0 0 0 023C, 004C 036C, 001C1, 022C, 042C, 007C, 009C, 030C 003C, 010C, 034C, 044C, 013C
SRS 0 0 0 0 036C, 023C, 001C1, 022C, 042C, 007C, 009C, 004C, 030C, 003C, 010C, 034C 044C, 013C
Social 0 0 023C, 004C 036C, 022C, 042C, 007C, 030C, 003C, 044C, 013C 001C1, 009C, 010C, 034C 0
Total 0 0 0 036C, 023C, 003C 001C1, 022C, 042C, 007C, 009C, 004C, 030C 010C, 034C, 044C, 013C

Upvotes: 0

Related Questions