De-aggregate values to unique rows using Python

Question

I have a dataset where I would like to de aggregate the values into their own unique rows as well as perform a pivot.

Data

ID  type    Q1 24   Q2 24   
AA  hello   1       1   
AA  hi      2       1   
AA  bye     1       0   
AA  ok      0       1   
BB  hello   1       0   
BB  hi      1       0   
BB  bye     0       1   
BB  ok      1       1

Desired

ID  date    type
AA  Q1 24   hello
AA  Q1 24   hi
AA  Q1 24   hi
AA  Q1 24   bye
AA  Q2 24   hello
AA  Q2 24   hi
AA  Q2 24   ok
BB  Q1 24   hello
BB  Q1 24   hi
BB  Q1 24   ok
BB  Q1 24   bye
BB  Q2 24   ok

Doing

import pandas as pd
import janitor

df.pivot_longer(
    index = 'ID', 
    names_to = ('date', '.value'), 
    names_pattern = r"(Q\d+)_?(.+)",
    sort_by_appearance = True)

I am currently researching, any suggestion is appreciated.

Chris · Accepted Answer

IIUC, one way using pandas.Index.repeat with pandas.concat:

df = df.set_index(["ID", "type"])
new_df = pd.concat([pd.Series(c, index=df.index.repeat(df[c]))
                    for c in df]).reset_index(name="date")

# Optionally, sort values
new_df = new_df.sort_values(["ID", "date"], ignore_index=True)

Output:

    ID   type   date
0   AA  hello  Q1 24
1   AA     hi  Q1 24
2   AA     hi  Q1 24
3   AA    bye  Q1 24
4   AA  hello  Q2 24
5   AA     hi  Q2 24
6   AA     ok  Q2 24
7   BB  hello  Q1 24
8   BB     hi  Q1 24
9   BB     ok  Q1 24
10  BB    bye  Q2 24
11  BB     ok  Q2 24

De-aggregate values to unique rows using Python

Answers (2)

Related Questions