How to flatten lists in Pandas Dataframe

Question

Hi I have a DataFrame like this It is a table of a sales information with row index of product brand and column index of Price,Week and Timestamp.

timeperiod = pd.date_range(start='4/15/2019', periods=3,dtype='datetime64[ns]', freq='D')
df = pd.DataFrame({'Price':[[2000,2000,2000],[1000,1000,1000]],'Week':[[0,0,1],[0,0,1]],
                   'Timestamp': [timeperiod,timeperiod]},index = ['Apple','Huawei'])

The output of the code above is:

         Price              Timestamp                                         Week
Apple   [2000, 2000, 2000]  DatetimeIndex(['2019-04-15', '2019-04-16', '20...   [0, 0, 1]
Huawei  [1000, 1000, 1000]  DatetimeIndex(['2019-04-15', '2019-04-16', '20...   [0, 0, 1]

Now I want to flatten the dataframe to three columns [Price,Timestamp and Week ]with index of a series of number [0,1,2] (since i got 3 element in list), and store in Two dataframes,which were named after the original dataframe index, Apple and Huawei.

so the outcome should be

Apple = pd.DataFrame({'Price':[2000,2000,2000],'Week':[0,0,1],
                   'Timestamp': timeperiod})
Huawei = pd.DataFrame({'Price':[1000,1000,1000],'Week':[0,0,1],
                   'Timestamp': timeperiod})

Apple:
   Price  Timestamp  Week
0   2000 2019-04-15     0
1   2000 2019-04-16     0
2   2000 2019-04-17     1

Huawei:
   Price  Timestamp  Week
0   1000 2019-04-15     0
1   1000 2019-04-16     0
2   1000 2019-04-17     1

Erfan · Accepted Answer

Using this function from other answer, we can unnest your columns one by one and concatenating them together again:

df = pd.concat([explode_list(df, col)[col] for col in df.columns], axis=1)

Output:

        Price  Week  Timestamp
Apple    2000     0 2019-04-15
Apple    2000     0 2019-04-16
Apple    2000     1 2019-04-17
Huawei   1000     0 2019-04-15
Huawei   1000     0 2019-04-16
Huawei   1000     1 2019-04-17

Finally if you want seperate dataframes for each unique index, we can use groupby:

dfs = [d for _, d in df.groupby(df.index)]

dfs[0]
print('
')
dfs[1]

Output:

       Price  Week  Timestamp
Apple   2000     0 2019-04-15
Apple   2000     0 2019-04-16
Apple   2000     1 2019-04-17

        Price  Week  Timestamp
Huawei   1000     0 2019-04-15
Huawei   1000     0 2019-04-16
Huawei   1000     1 2019-04-17

Function used from linked answer:

def explode_list(df, col):
    s = df[col]
    i = np.arange(len(s)).repeat(s.str.len())
    return df.iloc[i].assign(**{col: np.concatenate(s)})

How to flatten lists in Pandas Dataframe

Answers (2)

Related Questions