Pandas: How can I flatten only the lists of lists in a specific column?

Question

I have a dataset with a column whose values look like col2 in the dummy data frame below. Each entry is either a list or a list of lists, and I want to flatten only the lists of lists into single lists.

    col1                  col2
0    tom                  [10]
1   nick              [15, 24]
2   juli  [[16, 14], [19, 17]]
3  harry              [23, 15]
4  frank  [[15, 16], [50, 30]]

I would like the resulting data frame to look like this:

    col1              col2
0    tom              [10]
1   nick          [15, 24]
2   juli  [16, 14, 19, 17]
3  harry          [23, 15]
4  frank  [15, 16, 50, 30]

I tried DF['col2'] = DF.col2.apply(lambda x: sum(x, [])), but it failed with the error: TypeError: can only concatenate list (not "str") to list

How can I solve this elegantly?

SeaBean · Accepted Answer

You can use np.ravel, as follows:

import numpy as np
df['col2'] = df['col2'].map(np.ravel)

Note that this assumes the values are real lists rather than strings that look like lists. If they are strings, convert them to real lists first, as follows:

import ast
df['col2'] = df['col2'].apply(ast.literal_eval)

# Then, run the code:
df['col2'] = df['col2'].map(np.ravel)

Result:

print(df)

    col1              col2
0    tom              [10]
1   nick          [15, 24]
2   juli  [16, 14, 19, 17]
3  harry          [23, 15]
4  frank  [15, 16, 50, 30]
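
One caveat: np.ravel returns NumPy arrays rather than Python lists, and it may not flatten as expected when the inner lists have unequal lengths. If plain lists are needed, a small helper is one alternative. The sketch below is only an illustration: the helper name to_flat_list and the mixed sample data (one entry stored as a string) are assumptions, not part of the original question.

```python
import ast

import pandas as pd


def to_flat_list(value):
    """Return a plain list, flattening nested lists by one level."""
    # Parse string representations such as "[[16, 14], [19, 17]]" into real lists.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(item)  # nested list: unpack one level
        else:
            flat.append(item)  # scalar: keep as is
    return flat


# Hypothetical sample mirroring the question; row 2 is deliberately a string.
df = pd.DataFrame({
    "col1": ["tom", "nick", "juli", "harry", "frank"],
    "col2": [[10], [15, 24], "[[16, 14], [19, 17]]", [23, 15], [[15, 16], [50, 30]]],
})

df["col2"] = df["col2"].map(to_flat_list)
print(df)
```

This flattens only one level of nesting, which matches the data shown in the question; deeper nesting would need a recursive variant.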
