Reputation: 315

Flatten nested lists with variable length in pandas column

I have nested lists in a pandas column and I want to flatten them.

import numpy as np
import pandas as pd

df5 = pd.DataFrame({'text':[[['some','string'],['yes']],[['hello','how','are','u'],['fine','thanks']]],
               'names':[[['chris'],['peter','kate']],[['steve','john'],['kyle','eric']]]})

The problem seems to be that the inner lists vary in length; otherwise this could easily be solved with .apply(np.ravel).
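To illustrate why the varying lengths matter, here is a small sketch comparing a ragged cell with a regular one. Passing dtype=object explicitly is my own assumption for the demo, since recent NumPy versions refuse to build a ragged array otherwise.

```python
import numpy as np

# Inner lists of different lengths: NumPy can only build a 1-D array of list objects.
ragged = np.array([['some', 'string'], ['yes']], dtype=object)

# Inner lists of equal length: NumPy builds a proper 2-D array.
regular = np.array([['steve', 'john'], ['kyle', 'eric']], dtype=object)

print(ragged.shape)
print(regular.shape)

# ravel only flattens array dimensions, so the ragged case still holds two lists.
ragged_flat = np.ravel(ragged)
regular_flat = np.ravel(regular)
print(ragged_flat)
print(regular_flat)
```

So np.ravel only gets down to the individual strings when every inner list has the same length.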

Upvotes: 0

Answers (2)

jezrael

Reputation: 862741

Use DataFrame.applymap to process the values element-wise, flattening each cell with a nested list comprehension:

cols = ['text','names']
df5[cols] = df5[cols].applymap(lambda x: [z for y in x for z in y])
print(df5)
                                 text                      names
0                 [some, string, yes]       [chris, peter, kate]
1  [hello, how, are, u, fine, thanks]  [steve, john, kyle, eric]

Or use np.concatenate (each cell then holds a NumPy array instead of a list):

import numpy as np

cols = ['text','names']
df5[cols] = df5[cols].applymap(np.concatenate)
print(df5)
                                 text                      names
0                 [some, string, yes]       [chris, peter, kate]
1  [hello, how, are, u, fine, thanks]  [steve, john, kyle, eric]
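A note for newer pandas: DataFrame.applymap is deprecated as of pandas 2.1 in favour of DataFrame.map. One way to sidestep the rename is to map each column with Series.map. The following is only a sketch of that idea, and flatten_one_level is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

df5 = pd.DataFrame({
    'text': [[['some', 'string'], ['yes']],
             [['hello', 'how', 'are', 'u'], ['fine', 'thanks']]],
    'names': [[['chris'], ['peter', 'kate']],
              [['steve', 'john'], ['kyle', 'eric']]],
})

def flatten_one_level(nested):
    # Hypothetical helper: join the inner lists into a single flat list.
    return [item for sub in nested for item in sub]

cols = ['text', 'names']
for col in cols:
    # Series.map applies the function to every cell of the column.
    df5[col] = df5[col].map(flatten_one_level)

print(df5)
```

This keeps plain Python lists in the cells, like the list comprehension version.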

Upvotes: 2

Rakesh

Reputation: 82765

It looks like you need itertools.chain together with applymap.

Example:

from itertools import chain
import pandas as pd
df5 = pd.DataFrame({'text':[[['some','string'],['yes']],[['hello','how','are','u'],['fine','thanks']]],
               'names':[[['chris'],['peter','kate']],[['steve','john'],['kyle','eric']]]})

print(df5.applymap(lambda x: list(chain.from_iterable(x))))

Output:

                       names                                text
0       [chris, peter, kate]                 [some, string, yes]
1  [steve, john, kyle, eric]  [hello, how, are, u, fine, thanks]
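One caveat worth keeping in mind: chain.from_iterable flattens exactly one level and treats strings as iterables, so a cell that is already flat would be split into single characters. A minimal sketch of a guard against that, assuming cells may be either nested or flat (flatten_once is a hypothetical helper, not part of itertools or pandas):

```python
from itertools import chain

def flatten_once(cell):
    # Hypothetical helper: flatten one level, but keep non-list items
    # (such as strings) whole by wrapping them in a one-element list.
    return list(chain.from_iterable(
        item if isinstance(item, (list, tuple)) else [item]
        for item in cell
    ))

nested = [['a', 'bc'], ['d']]
already_flat = ['a', 'bc']

# Plain chain.from_iterable iterates over the strings themselves.
naive = list(chain.from_iterable(already_flat))

safe_nested = flatten_once(nested)
safe_flat = flatten_once(already_flat)

print(naive)
print(safe_nested)
print(safe_flat)
```

It could be passed to applymap in place of the lambda if some cells might not be nested.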

Upvotes: 0
