user3476463

Reputation: 4575

Merge column values into a list

I am trying to merge all the non-empty values of each column into a single list stored in one cell of a dataframe, as in the 'output df' example below. My source dataframe looks like the 'sample df' below. The code below builds a dict of lists rather than a dataframe, so it doesn't quite do what I want, and it feels clunky. Is there a better way to do this, ideally with pandas?

code:

# one list per column, skipping empty strings
corrLst = [df[df[x] != ''][x].tolist() for x in df.columns]
corrdict = dict(zip(df.columns, corrLst))

sample df:

field1  field2
'a'     'b'
        'c'
'd'
'e'     'f'

output df:

field1  field2
['a','d','e'] ['b','c','f']

Upvotes: 2

Views: 67

Answers (1)

anky

Reputation: 75080

Replace the blank cells with np.nan, then run:

[sorted(list(set(i))) for i in df.ffill().values.T.tolist()]
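
As a runnable sketch of those two steps: the frame below is rebuilt from the question's sample, and it assumes the blank cells are empty strings (the question's own code filters on `!= ''`). The names `filled` and `merged` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; blanks are ASSUMED to be empty strings.
df = pd.DataFrame({"field1": ["a", "", "d", "e"],
                   "field2": ["b", "c", "", "f"]})

# Step 1: turn blanks into NaN so ffill treats them as missing,
# then forward-fill each gap with the value above it.
filled = df.replace("", np.nan).ffill()

# Step 2: transpose so each inner list is one column,
# then drop the repeats that ffill introduced and sort.
merged = [sorted(set(col)) for col in filled.values.T.tolist()]
print(merged)  # [['a', 'd', 'e'], ['b', 'c', 'f']]
```

Note that this relies on de-duplication to undo the forward fill, so genuine repeated values in a column are collapsed as well, and a blank in the very first row would stay NaN because there is nothing above it to fill from.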

Putting it together:

pd.DataFrame([[sorted(list(set(i))) for i in df.ffill().values.T.tolist()]],
             columns=df.columns)

      field1     field2
0  [a, d, e]  [b, c, f]

Or, to keep the values in their original order, use:

from collections import OrderedDict
pd.DataFrame([[list(OrderedDict.fromkeys(i)) for i in df.ffill().values.T.tolist()]],
             columns=df.columns)

            field1           field2
0  ['a', 'd', 'e']  ['b', 'c', 'f']
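
If genuine duplicates within a column should be kept, one possible variation (not the approach above, just a sketch under the same assumption that blanks are empty strings) is to drop the missing cells per column instead of forward-filling and de-duplicating. The names `cleaned` and `out` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question; blanks are ASSUMED to be empty strings.
df = pd.DataFrame({"field1": ["a", "", "d", "e"],
                   "field2": ["b", "c", "", "f"]})

# Mark blanks as missing, then drop them column by column.
cleaned = df.replace("", np.nan)
out = pd.DataFrame(
    [[cleaned[c].dropna().tolist() for c in cleaned.columns]],  # one row of lists
    columns=cleaned.columns,
)
print(out)
```

This keeps the original row order and any repeated values, and it does not depend on the first row being non-blank.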

Upvotes: 1
