Reputation: 8025
I am trying to concatenate all of my columns into a new column. The non-missing values of each row should be stored in a list.
My dataframe:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})
Desired output:
df:
A    B    C    concat_col
1    nan  7    [1,7]
2    5    nan  [2,5]
nan  nan  9    [9]
What I tried:
df['concat_col'] = pd.Series(df.fillna('').values.tolist()).str.join(',')
Output I got:
A    B    C    concat_col
1    nan  7    1,,7
2    5    nan  2,5,
nan  nan  9    ,,9
Upvotes: 1
Views: 4197
Reputation: 164673
You can use a list comprehension, taking advantage of the fact that np.nan != np.nan:
df['D'] = [[i for i in row if i == i] for row in df.values]
print(df)
A B C D
0 1 NaN 7 [1, 7]
1 2 5 NaN [2, 5]
2 NaN NaN 9 [9]
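To illustrate why the `i == i` filter works and where it stops working, here is a minimal sketch. The helper name `non_missing` is made up for this example, and the `pd.notna` variant is an alternative I am suggesting, not part of the answer above. The point is that NaN is not equal to itself, but `None` is, so `i == i` keeps `None` while `pd.notna` drops both.

```python
import numpy as np
import pandas as pd

# A row with both kinds of missing value: NaN and None
row = ['1', float('nan'), None, '7']

# i == i is False only for NaN, so None survives this filter
kept_by_self_eq = [i for i in row if i == i]

# pd.notna treats both NaN and None as missing
kept_by_notna = [i for i in row if pd.notna(i)]


def non_missing(frame):
    """Hypothetical helper: one list of non-missing values per row."""
    return [[i for i in r if pd.notna(i)] for r in frame.to_numpy()]


df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})

rows = non_missing(df)
print(kept_by_self_eq)
print(kept_by_notna)
print(rows)
```

If the frame only ever contains NaN as the missing marker, as in the question, both filters should give the same result.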
Counter-intuitively, the list comprehension is much faster than the row-wise Pandas approach (timed here on 30,000 rows):
df = pd.concat([df]*10000, ignore_index=True)
%timeit df.apply(lambda row: row.dropna().tolist(), axis=1) # 8.25 s
%timeit [[i for i in row if i == i] for row in df.values] # 55.6 ms
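`%timeit` only works in IPython/Jupyter. As a rough sketch of the same comparison in plain Python, the standard `timeit` module can be used. The function names `with_apply` and `with_comprehension` are just labels I chose, and I use a smaller frame (3,000 rows) so it finishes quickly. Absolute timings will differ by machine and pandas version, so none are shown here.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})

# Repeat the 3-row frame to get 3,000 rows
big = pd.concat([df] * 1000, ignore_index=True)


def with_apply(frame):
    # Row-wise pandas approach: builds a Series object for every row
    return frame.apply(lambda row: row.dropna().tolist(), axis=1).tolist()


def with_comprehension(frame):
    # Plain Python loop over the underlying array; NaN fails i == i
    return [[i for i in row if i == i] for row in frame.values]


# Check that both approaches agree before comparing speed
same = with_apply(big) == with_comprehension(big)

t_apply = timeit.timeit(lambda: with_apply(big), number=3)
t_comp = timeit.timeit(lambda: with_comprehension(big), number=3)

print(f"results identical: {same}")
print(f"apply:         {t_apply:.4f} s for 3 runs")
print(f"comprehension: {t_comp:.4f} s for 3 runs")
```

Checking that the two results match first is worthwhile, since a speed comparison only means something if both versions produce the same lists.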
Upvotes: 4
Reputation: 1704
Dropping the NaN values from each row and converting the rest to a list should work:
df['concat_col']=df.apply(lambda row: row.dropna().tolist(), axis=1)
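For completeness, here is a self-contained version of that one-liner applied to the frame from the question. One assumption on my part: I select the source columns explicitly (the `cols` list is my addition), so that re-running the line does not pull the already created `concat_col` into the lists.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', np.nan],
                   'B': [np.nan, '5', np.nan],
                   'C': ['7', np.nan, '9']})

# Columns to combine; listed explicitly so concat_col is never included
cols = ['A', 'B', 'C']

# For each row: drop missing values, then turn what is left into a list
df['concat_col'] = df[cols].apply(lambda row: row.dropna().tolist(), axis=1)

print(df)
```

If a comma-separated string is wanted instead of a list, `df['concat_col'].str.join(',')` can be applied afterwards, since the lists no longer contain empty placeholders.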
Upvotes: 3