Reputation: 91
I am trying to concatenate the value in each cell with the value in the preceding cell on the same row (i.e. one column to its left), across my whole dataframe. The first column has nothing to concatenate with, so it should stay unchanged. My df also has NaN values, which I have changed to None.
Any help would be appreciated.
Thanks in advance.
Upvotes: 1
Views: 92
Reputation: 262624
Use a simple loop over the columns; each iteration is still a vectorized operation on a whole column:
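df2 = df.copy()  # work on a copy so df stays unchanged
for i in range(1, df.shape[1]):
    # prepend the accumulated string from the previous column
    df2.iloc[:, i] = df2.iloc[:, i-1] + '_' + df2.iloc[:, i]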
output:
l0 l1 l2 l3 l4
0 a a_b a_b_c a_b_c_d a_b_c_d_e
1 a a_e a_e_f NaN NaN
2 a a_g a_g_h a_g_h_i NaN
3 b b_j b_j_k b_j_k_l b_j_k_l_m
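For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same loop. The sample frame is an assumption chosen to match the output above, and `cumulative_concat` is a hypothetical helper name, not part of pandas.

```python
import numpy as np
import pandas as pd

# Assumed sample data, chosen to match the output shown above.
df = pd.DataFrame({
    "l0": list("aaab"),
    "l1": list("begj"),
    "l2": list("cfhk"),
    "l3": ["d", np.nan, "i", "l"],
    "l4": ["e", np.nan, np.nan, "m"],
})


def cumulative_concat(frame, sep="_"):
    """Hypothetical helper: prefix each column with the accumulated columns to its left."""
    out = frame.copy()  # leave the caller's frame untouched
    for i in range(1, out.shape[1]):
        # Column i-1 already holds the accumulated string, so one addition suffices.
        # A missing value makes the sum missing, which then carries into later columns.
        out.iloc[:, i] = out.iloc[:, i - 1] + sep + out.iloc[:, i]
    return out


result = cumulative_concat(df)
print(result)
```

Because a missing value propagates through the addition, every column to the right of the first gap in a row ends up missing as well, which is what rows 1 and 2 of the output show.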
Upvotes: 1
Reputation: 3883
import numpy as np
import pandas as pd

# Constructing the dataframe:
df = pd.DataFrame({'l0': list('aaab'),
                   'l1': list('begj'),
                   'l2': list('cfhk'),
                   'l3': ['d', np.nan, 'i', 'l'],
                   'l4': ['e', np.nan, np.nan, 'm']})
I iterate through the columns one by one, using pandas.Series.str.cat, and replace them in the original dataframe:
prev = df.iloc[:, 0]
for col in df.columns[1:]:
    prev = prev.str.cat(df[col], sep='_')
    df[col] = prev
Upvotes: 2
Reputation: 323396
Try add followed by cumsum:
out = df.add('_').apply(lambda x : x[x.notna()].cumsum().str[:-1],axis=1)
Out[871]:
1 2 3 4 5
0 a a_b a_b_c a_b_c_d a_b_c_d_e
1 a a_e a_e_f NaN NaN
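To see what the one-liner does to a single row, here is a rough pure-Python sketch of the same idea, using `itertools.accumulate` in place of the pandas `cumsum`. The helper name `cumulative_join` is hypothetical, and missing values are assumed to be `None`, as in the question.

```python
from itertools import accumulate


def cumulative_join(row, sep="_"):
    """Hypothetical helper: running concatenation over the non-missing values of one row."""
    # Drop missing values first, mirroring x[x.notna()] in the one-liner.
    present = [value for value in row if value is not None]
    # Append the separator to each value, accumulate, then strip the trailing separator.
    running = accumulate(value + sep for value in present)
    return [text.removesuffix(sep) for text in running]


print(cumulative_join(["a", "e", "f", None, None]))
# → ['a', 'a_e', 'a_e_f']
```

One difference from the loop-based answers: missing values are filtered out before accumulating, so a gap in the middle of a row would not stop later values from being concatenated. With the sample data, where the gaps are all trailing, both approaches agree.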
Upvotes: 2