RikkiS

Reputation: 91

Concatenate values in a dataframe with value in preceding column on same row - Python

I am trying to concatenate the value in each cell with the value in the preceding cell on the same row (i.e. the column immediately before it), throughout my dataframe. Of course, the values in the first column won't have anything to concatenate with. Also, my df has NaN values, which I have changed to None.


Any help would be appreciated.

Thanks in advance.

Upvotes: 1

Views: 92

Answers (3)

mozway

Reputation: 262624

Use a simple loop over the columns, so that each step remains a vectorized operation:

df2 = df.copy()
for i in range(1, df.shape[1]):
    df2.iloc[:, i] = df2.iloc[:, i-1]+'_'+df2.iloc[:, i]

output:

  l0   l1     l2       l3         l4
0  a  a_b  a_b_c  a_b_c_d  a_b_c_d_e
1  a  a_e  a_e_f      NaN        NaN
2  a  a_g  a_g_h  a_g_h_i        NaN
3  b  b_j  b_j_k  b_j_k_l  b_j_k_l_m

Upvotes: 1

Vladimir Fokow

Reputation: 3883

import numpy as np
import pandas as pd

# Constructing the dataframe:
df = pd.DataFrame({'l0': list('aaab'),
                   'l1': list('begj'),
                   'l2': list('cfhk'),
                   'l3': ['d', np.nan, 'i', 'l'],
                   'l4': ['e', np.nan, np.nan, 'm']})

I iterate through the columns one by one, joining each with the running result via pandas.Series.str.cat, and write the result back into the original dataframe:

prev = df.iloc[:, 0]

for col in df.columns[1:]:
    prev = prev.str.cat(df[col], sep='_')
    df[col] = prev
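As a self-contained sketch of the same loop, the version below wraps it in a function that works on a copy instead of overwriting the input. The function name concat_with_previous and the sep parameter are hypothetical names chosen for illustration, and the sketch assumes unique column labels and string (or missing) values in every column.

```python
import numpy as np
import pandas as pd


def concat_with_previous(df, sep="_"):
    """Return a copy of df in which each column is joined with all columns to its left.

    Hypothetical helper: assumes unique column labels and string/missing values.
    """
    out = df.copy()
    prev = out.iloc[:, 0]
    for col in out.columns[1:]:
        # str.cat gives NaN wherever either side is missing,
        # so a gap in one column propagates to every column after it.
        prev = prev.str.cat(out[col], sep=sep)
        out[col] = prev
    return out


df = pd.DataFrame({
    "l0": list("aaab"),
    "l1": list("begj"),
    "l2": list("cfhk"),
    "l3": ["d", np.nan, "i", "l"],
    "l4": ["e", np.nan, np.nan, "m"],
})

result = concat_with_previous(df)
print(result)
```

Because the function copies its input, df keeps its original values and result holds the concatenated columns. On this sample frame it should give the same table as the output shown in the first answer.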

Upvotes: 2

BENY

Reputation: 323396

Try add followed by a row-wise cumsum:

out = df.add('_').apply(lambda x: x[x.notna()].cumsum().str[:-1], axis=1)
Out[871]: 
   1    2      3        4          5
0  a  a_b  a_b_c  a_b_c_d  a_b_c_d_e
1  a  a_e  a_e_f      NaN        NaN

Upvotes: 2
