Reputation: 2016
Here is the dummy DataFrame:
d = {'col_1': [1, 2], 'col_n_1': [3, 4], 'col_2': [2, 1], 'col_n_2': [6, 3]}
df = pd.DataFrame(data=d)
col_1 col_2 col_n_1 col_n_2
0 1 2 3 6
1 2 1 4 3
2 1 1 4 5
I am looking for a nice way to extract the values from col_n_1
where col_1 == 1
and col_n_2
where col_2 == 1
in a new column that would look like:
new_col
3
3
4,5
Upvotes: 1
Views: 50
Reputation: 862406
Use where
for get values by mask and then join
columns together:
L = ['col_1','col_2']
L1 = ['col_n_1','col_n_2']
df['new'] = (df[L1].astype(str).where(df[L].eq(1).values, axis=1)
.apply(lambda x: ','.join(x.dropna()), 1))
Solution if only 2 columns:
L = ['col_1','col_2']
L1 = ['col_n_1','col_n_2']
df1 = df[L1].astype(str).where(df[L].eq(1).values, axis=1)
df['new'] = (df1['col_n_1'] .fillna('') + ',' + df1['col_n_2'] .fillna('')).str.strip(',')
Or solution with add ,
and then sum
, last remove trailing ,
:
df['new'] = (df[L1].astype(str).where(df[L].eq(1).values)
.add(', ')
.fillna('')
.sum(axis=1)
.str.strip(', '))
print (df)
col_1 col_2 col_n_1 col_n_2 new
0 1 2 3 6 3
1 2 1 4 3 3
2 1 1 4 5 4,5
Upvotes: 3
Reputation: 121
This can be accomplished with the apply()
method and a lambda function. apply()
with the index
parameter set to 1
will call a given function on each row of the dataframe. So the only trouble is writing that function -- I think the best solution is to create a list containing either the row's col_n_1
or col_n_2
, both, or neither, then joining the list with commas. Like this:
df['new'] = df.apply(lambda row: ','.join([str(row.col_n_1)] if row.col_1 == 1 else [] + [str(row.col_n_2)] if row.col_2 == 1 else []), axis = 1)
Upvotes: 0
Reputation: 323226
Borrow the name list from Jez
df[L].eq(1).rename(columns=dict(zip(L,L1))).mul((df[L1].astype(str)+',')).sum(1).str[:-1]
Out[126]:
0 3
1 3
2 4,5
dtype: object
Upvotes: 2