Reputation: 59
I have a column in a DataFrame called both_ntf that looks like this:
column1
411.1
104.5-105.6
167.3-166.9
254
399
373.5
My expected result is:
column1      column2  column3
411.1        411.1    NaN
104.5-105.6  104.5    105.6
167.3-166.9  167.3    166.9
254          254      NaN
399          399      NaN
The if statement I wrote does not seem to work:
if '-' in both_ntf['column1']:
    print("if")
    rng_ntf = both_ntf[both_ntf['column1'].str.contains("-", na=False)]
    rng_ntf[['column2', 'column3']] = rng_ntf.column1.str.split("-", expand=True)
    # Add
    filtered_ntf = rng_ntf
elif '-' not in both_ntf['column1']:
    print("elif")
    nrng_ntf = both_ntf[~both_ntf['column1'].str.contains("-", na=False)]
    nrng_ntf['column2'] = nrng_ntf['column1']
    filtered_ntf = filtered_ntf.append(nrng_ntf, sort=True)
As you can see, rng_ntf and nrng_ntf are temporary DataFrames that are then appended to a new DataFrame, filtered_ntf. I'm hoping to do this more efficiently and faster.
Upvotes: 0
Views: 58
Reputation: 13349
Try:
pd.concat([df, df.column1.str.split('-', expand=True)], axis=1)
       column1      0      1
0        411.1  411.1   None
1  104.5-105.6  104.5  105.6
2  167.3-166.9  167.3  166.9
3          254    254   None
4          399    399   None
5        373.5  373.5   None
You can also assign the column names:
split_df = df.column1.str.split('-', expand=True)
split_df.columns=['column2', 'column3']
pd.concat([df, split_df], axis=1)
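For reference, here is a self-contained version of the same idea, as a minimal sketch. The sample data is copied from the question and is assumed to be stored as strings. The `n=1` argument and the `pd.to_numeric` conversion are additions for illustration, not part of the snippet above. The first `print` also shows why the `if` check in the question never fires: `in` on a Series tests the index labels, not the values.

```python
import pandas as pd

# Sample data from the question, assumed to be stored as strings.
both_ntf = pd.DataFrame(
    {"column1": ["411.1", "104.5-105.6", "167.3-166.9", "254", "399", "373.5"]}
)

# `in` on a Series checks the index labels (0..5 here), not the values,
# so this is False even though some values contain "-".
print("-" in both_ntf["column1"])

# Split every row at once; rows without "-" get a missing value in the second column.
# n=1 limits the split to the first "-", so exactly two columns come back.
split_df = both_ntf["column1"].str.split("-", n=1, expand=True)
split_df.columns = ["column2", "column3"]

# Optional step (an assumption about the desired dtype): convert the pieces to numbers.
split_df = split_df.apply(pd.to_numeric)

result = pd.concat([both_ntf, split_df], axis=1)
print(result)
```

No temporary per-case DataFrames or appends are needed, because the split handles both kinds of rows in one pass.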
Upvotes: 1
Reputation: 11643
This should work but I haven't tested it:
def split_values(x, col, i, sep='-'):
    # Return the i-th piece of the string in column `col`, or None if it is missing.
    items = x[col].split(sep)
    try:
        return items[i]
    except IndexError:
        return None
df['column2'] = df.apply(split_values, axis=1, args=("column1", 0))
df['column3'] = df.apply(split_values, axis=1, args=("column1", 1))
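Since the answer says this is untested, here is a quick check on the question's sample data, as a sketch that assumes every value in column1 is a string (a float or NaN would make `.split` raise). The bare `except` is narrowed to `IndexError` here, which is the only error the indexing step is expected to raise.

```python
import pandas as pd

def split_values(x, col, i, sep="-"):
    # Return the i-th piece of the string in column `col`, or None if it is missing.
    items = x[col].split(sep)
    try:
        return items[i]
    except IndexError:
        return None

# Sample data from the question, assumed to be stored as strings.
df = pd.DataFrame(
    {"column1": ["411.1", "104.5-105.6", "167.3-166.9", "254", "399", "373.5"]}
)

# Row-wise apply: the helper is called once per row for each new column.
df["column2"] = df.apply(split_values, axis=1, args=("column1", 0))
df["column3"] = df.apply(split_values, axis=1, args=("column1", 1))
print(df)
```

Note that a row-wise `apply` calls a Python function for every row, so on large frames it will usually be slower than a single `.str.split` call.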
Upvotes: 1