Reputation: 1477
I want to extract only the text inside the parentheses and keep it in the same column.
I have the following dataframe df:
id feature
1 mutation(MI:0118)
2 mutation(MI:0119)
3 mutation(MI:01120)
The expected output is:
id feature
1 MI:0118
2 MI:0119
3 MI:01120
I tried the following regex, but I cannot write the result back to the same column:
df['feature'] = df['feature'].str.extract(r"\((.*?)\)", expand=False)
I get the following warning, and the code above converts every value in the feature column to NaN:
/home/lib/python2.7/site-packages/ipykernel_launcher.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
"""Entry point for launching an IPython kernel.
Thanks
Upvotes: 1
Views: 102
Reputation: 71600
Try the code below, which uses a different pattern (written as a raw string so the backslashes reach the regex engine unchanged):
df['feature'] = df['feature'].str.extract(r'.*\((.*)\).*', expand=False)
print(df)
Output:
id feature
0 1 MI:0118
1 2 MI:0119
2 3 MI:01120
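The SettingWithCopyWarning in the question is most likely unrelated to the pattern itself. pandas usually raises it when df was produced by slicing or filtering another DataFrame. The question does not show how df was built, so the sketch below assumes a hypothetical parent frame named raw and a hypothetical filter on id. Taking an explicit .copy() before assigning is one common way to avoid the warning:

```python
import pandas as pd

# Hypothetical parent frame; the question does not show how df was created.
raw = pd.DataFrame({
    "id": [1, 2, 3],
    "feature": ["mutation(MI:0118)", "mutation(MI:0119)", "mutation(MI:01120)"],
})

# Assumed filtering step. The explicit .copy() makes df an independent
# DataFrame, so assigning to one of its columns does not trigger
# SettingWithCopyWarning.
df = raw[raw["id"] > 0].copy()

# Keep only the text inside the first pair of parentheses.
df["feature"] = df["feature"].str.extract(r"\((.*?)\)", expand=False)
print(df)
```

On this sample data the original non-greedy pattern and the greedy one above give the same values, so if the column still comes back as NaN it is worth checking whether the real values actually contain parentheses.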
Upvotes: 1