Reputation: 23
This is a follow-up to my previous question. I would like to replace each string value in a DataFrame column with its first character. For example,
import numpy as np
import pandas as pd
s = pd.DataFrame({'A':['S12','S1','E53',np.nan], 'B':[1,2,3,4]})
s.A.fillna('P', inplace=True)
This gives me the following DataFrame:
     A  B
0  S12  1
1   S1  2
2  E53  3
3    P  4
Next, I would like to replace each string in column 'A' with its first character, giving ['S', 'S', 'E', 'P']. This is what I tried:
for i, row in s.iterrows():
    if len(row['A']) > 1:
        s['A'][i] = row['A'][0]
and I got this warning:
/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:3: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
app.launch_new_instance()
/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:7: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
I understand that this is a non-preferred way, but what exactly am I doing inefficiently, and what would be the preferred way? Is it possible to do this without converting the column to a NumPy array?
Thank you!
Upvotes: 1
Views: 55
Reputation: 153460
You are getting the SettingWithCopyWarning because of the way you are assigning back to your DataFrame: s['A'][i] = ... is chained indexing, so pandas cannot tell whether the write reaches s or a temporary copy. If you wish to keep your "non-preferred" approach, you can avoid the warning by using .loc:
for i, row in s.iterrows():
    if len(row['A']) > 1:
        s.loc[i,'A'] = row['A'][0]
Output:
   A  B
0  S  1
1  S  2
2  E  3
3  P  4
NOTE: You can find more information on chained indexing in the pandas docs (the indexing page linked in the warning message). There are also some good SO posts on SettingWithCopyWarning.
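To make the difference concrete, here is a minimal sketch of the .loc form run on the question's data. The chained form appears only in a comment, because whether it warns or silently does nothing depends on the pandas version. Looping over s.index instead of iterrows() is my own simplification, not part of the answer above.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['S12', 'S1', 'E53', np.nan], 'B': [1, 2, 3, 4]})
s['A'] = s['A'].fillna('P')

# Chained form, two separate steps:
#     s['A'][i] = value
# s['A'] first returns an intermediate Series, then [i] = ... writes to it.
# pandas cannot guarantee that this write reaches s itself.

# Single-step form: one .loc call on s, addressing row label and column together.
for i in s.index:
    value = s.loc[i, 'A']
    if len(value) > 1:
        s.loc[i, 'A'] = value[0]

first_chars = s['A'].tolist()
print(first_chars)  # ['S', 'S', 'E', 'P']
```

The loop is kept only to mirror the question; for larger frames a vectorized expression is usually a better fit.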
Upvotes: 0
Reputation: 721
You can use the apply method to take the first character of the text in each row. To avoid the SettingWithCopyWarning, use .loc and copy():
s = s.copy()
s.loc[:,"A"] = s.A.apply(lambda x : x[0])
print(s)
   A  B
0  S  1
1  S  2
2  E  3
3  P  4
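One caveat worth illustrating: lambda x: x[0] assumes every value is a string, so it would raise a TypeError on a NaN that has not been filled yet. The sketch below is one possible way to guard against that. The helper first_char and its default argument are hypothetical names introduced for this example, not pandas functions.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['S12', 'S1', 'E53', np.nan], 'B': [1, 2, 3, 4]})


def first_char(value, default='P'):
    """Return the first character of a non-empty string, else `default`.

    Hypothetical helper for illustration; 'P' mirrors the fill value
    used in the question.
    """
    if isinstance(value, str) and value:
        return value[0]
    return default


# apply calls first_char once per element of column A, including the NaN.
s['A'] = s['A'].apply(first_char)

result = s['A'].tolist()
print(result)  # ['S', 'S', 'E', 'P']
```

Under these assumptions the separate fillna step is not needed, since the missing value is handled inside the helper.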
Upvotes: 0
Reputation: 862851
You can use fillna, then take the first character by indexing with the str accessor, str[0]:
s['A'] = s['A'].fillna('P').str[0]
print (s)
   A  B
0  S  1
1  S  2
2  E  3
3  P  4
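As a small illustration of why this works without a loop: the str accessor passes missing values through as NaN, so the fill could equally be applied after taking the first character. The sketch below shows that ordering, plus .str.slice(0, 1), which gives the same result for non-empty strings. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({'A': ['S12', 'S1', 'E53', np.nan], 'B': [1, 2, 3, 4]})

# .str[0] leaves the missing value as NaN instead of raising an error.
initials = s['A'].str[0]
missing_mask = initials.isna().tolist()

# The fill can therefore come after the indexing step as well.
filled_after = initials.fillna('P')

# .str.slice(0, 1) matches .str[0] here; the two differ only for empty
# strings, where .str[0] yields NaN and .str.slice(0, 1) yields ''.
sliced = s['A'].str.slice(0, 1).fillna('P')

print(filled_after.tolist())  # ['S', 'S', 'E', 'P']
```

Assigning filled_after back with s['A'] = filled_after updates the whole column in one step, which also sidesteps the chained-assignment warning.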
Upvotes: 1