Reputation: 2072
This is the data frame that I have:
                   A         B         C  D    F    E
2013-01-01  0.000000  0.000000  0.100928  5  NaN    1
2013-01-02  0.640525  0.220630  1.070226  5    1    1
2013-01-03 -0.963793 -0.476044 -0.581649  5    2  NaN
2013-01-04  0.882686 -0.371904 -1.320758  5    3  NaN
2013-01-05  0.021979  0.680987 -0.605329  5    4  NaN
2013-01-06 -0.238726 -0.487410 -0.383292  5    5  NaN
I then run the following code: df1.dropna(how='any'), where df1 is the above data frame. When I look at df1 afterwards, this is what I get:
                   A         B         C  D    F    E
2013-01-01  0.000000  0.000000  0.100928  5  NaN  NaN
2013-01-02  0.640525  0.220630  1.070226  5    1  NaN
2013-01-03 -0.963793 -0.476044 -0.581649  5    2  NaN
2013-01-04  0.882686 -0.371904 -1.320758  5    3  NaN
I thought that dropna drops any row that has a NaN value in it. Therefore, I was expecting it to return just this:
                   A         B         C  D    F    E
2013-01-02  0.640525  0.220630  1.070226  5    1    1
Why isn't that the case?
EDIT: Here is the code
This is what I start with:
import numpy as np
import pandas as pd
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
then I do this:
s1 = pd.Series([1,2,3,4,5,6],index=pd.date_range('20130102',periods=6))
df['F'] = s1
df.at[dates[0],'A'] = 0
df.iat[0,1] = 0
df.loc[:,'D'] = np.array([5]*len(df))
df1 = df.reindex(index=dates[0:4], columns = list(df.columns) + ['E'])
df1.loc[dates[0]:dates[1],'E'] = 1
and then I run the dropna call shown above.
Upvotes: 2
Views: 1264
Reputation: 184
dropna returns a new DataFrame and leaves df1 unchanged. To get the result you are looking for, assign the return value to a variable:
df2 = df1.dropna(how='any')
Now df2 holds the desired output. If you want df1 itself to hold the result, use:
df1.dropna(how='any', inplace=True)
which modifies df1 in place.
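To see the difference between the two calls side by side, here is a minimal sketch. It assumes a small hand-built frame with the same pattern of NaN values in F and E as the question's df1; the column values and the variable names rows_before, rows_in_copy, rows_after and result are illustrative, not taken from the original code.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for df1: hand-picked values, same NaN pattern in F and E.
dates = pd.date_range("20130101", periods=4)
df1 = pd.DataFrame(
    {
        "D": [5, 5, 5, 5],
        "F": [np.nan, 1.0, 2.0, 3.0],
        "E": [1.0, 1.0, np.nan, np.nan],
    },
    index=dates,
)

# dropna returns a new frame; df1 itself is left untouched.
df2 = df1.dropna(how="any")
rows_before = len(df1)    # df1 still has all 4 rows
rows_in_copy = len(df2)   # only the row with no NaN survives

# With inplace=True the call returns None and df1 itself is modified.
result = df1.dropna(how="any", inplace=True)
rows_after = len(df1)

print(rows_before, rows_in_copy, rows_after, result)
```

Because the inplace call returns None, writing df1 = df1.dropna(how='any', inplace=True) would replace df1 with None, so pick one style or the other.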
Hope this helps!
Upvotes: 2