Reputation: 2747
I have a pandas DataFrame df1 with the following content:
Serial N year current
B 10 14
B 10 16
B 11 10
B 11
B 11 15
C 12 11
C 9
C 12 13
C 12
I would like to make a DataFrame that is based on df1
but that has any row containing an empty value removed. For example:
Serial N year current
B 10 14
B 10 16
B 11 10
B 11 15
C 12 11
C 12 13
I tried something like this:
df1=df[~np.isnan(df["year"]) or ~np.isnan(df["current"])]
But I received the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What could be the problem?
Upvotes: 1
Views: 54
Reputation: 2572
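Python's or cannot be applied to a Series; use the element-wise operator | instead, like this:
df1=df[ (~np.isnan(df["year"])) | (~np.isnan(df["current"]))]
Note that | keeps a row if either column is non-missing. To remove every row that has any missing value, as in your expected output, combine the two conditions with & instead.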
Using dropna(), as suggested by EdChum, is likely the cleanest solution here. The pandas documentation covers dropna() and working with missing data more generally.
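As a rough illustration of how dropna() can be tuned, here is a minimal sketch. The sample frame is made up for this example (it is not the asker's exact data), and the column name Serial_N is borrowed from another answer in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's exact frame.
df = pd.DataFrame({
    "Serial_N": ["B", "B", "C", "C"],
    "year": [10, 11, np.nan, np.nan],
    "current": [14, np.nan, 9, np.nan],
})

# Default (how="any"): drop a row if any column is missing.
any_missing = df.dropna()

# how="all" with subset: drop a row only if both listed columns are missing.
all_missing = df.dropna(how="all", subset=["year", "current"])

# subset alone: only look at the listed column when deciding what to drop.
year_only = df.dropna(subset=["year"])

print(any_missing)
print(all_missing)
print(year_only)
```

For the question as asked, the default call is enough; the other two variants are only useful if some columns are allowed to be empty.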
Upvotes: 2
Reputation: 394041
You can just call dropna to achieve this:
df1 = df.dropna()
As to why what you tried failed: the or operator doesn't understand what it should do when comparing array-like structures, because it is ambiguous whether one or more elements have to meet the boolean criterion. You should use the bitwise operators &, | and ~ for and, or and not respectively. Additionally, for multiple conditions you need to wrap each condition in parentheses due to operator precedence.
In [4]:
df.dropna()
Out[4]:
Serial N year current
0 B 10 14
1 B 10 16
2 B 11 10
4 B 11 15
5 C 12 11
7 C 12 13
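The point about or versus the bitwise operators can be sketched as follows. The sample frame and the variable names (mask, filtered, error_message) are assumptions made for this example, and notna() is used in place of ~np.isnan(...) purely for readability.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; which cells are missing is an assumption.
df = pd.DataFrame({
    "Serial_N": ["B", "B", "B", "B", "C", "C"],
    "year": [10, 10, 11, 11, 12, np.nan],
    "current": [14, 16, 10, np.nan, 11, 9],
})

# Python's `or` asks for a single True/False from each Series, which fails.
try:
    df[df["year"].notna() or df["current"].notna()]
except ValueError as exc:
    error_message = str(exc)

# Element-wise AND: keep rows where both columns are present.
# Each condition is parenthesised because & binds tighter than comparisons.
mask = (df["year"].notna()) & (df["current"].notna())
filtered = df[mask]

print(error_message)
print(filtered)
```

On this sample the mask gives the same rows as df.dropna(), which is why dropna is usually the shorter way to write it.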
Upvotes: 2
Reputation: 210842
If you really have empty cells (empty strings) instead of NaNs:
In [122]: df
Out[122]:
Serial_N year current
0 B 10.0 14.0
1 B 10.0 16.0
2 B 11.0 10.0
3 B 11.0
4 B 11.0 15.0
5 C 12.0 11.0
6 C 9.0
7 C 12.0 13.0
8 C 12.0
In [123]: df.replace('', np.nan).dropna()
Out[123]:
Serial_N year current
0 B 10.0 14.0
1 B 10.0 16.0
2 B 11.0 10.0
4 B 11.0 15.0
5 C 12.0 11.0
7 C 12.0 13.0
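A self-contained sketch of the same idea, in case it helps to try it out. The frame below is invented for illustration; the regex variant for whitespace-only cells is an extra assumption that goes beyond what the answer shows.

```python
import numpy as np
import pandas as pd

# Hypothetical frame where missing cells are empty or blank strings, not NaN.
df = pd.DataFrame({
    "Serial_N": ["B", "B", "C", "C"],
    "year": [10.0, 11.0, "", 12.0],
    "current": [14.0, "", 9.0, "  "],
})

# dropna() alone removes nothing, because '' is an ordinary string value.
untouched = df.dropna()

# Turn empty strings into NaN first, then drop.
cleaned = df.replace("", np.nan).dropna()

# Assumption: cells may also hold only whitespace; a regex covers both cases.
cleaned_ws = df.replace(r"^\s*$", np.nan, regex=True).dropna()

print(len(untouched))
print(cleaned)
print(cleaned_ws)
```

After the replacement the columns may still have object dtype, so converting them with pd.to_numeric afterwards can be worthwhile if numeric operations follow.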
Upvotes: 2