Reputation: 5660
I have a pandas
DataFrame
that I want to separate into observations for which there are no missing values and observations with missing values. I can use dropna()
to get rows without missing values. Is there any analog to get rows with missing values?
#Example DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'col1': [1, np.nan, 3, 4, 5], 'col2': [6, 7, np.nan, 9, 10]})
#Get observations without missing values
df.dropna()
Upvotes: 18
Views: 14186
Reputation: 35
I use the following expression as the opposite of dropna(). It keeps rows where the specified column is null; rows with a value in that column are dropped.
csv_df = csv_df.loc[~csv_df['Column_name'].notna(), :]
Upvotes: 2
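For reference, `~series.notna()` is the same boolean mask as `series.isna()`, so the filter above can be written more directly. A minimal sketch (the column name `Column_name` here is just a placeholder, as in the answer):

```python
import pandas as pd
import numpy as np

# Small sample frame with a placeholder column
csv_df = pd.DataFrame({'Column_name': [1.0, np.nan, 3.0, np.nan]})

# isna() keeps exactly the rows that ~notna() would keep
null_rows = csv_df.loc[csv_df['Column_name'].isna()]
print(null_rows.index.tolist())  # the two rows holding NaN
```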
Reputation: 323366
~ = Opposite :-)
df.loc[~df.index.isin(df.dropna().index)]
Out[234]:
col1 col2
1 NaN 7.0
2 3.0 NaN
Or
df.loc[df.index.difference(df.dropna().index)]
Out[235]:
col1 col2
1 NaN 7.0
2 3.0 NaN
Upvotes: 9
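Both forms above select the complement of dropna()'s index, one via a boolean mask with isin and one via a set difference of the two indexes. A quick check that they agree on the question's frame:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': [1, np.nan, 3, 4, 5],
                   'col2': [6, 7, np.nan, 9, 10]})

# Rows whose index label is NOT among the fully-valid rows
via_isin = df.loc[~df.index.isin(df.dropna().index)]

# Same rows via the set difference of the two indexes
via_diff = df.loc[df.index.difference(df.dropna().index)]

print(via_isin.equals(via_diff))  # True
```

One caveat worth knowing: index-based approaches assume the index labels are unique; with duplicate labels, a boolean row mask is safer.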
Reputation: 215117
Check null by row and filter with boolean indexing:
df[df.isnull().any(axis=1)]
# col1 col2
#1 NaN 7.0
#2 3.0 NaN
Upvotes: 33
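As a sanity check, this mask is the exact complement of dropna(): the two selections partition the frame, with every row landing in exactly one piece. A short sketch on the question's data:

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': [1, np.nan, 3, 4, 5],
                   'col2': [6, 7, np.nan, 9, 10]})

with_na = df[df.isnull().any(axis=1)]   # rows containing at least one NaN
without_na = df.dropna()                # rows with no NaN at all

# Together they cover every row exactly once
print(len(with_na) + len(without_na) == len(df))  # True
```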