Reputation: 383
Suppose I have a pandas DataFrame like the following example (the real dataset has more rows and columns):
  | t1 | val1 | val2 | val3 | val4
------------------------------------
0 |  1 |    1 |  NaN |  NaN |  NaN
1 |  2 |   12 |    5 |  NaN |    4
2 |  3 |  104 |    6 |  NaN |  NaN
3 |  4 |   -1 |    7 |    6 |  NaN
4 |  1 |   -3 |    8 |    7 |   10
I would like to extract only the rows where t1 == 1 and val2, val3 and val4 are all NaN, keeping only some of the columns.
For instance, in the dataframe above I would like to get only the first row.
So far I have tried the following, and some variations of it, with no luck:
I have defined a list of labels for the columns I am interested in:
labels = [ 't1', 'val2', 'val3', 'val4']
Then I run the following code to get all rows with t1 == 1 and only the specified columns:
df2 = df.loc[df.t1 == 1, labels]
Afterwards, I try to keep only the rows where val2, val3 and val4 are all NaN at the same time. I wrote the following code, but it does not work:
df3 = df2.loc[df2[labels].isnull() == True, labels]
But it returns the following error:
ValueError: Cannot index with multidimensional key
Do you know what is wrong? Or is there another way of getting the result I want?
Thanks in advance.
Upvotes: 0
Views: 177
Reputation: 323276
You should use all along axis 1, so that a row is kept only when every one of the selected columns is NaN:
df2[df2[['val2','val3','val4']].isnull().all(axis=1)]
Out[544]:
   t1  val2  val3  val4
0   1   NaN   NaN   NaN
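As for why the original attempt fails: df2[labels].isnull() == True is a two-dimensional boolean DataFrame, while .loc expects a one-dimensional boolean mask (one value per row) as its row indexer, hence "Cannot index with multidimensional key". Note also that labels includes t1, which is never NaN, so the null check has to be restricted to the val columns. Here is a minimal self-contained sketch that does the whole thing in one step; the sample frame is rebuilt from the table in the question, and the names nan_cols, mask and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "t1":   [1, 2, 3, 4, 1],
    "val1": [1, 12, 104, -1, -3],
    "val2": [np.nan, 5, 6, 7, 8],
    "val3": [np.nan, np.nan, np.nan, 6, 7],
    "val4": [np.nan, 4, np.nan, np.nan, 10],
})

labels = ["t1", "val2", "val3", "val4"]   # columns to keep in the output
nan_cols = ["val2", "val3", "val4"]       # columns that must all be NaN (t1 excluded)

# isnull() returns a 2-D boolean frame; all(axis=1) collapses it to one
# boolean per row, which is the shape .loc accepts as a row indexer.
mask = (df["t1"] == 1) & df[nan_cols].isnull().all(axis=1)

result = df.loc[mask, labels]
print(result)
```

Building the mask on df rather than on the intermediate df2 avoids the extra step, but filtering df2 as in the answer above should give the same row.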
Upvotes: 2