Ziqun Liu

Reputation: 43

Why is the output of df.where() different from df.loc[]?

I am trying to find the rows whose Total is not less than 580 in the table Pokemon.

import numpy as np #<1>
import pandas as pd #<2>
Pokemon = pd.read_csv('data/Pokemon.csv') #<3>

Pokemon.where(Pokemon['Total']>=580.).dropna().shape #<4>
Pokemon.loc[Pokemon['Total']>=580].shape #<5>

Line 4 outputs (78, 13) while line 5 gives (113, 13). What seems to be the problem?

The table is attached in this image.

Upvotes: 0

Views: 51

Answers (1)

mujjiga

Reputation: 16876

Pokemon.where(Pokemon['Total']>=580.).dropna().shape

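where() keeps every row but replaces the values in the rows where Total < 580 with NaN. dropna() then drops every row that contains at least one NaN, which also removes rows that satisfy Total >= 580 but have a missing value in some other column.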

Pokemon.loc[Pokemon['Total']>=580].shape

This selects all the rows whose Total >= 580, whether or not they contain NaN in other columns.

So if there are NaNs in the table, the first one will have fewer rows than the second.
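A minimal sketch of the difference, using a small made-up DataFrame in place of Pokemon.csv (the column names and values here are assumptions for illustration, not the real data). It also shows one possible way to make the where() version agree with loc[], by dropping only the rows that are entirely NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Pokemon.csv; "Type 2" is missing for some rows.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type 2": ["Fire", np.nan, np.nan, "Water"],
    "Total": [600, 590, 300, 400],
})

mask = df["Total"] >= 580

# Boolean indexing keeps rows A and B, NaN or not.
loc_rows = df.loc[mask]

# where() turns rows C and D into all-NaN rows; dropna() then removes them
# AND row B, because B has a NaN in "Type 2".
where_rows = df.where(mask).dropna()

# Dropping only rows in which every value is NaN keeps row B.
where_fixed = df.where(mask).dropna(how="all")

print(loc_rows.shape)     # (2, 3)
print(where_rows.shape)   # (1, 3)
print(where_fixed.shape)  # (2, 3)
```

If the goal is just to filter rows, loc[] (or plain boolean indexing) is probably the simpler choice; note that where() may also change integer columns to float because of the NaNs it inserts.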

Upvotes: 2
