Jason

Reputation: 4546

Why isn't this pandas Boolean indexing code on numeric columns working?

I have a pandas DataFrame whose index is a time series and whose (numeric) columns have different numbers of data points. I'd like to return a new DataFrame containing only the rows where both of my two columns of interest have values. I've tried Boolean indexing, but the new DataFrame doesn't contain any rows, implying there are no matches. However, this isn't the case.

This is the code I tried. It doesn't produce any errors, but the resulting DataFrame is empty:

sve2_all.resample('D', how='mean')
sve2_hg = sve2_all[(sve2_all['Rim_GWT'] == True) & (sve2_all[' Q l/s'] == True)]
sve2_hg.describe()

Upvotes: 1

Views: 177

Answers (1)

Andy Hayden

Reputation: 375445

Using == True does not check for "having values"; it checks whether those values are equal to True (which is the same as 1). That is, you're selecting only those rows where sve2_all['Rim_GWT'] == 1.0 and sve2_all[' Q l/s'] == 1.0, so it's not surprising that no rows match.

Perhaps you want to check for not being NaN using notnull (available both as pd.notnull and as a Series method):

sve2_all[sve2_all['Rim_GWT'].notnull() & sve2_all[' Q l/s'].notnull()]
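
To see the difference on something small, here is a minimal sketch. The frame `df` and its values are made up for illustration (a hypothetical stand-in for sve2_all, reusing the same two column names); it is not the asker's data. It also shows `dropna(subset=...)`, which should select the same rows as the notnull mask.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sve2_all: a daily index and two numeric columns with gaps.
idx = pd.date_range("2013-01-01", periods=5, freq="D")
df = pd.DataFrame(
    {
        "Rim_GWT": [2.5, np.nan, 1.0, 3.1, np.nan],
        " Q l/s": [0.4, 0.7, 1.0, np.nan, np.nan],
    },
    index=idx,
)

# == True compares each value with 1, so only rows where both columns equal 1.0 match.
equals_true = df[(df["Rim_GWT"] == True) & (df[" Q l/s"] == True)]

# notnull() tests whether a value is present (not NaN), whatever that value is.
both_present = df[df["Rim_GWT"].notnull() & df[" Q l/s"].notnull()]

# dropna with subset keeps rows that have no NaN in the listed columns.
via_dropna = df.dropna(subset=["Rim_GWT", " Q l/s"])

print(len(equals_true), len(both_present), len(via_dropna))
```

In this toy frame, the == True mask keeps only the single row where both values happen to be 1.0, while the notnull mask keeps every row where both columns are populated.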

Upvotes: 3
