Reputation: 4546
I have a Pandas DataFrame where different (numeric) columns have a different number of data points and the index is a time series. I'd like to return a new DataFrame containing only the rows where my two columns of interest both have values. I've tried using Boolean indexing, but the new DataFrame doesn't contain any values, implying there are no matches. However, this isn't the case.
This is the code I tried. It doesn't produce any errors, but the resulting DataFrame is empty:
sve2_all.resample('D', how='mean')
sve2_hg = sve2_all[(sve2_all['Rim_GWT'] == True) & (sve2_all[' Q l/s'] == True)]
sve2_hg.describe()
Upvotes: 1
Views: 177
Reputation: 375445
Using == True does not check for "having values"; it checks whether those values are equal to True (which is the same as 1). That is, you're selecting only those rows where sve2_all['Rim_GWT'] == 1.0 and sve2_all[' Q l/s'] == 1.0 (so it's not surprising that no rows match).
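To see this concretely, here is a minimal sketch. The frame df and its values are invented stand-ins for sve2_all (only the column names come from the question), so treat it as an illustration rather than a reproduction of the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sve2_all; the numbers are made up for illustration.
df = pd.DataFrame(
    {"Rim_GWT": [0.5, 1.0, np.nan, 2.3], " Q l/s": [1.0, 1.0, 4.2, np.nan]},
    index=pd.date_range("2014-01-01", periods=4, freq="D"),
)

# Comparing a numeric column with True is an equality test against 1.0,
# not a test for "has a value".
eq_true_mask = (df["Rim_GWT"] == True) & (df[" Q l/s"] == True)
eq_one_mask = (df["Rim_GWT"] == 1.0) & (df[" Q l/s"] == 1.0)

# Both masks select exactly the same rows.
print(eq_true_mask.equals(eq_one_mask))
print(df[eq_true_mask])
```

With this invented data only the row where both columns happen to equal 1.0 is selected. Real measurements rarely equal exactly 1.0 in both columns, which would explain an empty result.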
Perhaps you want to check for not being NaN using pd.notnull (here via the Series method .notnull()):
sve2_all[sve2_all['Rim_GWT'].notnull() & sve2_all[' Q l/s'].notnull()]
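As a rough check of that approach, the sketch below applies the same mask to a small invented frame (df is a hypothetical stand-in for sve2_all, with made-up values) and compares it with dropna(subset=...), which should give the same rows:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sve2_all; the numbers are made up for illustration.
df = pd.DataFrame(
    {"Rim_GWT": [0.5, 1.0, np.nan, 2.3], " Q l/s": [1.0, 1.0, 4.2, np.nan]},
    index=pd.date_range("2014-01-01", periods=4, freq="D"),
)

# Keep rows where both columns of interest are non-NaN.
both = df[df["Rim_GWT"].notnull() & df[" Q l/s"].notnull()]

# Alternative spelling: drop rows that have NaN in either of the two columns.
same = df.dropna(subset=["Rim_GWT", " Q l/s"])

print(both)
print(both.equals(same))
```

Note that boolean indexing returns a new DataFrame, so the result needs to be assigned to a name (as with sve2_hg in the question) if it is to be used later.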
Upvotes: 3