Reputation: 1
I built a data frame in Python from a user-entered SQL query. After that, I name the columns so that it is easy to isolate the columns containing NaN values:
cursor.execute(raw_input("Enter your SQL query: "))
records = cursor.fetchall()
import pandas as pd
dframesql = pd.DataFrame(records)
dframesql.columns = [i[0] for i in cursor.description]
The problem comes afterwards, when I want to compare the number of non-null rows in each column with the total number of rows in the data frame:
dframelines = len(dframesql)
dframedesc = pd.DataFrame(dframesql.count())
When I try to compare dframedesc with dframelines, I get an error:
nancol = []
for line in dframedesc:
    if dframedesc < dframelines:
        nancol.append(line)
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Thanks in advance!
Upvotes: 0
Views: 7082
Reputation: 7822
If you want to do it with a for loop, loop through the DataFrame's index:
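nancol = []
# 'a_column' is a placeholder; pd.DataFrame(dframesql.count()) has a single column labeled 0
for index in dframedesc.index:
    # compare one scalar count at a time, not the whole DataFrame
    if dframedesc.loc[index,'a_column'] < dframelines:
        nancol.append(dframedesc.loc[index,:])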
But why not just:
dframedesc[dframedesc['col_to_compare'] < dframelines]
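For reference, here is a minimal self-contained sketch of the same idea. The small frame and its column names (id, name, score) are made up to stand in for the SQL result, and it keeps the counts as a Series instead of wrapping them in a second DataFrame, which is an assumption about what you need rather than a requirement:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame built from the SQL query
dframesql = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["a", None, "c", "d"],
    "score": [0.5, np.nan, np.nan, 2.0],
})

dframelines = len(dframesql)   # total number of rows
counts = dframesql.count()     # Series: non-null count per column

# Boolean mask on the Series, then keep the matching column names
nancol = counts[counts < dframelines].index.tolist()

# Alternative: ask directly which columns contain any missing value
nancol_alt = dframesql.columns[dframesql.isna().any()].tolist()

print(nancol)
```

Comparing the Series with the scalar gives one True/False per column, so nothing ever has to be reduced to a single truth value, which is what the ValueError was complaining about.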
Upvotes: 1