Reputation: 1
From the file output_hc_v1.csv with headers :emplid pay_status count location deptid grade_desc date version
I would like via python save a new csv filtered on "pay_status=Active". with this code:
a1=a.query ('"pay_status" == ["Active"]')
=========================================
but this part is not working.
My full code:
import csv
import pandas as pd
import numpy as np
df= pd.read_csv ("C:/Users/e1087353/Documents/CAPCO/HC/python_data/output_hc_v1.csv")
df.head()
a= pd.pivot_table(df, index =["location","deptid","grade_desc","pay_status"] ,values=["Count"],aggfunc=np.sum)
a1=a.query ('"pay_status" == ["Active"]')
a1.to_csv ("C:/Users/e1087353/Documents/CAPCO/HC/python_data/Output_final2.csv")
Upvotes: 0
Views: 1907
Reputation: 881037
You do not need to quote column names in the query string. Therefore you could use,
a1 = a.query('pay_status == "Active"')
Upvotes: 1
Reputation: 6597
Don't use query. To filter simply do:
a1=a[a['pay_status']=='Active']
If you create a pivot table you may want to reset the index before you filter:
a=a.reset_index()
Alternatively, for more complicated queries, if you want to filter for items in a list use isin
:
filter_list=['Active','Inactive','Something']
a1=a[a['pay_status'].isin(filter_list)]
Upvotes: 0