Reputation: 477
I have a script with the following code that is working as intended
Table2 = df.loc[df.Date.between('2018-11-22','2018-11-30')].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum()
However, now i am trying to add an additional criterion in order to filter on a column called "region" which has the value Canada in the dataset but it doesn't seem to be working.
Table2 = df.loc[df.Date.between('2018-11-22','2018-11-30'), df['Region'] = 'Canada'].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum()
The additional filter seems to have no impact. Can anyone help. Thanks
Upvotes: 1
Views: 156
Reputation: 1432
As mentioned by @Alaxander
Here's an example snippet:
import pandas
df = pd.DataFrame({
"A": [1,2,3],
"B": [4,5,6],
"C": [1,1,1]
})
df.loc[((d["A"]==1) & (d["B"]==4)), ["A", "B"]]
Also, you might want to look at the assignment operator used in df["Region"] = "Canada"
, shouldn't it be ==
for it to be used as a filter? I have added this as well in your code below.
Your code if you want specific fields:
Table2 = df.loc[((df.Date.between('2018-11-22','2018-11-30')) & (df['Region'] == 'Canada')), ["Date", "Region"]].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum()
Your code if you want all the fields:
Table2 = df.loc[((df.Date.between('2018-11-22','2018-11-30')) & (df['Region'] == 'Canada'))].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum()
PS: Thanks to @Alexandar for mentioning about the mistake.
Upvotes: 2
Reputation: 477
i figured this out by doing the following
Table2 = df.loc[df.Date.between('2018-11-22','2018-11-30') & (df['Region'] == 'Canada')].groupby(df['FPYear'])[['New Customer', 'Existing Customer', 'revenue']].sum() #with date filters for table
Upvotes: 0