Reputation: 26392
I have two columns in a pandas DataFrame, and I want to perform an assertion on them: that they're equal, that one is greater than the other, or some other boolean test across the two columns.
Right now I'm doing something like this:
# Roll the fields up so we can compare both reports.
# Goal: Show that `Gross Sales per Bar` is equal to `Gross Sales per Category`
#
# Do a GROUP BY of all the service bars and sum their Gross Sales per Bar
# Since the same value should be in this field for every 'Gross Sales per Bar' field,
# grab the first one, so we can compare them below
df_bar_sum = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Bar'].first()
df_bar_sum2 = sbbac.groupby(['Bar'], as_index=False)['Gross Sales per Category'].sum()
# Rename the 'Gross Sales per Category' column to 'Summed Gross Sales per Category'
df_bar_sum2.rename(columns={'Gross Sales per Category':'Summed Gross Sales per Category'}, inplace=True)
# Add the 'Gross Sales per Bar' column to the df_bar_sum2 Data Frame.
df_bar_sum2['Gross Sales per Bar'] = df_bar_sum['Gross Sales per Bar']
# See if they match...they should since the value of 'Gross Sales per Bar' should be equal to 'Gross Sales per Category' summed.
df_bar_sum2['GrossSalesPerCat_GrossSalesPerBar_eq'] = df_bar_sum2.apply(lambda row: 1 if row['Summed Gross Sales per Category'] == row['Gross Sales per Bar'] else 0, axis=1)
# Print the result
df_bar_sum2
And I just end up with a column that equals 1 if it matches and 0 if it doesn't.
I'd like to use assertions here to test whether they match, since a mismatch would then make the test run fail with an error displayed. Maybe that's not a good way to do it for tabular data, I'm not sure, but if it is a good idea, I'd rather use assertions to compare them.
It may also be harder to read with assertions, which would be bad, I'm not sure...
Upvotes: 0
Views: 1358
Reputation: 5331
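import numpy as np
# Raises AssertionError unless the two columns match within floating-point tolerance.
assert np.allclose(your_df['Summed Gross Sales per Category'],
                   your_df['Gross Sales per Bar'])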
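If you want the failure message to show which values differ, or you need a test other than equality, something along these lines may work. This is only a sketch: the small `df` below is a made-up stand-in for the rolled-up frame in the question, and the `>=` check is just one example of "some other boolean test".

```python
import pandas as pd
import pandas.testing as pdt

# Hypothetical stand-in for the rolled-up frame from the question.
df = pd.DataFrame({
    'Bar': ['A', 'B'],
    'Summed Gross Sales per Category': [100.0, 250.5],
    'Gross Sales per Bar': [100.0, 250.5],
})

summed = df['Summed Gross Sales per Category']
per_bar = df['Gross Sales per Bar']

# Equality check that reports the differing values on failure.
# check_names=False because the two columns have different names on purpose.
pdt.assert_series_equal(summed, per_bar, check_names=False)

# Any other elementwise test: collect the rows that fail it,
# then assert that there are none, printing them if there are.
mismatches = df[~(summed >= per_bar)]
assert mismatches.empty, f"Rows failing the check:\n{mismatches}"
```

Both checks raise `AssertionError` on a mismatch, so they should fail a test run the same way a plain `assert` does.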
Upvotes: 1