I'm trying to display a series of crosstabs in pandas, in a Jupyter notebook, like so:
def crosstab_all(dataset,attributelist):
    for k in attributelist:
        print('k',k)
        pd.crosstab(dataset[k],dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")
attributelist=["has_closing_date","has_address",'has_price','has_listing_date','has_contract_dates','has_tsp','has_susan','has_sell_side','has_buy_side','has_both_sides','has_beth','has_agent','has_admin','has_closing_tsp','has_key_stages']
crosstab_all(dataset,attributelist)
I find that if I just do:
k="has_closing_date"
pd.crosstab(dataset[k],dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")
... it works. The issue seems to be running successive crosstab calls: for example, two crosstab commands in immediate succession will fail. I suspect the problem is not crosstab itself but some extra step I need in order to spawn multiple Jupyter output windows.
I'd appreciate any suggestions on how to make this work.
Upvotes: 0
Views: 818
OK, I found something that works. This solution doesn't generate separate windows, and you lose some formatting, but I learned that crosstab returns a DataFrame, so you can just print it, like so:
def crosstab_all(dataset,attributelist):
    for k in attributelist:
        xdf=pd.crosstab(dataset[k],dataset["successfulmatch"], normalize=True, margins=True, margins_name="Total")
        print('xdf',xdf)
        print('') # for spacing
attributelist=["has_closing_date","has_address",'has_price','has_listing_date','has_contract_dates','has_tsp','has_susan','has_sell_side','has_buy_side','has_both_sides','has_beth','has_agent','has_admin','has_closing_tsp','has_key_stages']
crosstab_all(dataset,attributelist) # dataset is a dataframe
This prints an unformatted crosstab on each pass of the loop.
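If you want to keep the notebook's table formatting, one option that should work is IPython's display function. A notebook only auto-renders the value of the last expression in a cell, so a crosstab computed inside a loop or function is silently discarded unless you print or display it explicitly. The version below is a minimal sketch, not a tested fix for the original data: the target parameter, the returned dict, and the toy DataFrame are my own additions for illustration.

```python
import pandas as pd

try:
    from IPython.display import display  # rich table rendering inside Jupyter
except ImportError:
    display = print  # plain-text fallback when IPython is not installed


def crosstab_all(dataset, attributelist, target="successfulmatch"):
    """Display one crosstab per attribute and return them keyed by attribute.

    The `target` parameter and the returned dict are illustrative additions.
    """
    tables = {}
    for k in attributelist:
        xdf = pd.crosstab(dataset[k], dataset[target], normalize=True,
                          margins=True, margins_name="Total")
        display(xdf)  # explicit display, since the loop discards the value
        tables[k] = xdf
    return tables


# Toy data, invented purely for illustration
dataset = pd.DataFrame({
    "has_price": [1, 0, 1, 1],
    "has_address": [0, 0, 1, 1],
    "successfulmatch": [1, 0, 1, 0],
})
tables = crosstab_all(dataset, ["has_price", "has_address"])
```

Returning the tables as well as displaying them means you can still inspect or export an individual crosstab afterwards, e.g. tables["has_price"].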
Upvotes: 1