meto

Reputation: 3719

Pandas: add crosstab totals

How can I add to my crosstab an additional row and an additional column for the totals?

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.random.randint(0,2,100)})
ct = pd.crosstab(df.A, df.B)
ct


I thought I would add the new column (obtained by summing over the rows) by

ct["Total"] = ct.0 + ct.1

but this does not work.

Upvotes: 11

Views: 16446

Answers (3)

Shiny

Reputation: 67

Pass margins=True to crosstab. That adds the totals row and the totals column for you.

Upvotes: 2

Ida

Reputation: 2999

In fact pandas.crosstab already provides an option margins, which does exactly what you want.

> df = pd.DataFrame({"A": np.random.randint(0,2,100), "B" : np.random.randint(0,2,100)})
> pd.crosstab(df.A, df.B, margins=True)
B     0   1  All
A               
0    26  21   47
1    25  28   53
All  51  49  100

Setting margins=True adds an "All" column and an "All" row to the resulting frequency table. They hold the row totals and the column totals respectively, with the grand total in the bottom-right corner.
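If you would rather have the margin labelled "Total" than "All", crosstab also accepts a margins_name argument. A minimal sketch, assuming the same kind of random 0/1 data as in the question (the fixed seed is only there to make the run reproducible and is not part of the original post):

```python
import numpy as np
import pandas as pd

# Fixed seed: an assumption for reproducibility, not from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})

# margins=True appends the totals; margins_name renames them from "All".
ct = pd.crosstab(df.A, df.B, margins=True, margins_name="Total")
print(ct)
```

The bottom-right cell of the result is the grand total, which here equals the number of rows in df.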

Upvotes: 45

joris

Reputation: 139172

Your attempt fails because attribute-style column access only works when the column name is a valid Python identifier. Integer names such as 0 and 1 are not, so ct.0 is a syntax error. Use standard indexing instead:

In [122]: ct["Total"] = ct[0] + ct[1]

In [123]: ct
Out[123]:
B   0   1  Total
A
0  26  24     50
1  30  20     50

See the warnings at the end of this section in the docs: http://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

When you want to work with the rows, you can use .loc:

In [126]: ct.loc["Total"] = ct.loc[0] + ct.loc[1]

In this case ct.loc["Total"] is equivalent to ct.loc["Total", :], so the assignment adds a new row labelled "Total".
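Putting the column and row steps together, the totals can also be computed with sum() rather than by adding the columns one by one, which keeps working when there are more than two categories. This is a sketch under the same assumptions as the question's data (the seed is my addition):

```python
import numpy as np
import pandas as pd

# Fixed seed: an assumption for reproducibility, not from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({"A": rng.integers(0, 2, 100), "B": rng.integers(0, 2, 100)})
ct = pd.crosstab(df.A, df.B)

# Row totals first: sum across the columns of each row.
ct["Total"] = ct.sum(axis=1)

# Then column totals: sum down each column, including the new "Total"
# column, so the bottom-right cell becomes the grand total.
ct.loc["Total"] = ct.sum(axis=0)
print(ct)
```

The order matters: adding the column before the row means the new row's sum does not double-count anything and the corner cell comes out as the overall count.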

Upvotes: 3
