Frank Mazzuchelli

Reputation: 41

Pandas: change the order of the columns when using crosstab

Put simply, I want to change the order of the columns produced by pandas' crosstab.

Right now, it's in alphabetical order, i.e.: Friday, Monday, Saturday, Sunday, Thursday, Tuesday, Wednesday. I would like it to go in order, i.e.: Monday, Tuesday, ..., Sunday.

This is for a dataset where I wanted to make a crosstab for the days of the week, and the hour of an occurrence.

I'm doing this right now:

pd.crosstab(data_2019.HOUR, data_2019.DAY_OF_WEEK)

With the output looking like this:

DAY_OF_WEEK  Friday  Monday  Saturday  Sunday  Thursday  Tuesday  Wednesday
HOUR
0               204     255       256     260       225      222        192
1               121     111       198     230       116      117        145
2               128      90       217     222        84      111         96

Upvotes: 4

Views: 9607

Answers (2)

Akavall

Reputation: 86188

Often one needs to change the order of both the columns and the rows, and for that we need to combine the approaches outlined in the answer @edesz provided.

For example:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"a": ["one", "two", "three", "three"], "b": ["two", "one", "three", "one"]})

In [3]: pd.crosstab(df["a"], df["b"]) # wrong order
Out[3]:
b      one  three  two
a
one      0      0    1
three    1      1    0
two      1      0    0

In [4]: pd.crosstab(df["a"], df["b"]).reindex(["one", "two", "three"])[["one", "two", "three"]] # correct order
Out[4]:
b      one  two  three
a
one      0    1      0
two      1    0      0
three    1    0      1
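
As a variation on the above, both axes can also be reordered in a single .reindex call by passing index= and columns= together. This is only a sketch on the same toy data; the names order, ct and ordered are mine, not from the original post.

```python
import pandas as pd

# Same toy data as in the example above.
df = pd.DataFrame({"a": ["one", "two", "three", "three"],
                   "b": ["two", "one", "three", "one"]})

# Desired order, used here for both rows and columns.
order = ["one", "two", "three"]

ct = pd.crosstab(df["a"], df["b"])

# Reorder rows and columns in one call.
ordered = ct.reindex(index=order, columns=order)
print(ordered)
```

Note that .reindex does not raise if a label in order is missing from the crosstab; it adds that row or column filled with NaN, so you may want to pass fill_value=0 in that case.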

Upvotes: 2

edesz

Reputation: 12406

You can create a list with the days of the week in the required order. Then run .crosstab and reorder the columns of its output using either of the two options below.

Generate crosstab

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

c = pd.crosstab(...)

One option

Change order of columns produced by crosstab

  • This amounts to selecting all the columns with a list of weekday names in the required order, since the crosstab output is just a normal pandas DataFrame.
c = c[days]

Alternatively

Use .reindex with axis="columns" and pass the list (days) to set the order of the DataFrame's columns

c = c.reindex(days, axis="columns")
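
To see both options side by side, here is a minimal runnable sketch. The small data frame is made up and only stands in for data_2019 from the question (same HOUR and DAY_OF_WEEK column names); the variable names by_selection and by_reindex are also just for illustration.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Hypothetical stand-in for data_2019: one row per occurrence.
data = pd.DataFrame({
    "HOUR": [0, 0, 1, 1, 2, 2, 0, 1],
    "DAY_OF_WEEK": ["Friday", "Monday", "Saturday", "Sunday",
                    "Thursday", "Tuesday", "Wednesday", "Monday"],
})

# Columns come out sorted alphabetically by default.
c = pd.crosstab(data["HOUR"], data["DAY_OF_WEEK"])

# Option 1: select the columns with the ordered list.
by_selection = c[days]

# Option 2: reindex along the columns axis.
by_reindex = c.reindex(days, axis="columns")

print(by_reindex)
```

One practical difference: if a weekday never occurs in the data, c[days] raises a KeyError, while .reindex adds that column filled with NaN (or with zeros if you pass fill_value=0).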

Upvotes: 4
