aaron02
aaron02

Reputation: 338

How to pivot table with unique names and avoid null values?

I have a Pandas dataframe below which when I pivot on a column with unique values (of mixed datatypes), looks like below.

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame({'A' : ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight'],'B' : [0.815, 0.765, 'two', 'four', 0.981, 'six', 'seven', 'eight']})
In [4]: df
Out[4]: 
       A      B
0    one  0.815
1    two  0.765
2  three    two
3   four   four
4   five  0.981
5    six    six
6  seven  seven
7  eight  eight

In [5]: df.pivot(columns='A', values='B')
Out[5]: 
A  eight   five  four    one  seven   six three    two
0   None   None  None  0.815   None  None  None   None
1   None   None  None   None   None  None  None  0.765
2   None   None  None   None   None  None   two   None
3   None   None  four   None   None  None  None   None
4   None  0.981  None   None   None  None  None   None
5   None   None  None   None   None   six  None   None
6   None   None  None   None  seven  None  None   None
7  eight   None  None   None   None  None  None   None

Is there a way to pivot this dataframe on the column "A" into a single row without the 'None' values?

Upvotes: 2

Views: 355

Answers (2)

Ted Petrou
Ted Petrou

Reputation: 62017

I think you are suggesting transposing the dataframe.

df.set_index('A').T

Upvotes: 3

spicypumpkin
spicypumpkin

Reputation: 1219

This is happening because the script looks for the values in column B and whenever it doesn't find one it just returns None. As you can see, (7, eight) = eight, which is the 8th row of the B column, (4, five) = 0.981 which is the 5th row of the B column, etc. What you really want here is transpose().

import pandas as pd
>>> df
       A      B
0    one  0.815
1    two  0.765
2  three    two
3   four   four
4   five  0.981
5    six    six
6  seven  seven
7  eight  eight

>>> df_t = df.transpose()
>>> df_t
       0      1      2     3      4    5      6      7
A    one    two  three  four   five  six  seven  eight
B  0.815  0.765    two  four  0.981  six  seven  eight

If you want to change the column header, you can do the following:

df_t.column = df_t.iloc[0] #replaces the column headers with the first row
df_t = df_t[1:] #deletes the first line so you don't see a duplicate

>>> df_t
A    one    two three  four   five  six  seven  eight
B  0.815  0.765   two  four  0.981  six  seven  eight

Upvotes: 0

Related Questions