Reputation:
I've removed all NaN from a df using df = df.fillna(0).
After that, I create a pivot table using
pd.pivot_table(df, index='Source', columns='Customer Location', values='Total billed £')
and I still get NaN values in the output.
Could someone explain why this is happening and how to prevent it?
Upvotes: 4
Views: 1790
Reputation: 862741
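This comes from your input data. pivot_table turns the values of one column into the index and the values of another column into the columns. Each cell at their intersection holds the aggregated values for that combination.
If a combination does not exist in the input data, there is nothing to aggregate and its cell becomes missing data (NaN). These NaN values are created by the pivot itself, so calling fillna(0) before pivoting cannot prevent them.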
import pandas as pd

df = pd.DataFrame({
    'Source': list('abcdef'),
    'Total billed £': [5, 3, 6, 9, 2, 4],
    'Customer Location': list('adfbbb')
})
print (df)
Source Total billed £ Customer Location
0 a 5 a
1 b 3 d
2 c 6 f
3 d 9 b
4 e 2 b
5 f 4 b
# e.g. the combination `Source=a` and `Customer Location=b` does not exist in the input, so that cell is NaN in the output
print (pd.pivot_table(df,index='Source', columns='Customer Location',values='Total billed £'))
Customer Location a b d f
Source
a 5.0 NaN NaN NaN
b NaN NaN 3.0 NaN
c NaN NaN NaN 6.0
d NaN 9.0 NaN NaN
e NaN 2.0 NaN NaN
f NaN 4.0 NaN NaN
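One way to see in advance which cells will come out as NaN is to count the input rows behind each (Source, Customer Location) combination. The sketch below is an addition, not part of the original example: it assumes pd.crosstab as the counting tool and reuses the same sample data. A count of 0 should mark exactly the cells that pivot_table leaves as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    'Source': list('abcdef'),
    'Total billed £': [5, 3, 6, 9, 2, 4],
    'Customer Location': list('adfbbb'),
})

# The input itself contains no missing values, so fillna(0) changes nothing here.
has_nan_in_input = df.isna().any().any()

pivot = pd.pivot_table(df, index='Source', columns='Customer Location',
                       values='Total billed £')

# Count how many input rows exist for every (Source, Customer Location) pair.
counts = pd.crosstab(df['Source'], df['Customer Location'])

# A pair with zero rows has nothing to aggregate, so its pivot cell is NaN.
same_cells = ((counts == 0).to_numpy() == pivot.isna().to_numpy()).all()

print(has_nan_in_input)
print(counts)
print(same_cells)
```

If the comparison holds on your own data as well, every NaN in the pivot corresponds to a combination that simply never occurs in the input.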
Furthermore, here's a good read on reshaping data.
Upvotes: 3
Reputation: 61910
The reason is simple: there is a pair of (index, column) values that is missing from your data. For example:
import pandas as pd

df = pd.DataFrame({"Source": ["foo", "bar", "bar", "bar"],
                   "Customer Location": ["one", "one", "two", "two"],
                   "Total billed £": [10, 20, 30, 40]})
print(df)
Setup
Source Customer Location Total billed £
0 foo one 10
1 bar one 20
2 bar two 30
3 bar two 40
As you can see, there is no ('foo', 'two') pair in your data, so when you do:
result = pd.pivot_table(df, index='Source', columns='Customer Location', values='Total billed £')
print(result)
Output
Customer Location one two
Source
bar 20.0 35.0
foo 10.0 NaN
To fix the problem, provide a default value using the fill_value parameter:
result = pd.pivot_table(df, index='Source', columns='Customer Location', values='Total billed £', fill_value=0)
print(result)
Output
Customer Location one two
Source
bar 20 35
foo 10 0
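Two related points may be worth checking on your own data. The sketch below is an addition that reuses the same toy frame. It assumes that calling fillna(0) on the pivot result is an acceptable alternative to fill_value=0. It also shows that the 35.0 for ('bar', 'two') is the mean of 30 and 40, because pivot_table aggregates with the mean by default; aggfunc='sum' appears here only to illustrate changing that, and the kwargs helper name is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Source": ["foo", "bar", "bar", "bar"],
                   "Customer Location": ["one", "one", "two", "two"],
                   "Total billed £": [10, 20, 30, 40]})

# Shared arguments, collected once for brevity (hypothetical helper name).
kwargs = dict(index='Source', columns='Customer Location', values='Total billed £')

# Option 1: let pivot_table fill the empty cells.
filled_by_param = pd.pivot_table(df, fill_value=0, **kwargs)

# Option 2: fill after pivoting; fillna must run on the pivot result, not on the input.
filled_after = pd.pivot_table(df, **kwargs).fillna(0)

# Default aggregation is the mean; pass aggfunc='sum' to get totals instead.
totals = pd.pivot_table(df, aggfunc='sum', fill_value=0, **kwargs)

print(filled_by_param)
print(filled_after)
print(totals)
```

Whether 0 is the right filler depends on what the table is for: a 0 reads as "billed nothing", while NaN reads as "no such combination", so keeping NaN can be the more honest choice for some reports.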
Upvotes: 2