Here is my Python question:
I have been asked to generate an output table that contains, for each variable (there are more than 10 variables in the data), the number of NaN values along with the min, max, mean, std, 25%, 50%, and 75%. I used the describe function in pandas to create the table, which gave me everything I want except the number of NaN values in each variable. I am thinking about adding the NaN counts as a new row in the output generated by describe.
Can anyone help with this?
output = input_data.describe(include=[np.number])  # summary table for the numeric columns
count_nan = input_data.isnull().sum(axis=0)  # number of NaN values in each variable
How can I add the second as a row into the first table?
Upvotes: 2
Views: 2042
Reputation: 880269
You could use .append to append a new row to a DataFrame:
In [21]: output.append(pd.Series(count_nan, name='nans'))
Out[21]:
              0         1         2         3         4
count  4.000000  4.000000  4.000000  4.000000  4.000000
mean   0.583707  0.578610  0.566523  0.480307  0.540259
std    0.142930  0.358793  0.309701  0.097326  0.277490
min    0.450488  0.123328  0.151346  0.381263  0.226411
25%    0.519591  0.406628  0.478343  0.406436  0.429003
50%    0.549012  0.610845  0.607350  0.478787  0.516508
75%    0.613127  0.782827  0.695530  0.552658  0.627764
max    0.786316  0.969421  0.900046  0.582391  0.901610
nans   0.000000  0.000000  0.000000  0.000000  0.000000
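Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above only works on older versions. A minimal sketch of the same idea with pd.concat follows. The sample input_data here is a hypothetical stand-in (six rows, five numeric columns, three NaN values placed by hand), and the names nan_row and result are only illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's input_data: five numeric columns
# with a few NaN values placed by hand.
rng = np.random.default_rng(0)
input_data = pd.DataFrame(rng.random((6, 5)))
input_data.iloc[0, 1] = np.nan
input_data.iloc[2, 1] = np.nan
input_data.iloc[4, 3] = np.nan

output = input_data.describe(include=[np.number])
count_nan = input_data.isnull().sum(axis=0)

# Turn the Series of NaN counts into a one-row DataFrame labelled 'nans'.
# Selecting output.columns first keeps non-numeric columns (if any)
# from being added to the describe() table.
nan_row = count_nan.loc[output.columns].rename('nans').to_frame().T

# Stack the new row underneath the describe() table.
result = pd.concat([output, nan_row])
print(result)
```

Because the describe() rows are floats, the counts in the nans row are displayed as floats too. If the extra row should be added in place instead, assigning output.loc['nans'] = count_nan is another option that aligns the Series to the existing columns.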
Upvotes: 2