BGG16

Reputation: 536

How to add a row to a Python Pandas dataframe that was generated using the .describe() function

Per the code below, I have created a df with a single column (labeled 'A') of numerical data.

I then used .describe() to build df_stats, calculated the share of positive values (Percent_Positive), and tried to append that value to df_stats as a new row. However, when I run the code below, a new row is added to df_stats (which I want), but a new column is also added to df_stats (which I do not want). This new column contains NaN for every row except the final (new) 'percent_positive' row, which holds the Percent_Positive value I calculated previously.

Can someone please tell me how to populate df_stats with the Percent_Positive value without adding a new column to df_stats? Thank you.

# Import dependencies 
import pandas as pd
import numpy as np

# Create df with randomly populated numbers.
df = pd.DataFrame(np.random.randint(-80,100,size=(100, 1)), columns=list('A'))

# Use the .describe function to calculate basic stats on the df.
df_stats = df.describe()

# Create new var to calculate the percentage of time a value in column A is positive.
Percent_Positive = df_stats.loc[df_stats['A'] > 0,'A'].count()/df_stats['A'].count()

# Add new row to df_stats called 'percent_positive' using the Percent_Positive var above.
df_stats = df_stats.append(pd.Series(Percent_Positive, name='percent_positive'))
display(df_stats)

Upvotes: 1

Views: 597

Answers (2)

keramat

Reputation: 4543

Use:

# Import dependencies 
import pandas as pd
import numpy as np

# Create df with randomly populated numbers.
df = pd.DataFrame(np.random.randint(-80,100,size=(100, 1)), columns=list('A'))

# Use the .describe function to calculate basic stats on the df.
df_stats = df.describe()

# Fraction of the values in column A that are positive.
# Computed on df (the raw data), not on df_stats (the eight summary statistics).
Percent_Positive = (df['A'] > 0).mean()

# Add the row as a one-row DataFrame that has the same column ('A').
# pd.concat is used because DataFrame.append was removed in pandas 2.0.
df_stats = pd.concat([df_stats, pd.DataFrame({'A': [Percent_Positive]}, index=['percent_positive'])])
display(df_stats)

The result is df_stats with its single column 'A' and a new 'percent_positive' row at the bottom.
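As for why the original attempt grew a column: a Series built from a scalar gets the default index 0, and that index is matched against the column labels of df_stats, so the value lands under a new column 0 rather than under 'A'. Below is a minimal sketch of that behaviour, assuming pandas 2.x (hence pd.concat instead of append); the fixed seed and the placeholder value 0.5 are illustrative assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, for illustration only
df = pd.DataFrame(rng.integers(-80, 100, size=(100, 1)), columns=["A"])
df_stats = df.describe()

value = 0.5  # placeholder standing in for the calculated Percent_Positive

# A Series built from a scalar has index [0], which does not match column 'A'.
unlabeled = pd.Series(value, name="percent_positive")
misaligned = pd.concat([df_stats, unlabeled.to_frame().T])

# Labeling the Series with the existing column name makes the new row align.
labeled = pd.Series({"A": value}, name="percent_positive")
aligned = pd.concat([df_stats, labeled.to_frame().T])

print(list(misaligned.columns))  # 'A' plus an extra column 0
print(list(aligned.columns))     # only 'A'
```

Giving the Series an index of ['A'] is therefore enough to fix the original one-liner; the one-row DataFrame above does the same thing more explicitly.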

Upvotes: 1

Joran Beasley

Reputation: 114038

stats is a DataFrame, so you need to append another DataFrame (with the same column) to it:

import pandas
# stats is the describe() output; df2 is a one-row DataFrame with the same column.
df2 = pandas.DataFrame({"value": [5]}, index=['New Key'])
pandas.concat([stats, df2])

gives

         value
count     20.00000
mean     124.70000
std       65.36545
min       26.00000
25%       78.25000
50%      120.50000
75%      161.00000
max      252.00000
New Key    5.00000
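Another option, offered here as a sketch rather than something either answer proposes, is to assign the new row by label with .loc, which enlarges the DataFrame in place without building a second object. The seed, the data, and the lowercase name percent_positive are illustrative assumptions; the fraction is computed on the raw df rather than on the summary.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, for illustration only
df = pd.DataFrame(rng.integers(-80, 100, size=(100, 1)), columns=["A"])
df_stats = df.describe()

# Fraction of the raw values in column A that are positive.
percent_positive = (df["A"] > 0).mean()

# Assigning to a row label that does not exist yet adds that row;
# naming the existing column 'A' keeps the column set unchanged.
df_stats.loc["percent_positive", "A"] = percent_positive

print(df_stats)
```

This keeps everything in one statement, at the cost of mutating df_stats instead of returning a new object as pd.concat does.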

Upvotes: 1
