Reputation:
Is there any way to extract values from the output of the describe function in Python?
For example, I have this table:
basicprofiling = pdf.toPandas().describe()
         product  Bs_type  country  period  table_name
count        200      200      200     200         200
unique         2        1        1       1           2
top     Deposits   Retail  vietnam   daily      animal
freq         100      200      200     200         100
Let's say I want to extract and print the product count, the number of unique values and the total type. Is this achievable?
This is what I tried: basicprofiling.select('prf_product'), but it returns an error saying 'str' is not callable.
Upvotes: 2
Views: 2996
Reputation: 3631
describe() returns a DataFrame where the summary names are the index, so you can access all the counts (for example) using loc, like this:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])
data = df.describe()
data.loc['count']
And individual values like this:
data.loc["count","a"]
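Applied to a table shaped like the one in the question, the same loc lookup might look like the sketch below. The data here is a hypothetical stand-in for pdf.toPandas() (the column values and the 120/80 split are invented for illustration), and it assumes all columns are strings, which is when describe() reports count, unique, top and freq.

```python
import pandas as pd

# Hypothetical stand-in for pdf.toPandas(): 200 rows of string columns.
# The values and the 120/80 split are made up for this sketch.
df = pd.DataFrame({
    "product": ["Deposits"] * 120 + ["Loans"] * 80,
    "Bs_type": ["Retail"] * 200,
    "country": ["vietnam"] * 200,
})

# With only non-numeric columns, describe() yields count/unique/top/freq rows.
basicprofiling = df.describe()

# .loc[row_label, column_label] returns a single value.
product_count = basicprofiling.loc["count", "product"]
product_unique = basicprofiling.loc["unique", "product"]
product_top = basicprofiling.loc["top", "product"]

print(product_count, product_unique, product_top)
```

The row label comes first and the column label second, because the summary names (count, unique, top, freq) are the index of the result.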
Upvotes: 1
Reputation: 403
If you run this line, you will see that
print(type(df.describe()))
actually returns a dataframe:
<class 'pandas.core.frame.DataFrame'>
So you can access values inside like you would a regular dataframe:
import pandas as pd
df = pd.DataFrame({"a":[1,4],"b":[2,1],"c":[7,9],"d":[1,3]})
print(df.describe()['a']['count'])
The output will be: 2.0
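As a small variation on the same idea, a single value can also be read with .at, and a whole column of statistics can be turned into a plain dict. This is only a sketch; the names stats, count_a and col_a are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 4], "b": [2, 1], "c": [7, 9], "d": [1, 3]})
stats = df.describe()

# .at does one scalar lookup by (row label, column label).
count_a = stats.at["count", "a"]

# All statistics for column "a" as a regular Python dict.
col_a = stats["a"].to_dict()

print(count_a)
print(col_a["mean"], col_a["max"])
```

Using one lookup with both labels avoids the chained indexing of stats['a']['count'], though both give the same value here.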
Upvotes: 0
Reputation: 39072
You can use to_frame() to get a DataFrame from the Series (the output of describe) and then .T to transform the Series indices into column names. Then you can simply access the values you want. For example:
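import pandas as pd

s = pd.Series(['a', 'a', 'b', 'c'])
basicprofiling = s.describe().to_frame().T
# Each column is a one-element Series, so take its first element to get the scalar
print(basicprofiling['count'].iloc[0], basicprofiling['unique'].iloc[0])
# 4 3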
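If only the values are needed, the Series returned by describe can probably be indexed directly by summary name, without converting it to a DataFrame first. A minimal sketch, with the variable names chosen here for illustration:

```python
import pandas as pd

s = pd.Series(['a', 'a', 'b', 'c'])

# describe() on a Series of strings returns a Series indexed by
# 'count', 'unique', 'top' and 'freq'.
summary = s.describe()

count = summary['count']
unique = summary['unique']
top = summary['top']

print(count, unique, top)
```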
Upvotes: 0