Reputation: 3
I have a problem using Pandas.
When I execute autos.info(), it returns:
RangeIndex: 371528 entries, 0 to 371527
Data columns (total 20 columns):
 #   Column               Non-Null Count   Dtype
---  ------               --------------   -----
 0   dateCrawled          371528 non-null  object
 1   name                 371528 non-null  object
 2   seller               371528 non-null  object
 3   offerType            371528 non-null  object
 4   price                371528 non-null  int64
 5   abtest               371528 non-null  object
 6   vehicleType          333659 non-null  object
 7   yearOfRegistration   371528 non-null  int64
 8   gearbox              351319 non-null  object
 9   powerPS              371528 non-null  int64
 10  model                351044 non-null  object
 11  kilometer            371528 non-null  int64
 12  monthOfRegistration  371528 non-null  int64
 13  fuelType             338142 non-null  object
 14  brand                371528 non-null  object
 15  notRepairedDamage    299468 non-null  object
 16  dateCreated          371528 non-null  object
 17  nrOfPictures         371528 non-null  int64
 18  postalCode           371528 non-null  int64
 19  lastSeen             371528 non-null  object
dtypes: int64(7), object(13)
memory usage: 56.7+ MB
But when I execute autos["price"].describe(), it returns:
count    3.715280e+05
mean     1.729514e+04
std      3.587954e+06
min      0.000000e+00
25%      1.150000e+03
50%      2.950000e+03
75%      7.200000e+03
max      2.147484e+09
Name: price, dtype: float64
I don't understand why the price column is reported as int64 by info() but as float64 by describe().
Any suggestions?
Upvotes: 0
Views: 298
Reputation: 2887
The return value of Series.describe() is a new Series containing the descriptive statistics. The dtype you see in the output is not the dtype of the original column but the dtype of the statistics, which is float64 (values such as the mean and the standard deviation are generally not whole numbers).
The name of the result is price because that is the name of the Series autos["price"].
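This can be checked directly. The following is a minimal sketch that uses a small made-up DataFrame in place of the real autos data (the four prices are placeholders, not values from the dataset):

```python
import pandas as pd

# Hypothetical stand-in for the autos DataFrame; the prices are made up.
autos = pd.DataFrame({"price": pd.Series([0, 1150, 2950, 7200], dtype="int64")})

# describe() builds a new Series holding the statistics.
stats = autos["price"].describe()

print(autos["price"].dtype)  # dtype of the original column
print(stats.dtype)           # dtype of the statistics Series
print(stats.name)            # name inherited from the column
```

The column itself stays int64; only the Series returned by describe() is float64.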
Upvotes: 1
Reputation: 35115
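If the goal is to see plain numbers instead of scientific notation, you can control how many digits are displayed. Either set the float format globally:
pd.set_option('display.float_format', lambda x: '%.5f' % x)
or format only this one result:
autos["price"].describe().apply("{0:.5f}".format)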
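As a rough illustration of the per-result formatting, here is a sketch on a small made-up DataFrame (the prices are placeholders, and two decimals are used instead of five purely for brevity). Note that this only changes how the numbers are rendered: the formatted result holds strings, while the statistics themselves remain float64.

```python
import pandas as pd

# Hypothetical stand-in for the autos DataFrame; the prices are made up.
autos = pd.DataFrame({"price": pd.Series([0, 1150, 2950, 7200], dtype="int64")})

stats = autos["price"].describe()

# Turn each statistic into a fixed-point string with two decimals.
formatted = stats.apply("{:.2f}".format)

print(formatted["max"])  # 7200.00
print(stats.dtype)       # the unformatted statistics are still float64
```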
Upvotes: 0