JiaMing  Lin

Reputation: 1082

Calculate mean and std using pandas in python

I ran into a problem when calculating the mean and standard deviation.

I loaded a CSV via

df = pandas.read_csv("fakedata.csv", skiprows=1, header=None)

but then the method

df.mean()

returns an empty Series. Here is the link to the raw data.

>>> df.mean()
Series([], dtype: float64)

I have also checked the count.

>>> df.count()
0    40000
dtype: int64

My OS is CentOS 6.7 with Python 2.7 and pandas 0.17.1.

pip show pandas
---
Metadata-Version: 2.0
Name: pandas
Version: 0.17.1
Summary: Powerful data structures for data analysis, time series,and statistics
Home-page: http://pandas.pydata.org
Author: The PyData Development Team
Author-email: [email protected]
License: BSD
Location: /usr/local/lib/python2.7/site-packages
Requires: pytz, python-dateutil, numpy

[Edit] The dataframe information shows

>>> df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 40000 entries, 0 to 39999
Data columns (total 1 columns):
0    40000 non-null object
dtypes: object(1)
memory usage: 625.0+ KB

and dataframe shape shows

>>> df.shape
(40000, 1)

Upvotes: 2

Views: 1245

Answers (1)

Fabio Lamanna

Reputation: 21552

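I think the problem lies in the separator. Your df.info() output shows a single column of dtype object, which suggests the columns are separated by whitespace rather than commas, so each row is read as one string and mean() has no numeric column to work on. After copying and pasting your data into a .csv file, I can read it with: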

df = pandas.read_csv("fakedata.csv", skiprows=1, header=None, sep=r'\s+')

getting as result:

In [18]: df.mean()
Out[18]: 
0     50.574475
1     49.585400
2    169.478500
3     59.544800
4    119.814275
5     79.557500
6     79.497775
dtype: float64

and:

In [19]: df.std()
Out[19]: 
0    19.787459
1    19.762996
2    14.997920
3    10.034209
4    40.013550
5    19.887973
6    14.947894
dtype: float64
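
To see the effect of the separator without the original file, here is a minimal sketch using a small in-memory sample. The sample text and the names raw, bad and good are made up for illustration and are not taken from fakedata.csv; the snippet also assumes Python 3 (for io.StringIO with a plain string) rather than the Python 2.7 setup in the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for fakedata.csv: one header line to skip,
# then whitespace-separated numeric columns.
raw = (
    "a b c\n"
    "1.0 2.0 3.0\n"
    "3.0 4.0 5.0\n"
)

# With the default comma separator, each row is read as a single string,
# so the frame has one non-numeric column.
bad = pd.read_csv(io.StringIO(raw), skiprows=1, header=None)
print(bad.shape)
print(bad.dtypes)

# A regex separator matching runs of whitespace splits the row into
# separate numeric columns, so mean() and std() have data to work on.
good = pd.read_csv(io.StringIO(raw), skiprows=1, header=None, sep=r"\s+")
print(good.shape)
print(good.mean())
print(good.std())
```

Checking df.shape and df.dtypes right after loading is a quick way to catch this kind of problem: a single non-numeric column usually means the separator did not match the file.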

Upvotes: 2
